NASA Astrophysics Data System (ADS)
Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.
2018-05-01
The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior. The posterior distribution is influenced by the choice of prior distribution. Jeffreys’ prior is a non-informative prior distribution, used when no information about the parameters is available. The non-informative Jeffreys’ prior is combined with the sample information to give the posterior distribution, which is used to estimate the parameters. The purpose of this research is to estimate the parameters of a multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior distribution. Based on the results and discussion, the estimates of β and Σ are obtained as the expected values of random variables under their marginal posterior distributions. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart, respectively. However, calculating the expected values involves integrals that are difficult to evaluate. Therefore, an approximation is needed, generating random samples according to the posterior distribution characteristics of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
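The Gibbs sampling idea in this abstract can be illustrated with a much smaller problem. The sketch below is not the authors' multivariate sampler: it assumes a single response with an intercept only, so that β reduces to a mean `mu`, Σ reduces to a variance `sigma2`, and the Jeffreys-type prior is p(mu, sigma2) ∝ 1/sigma2. Under that assumption the two full conditionals are a normal and an inverse gamma (the scalar analogue of the inverse Wishart). The function names are hypothetical.

```python
import random
import statistics


def gibbs_normal_jeffreys(y, n_iter=5000, burn_in=1000, seed=0):
    """Gibbs sampler for (mu, sigma2) under the prior p(mu, sigma2) ∝ 1/sigma2.

    Scalar illustration only (assumption): one response, intercept-only model.
    """
    rng = random.Random(seed)
    n = len(y)
    y_bar = statistics.fmean(y)
    sigma2 = statistics.variance(y)  # starting value for the chain
    draws = []
    for it in range(n_iter):
        # mu | sigma2, y ~ Normal(y_bar, sigma2 / n)
        mu = rng.gauss(y_bar, (sigma2 / n) ** 0.5)
        # sigma2 | mu, y ~ Inverse-Gamma(n / 2, S / 2), with S = sum (y_i - mu)^2.
        # Drawn as the reciprocal of a Gamma(shape = n / 2, scale = 2 / S) variate.
        s = sum((yi - mu) ** 2 for yi in y)
        sigma2 = 1.0 / rng.gammavariate(n / 2.0, 2.0 / s)
        if it >= burn_in:
            draws.append((mu, sigma2))
    return draws


def posterior_means(draws):
    """Monte Carlo estimates of the posterior expected values of mu and sigma2."""
    mus, sigma2s = zip(*draws)
    return statistics.fmean(mus), statistics.fmean(sigma2s)
```

Averaging the retained draws replaces the integral that defines the posterior expected value, which is the role MCMC plays in the abstract.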
Multivariate Longitudinal Analysis with Bivariate Correlation Test
Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory
2016-01-01
In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model’s parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692
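As a rough illustration of the likelihood ratio test mentioned above, the following sketch compares a full model with a reduced model that fixes one parameter (for example, a single correlation set to zero). It assumes a chi-square reference distribution with one degree of freedom, which may not match the authors' test when several correlations are tested at once; the function name and the log-likelihood values in the usage note are hypothetical.

```python
import math


def lrt_one_parameter(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and p-value for one constrained parameter.

    Assumes (not from the source) a chi-square(1) reference distribution,
    whose survival function is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

For instance, `lrt_one_parameter(-100.0, -103.0)` gives a statistic of 6.0; a small p-value would suggest that modeling the two dependent variables jointly is worthwhile.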
Inference of reactive transport model parameters using a Bayesian multivariate approach
NASA Astrophysics Data System (ADS)
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
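The difference between WLS and WLS(we) can be sketched as follows. This is a minimal illustration, not the authors' formulation: it assumes independent Gaussian residuals with one variance per species, in which case the maximum likelihood weight for a species (for fixed model parameters) is the inverse of its mean squared residual. The function names and the species labels in the usage note are hypothetical.

```python
def wls_objective(residuals_by_species, weights):
    """Weighted sum of squared residuals over all species (WLS, weights given a priori)."""
    return sum(
        weights[name] * sum(r * r for r in res)
        for name, res in residuals_by_species.items()
    )


def estimate_weights(residuals_by_species):
    """Weights as inverse mean squared residuals, one per species.

    Assumption: independent Gaussian errors with a species-specific variance.
    """
    return {
        name: len(res) / sum(r * r for r in res)
        for name, res in residuals_by_species.items()
    }
```

With residuals `{"A": [1.0, -1.0, 2.0], "B": [0.1, -0.2]}`, fixed weights of 1 and 100 weight both species explicitly, whereas `estimate_weights` rescales each species by its own residual scatter.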
Improving the realism of hydrologic model through multivariate parameter estimation
NASA Astrophysics Data System (ADS)
Rakovec, Oldrich; Kumar, Rohini; Attinger, Sabine; Samaniego, Luis
2017-04-01
Increased availability and quality of near real-time observations should improve understanding of predictive skills of hydrological models. Recent studies have shown the limited capability of river discharge data alone to adequately constrain different components of distributed model parameterizations. In this study, the GRACE satellite-based total water storage (TWS) anomaly is used to complement the discharge data with an aim to improve the fidelity of mesoscale hydrologic model (mHM) through multivariate parameter estimation. The study is conducted in 83 European basins covering a wide range of hydro-climatic regimes. The model parameterization complemented with the TWS anomalies leads to statistically significant improvements in (1) discharge simulations during low-flow period, and (2) evapotranspiration estimates which are evaluated against independent (FLUXNET) data. Overall, there is no significant deterioration in model performance for the discharge simulations when complemented by information from the TWS anomalies. However, considerable changes in the partitioning of precipitation into runoff components are noticed by in-/exclusion of TWS during the parameter estimation. A cross-validation test carried out to assess the transferability and robustness of the calibrated parameters to other locations further confirms the benefit of complementary TWS data. In particular, the evapotranspiration estimates show more robust performance when TWS data are incorporated during the parameter estimation, in comparison with the benchmark model constrained against discharge only. This study highlights the value for incorporating multiple data sources during parameter estimation to improve the overall realism of hydrologic model and its applications over large domains.
Rakovec, O., Kumar, R., Attinger, S. and Samaniego, L. (2016): Improving the realism of hydrologic model functioning through multivariate parameter estimation. Water Resour. Res., 52, http://dx.doi.org/10.1002/2016WR019430
Critical elements on fitting the Bayesian multivariate Poisson Lognormal model
NASA Astrophysics Data System (ADS)
Zamzuri, Zamira Hasanah binti
2015-10-01
Motivated by a problem of fitting multivariate models to traffic accident data, a detailed discussion of the Multivariate Poisson Lognormal (MPL) model is presented. This paper reveals three critical elements in fitting the MPL model: the setting of initial estimates, hyperparameters and tuning parameters. These issues have not been highlighted in the literature. Based on the simulation studies conducted, we show that when the Univariate Poisson Model (UPM) estimates are used as starting values, at least 20,000 iterations are needed to obtain reliable final estimates. We also illustrate the sensitivity of a specific hyperparameter, which, if not given extra attention, may affect the final estimates. The last issue concerns the tuning parameters, which depend on the acceptance rate. Finally, a heuristic algorithm to fit the MPL model is presented. This acts as a guide to ensure that the model works satisfactorily given any data set.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan
2014-09-01
Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help in focusing on high-risk situations and in developing safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that the MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2010-07-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2013-01-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kaplanoglu, Erkan; Safak, Koray K.; Varol, H. Selcuk
2009-01-12
An experiment-based method is proposed for parameter estimation of a class of linear multivariable systems. The method was applied to a pressure-level control process. Experimental time-domain input/output data were utilized in a gray-box modeling approach. The form of the system transfer function matrix elements is assumed to be known a priori. Continuous-time system transfer function matrix parameters were estimated in real time by the least-squares method. Simulation results of the experimentally determined system transfer function matrix compare very well with the experimental results. For comparison and as an alternative to the proposed real-time estimation method, we also implemented an offline identification method using artificial neural networks and obtained fairly good results. The proposed methods can be implemented conveniently on a desktop PC equipped with a data acquisition board for parameter estimation of moderately complex linear multivariable systems.
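The gray-box least-squares idea can be sketched on a toy problem. The example below is a simplification and not the method of the paper: it assumes a single-input, single-output, discrete-time first-order model y[k+1] = a*y[k] + b*u[k] fitted in one batch, whereas the abstract describes continuous-time transfer function matrices estimated in real time. The function name and the model structure are hypothetical.

```python
def fit_first_order(u, y):
    """Batch least-squares fit of y[k+1] = a*y[k] + b*u[k].

    Solves the 2x2 normal equations directly; u must have at least
    len(y) - 1 samples.
    """
    steps = range(len(y) - 1)
    syy = sum(y[k] * y[k] for k in steps)
    syu = sum(y[k] * u[k] for k in steps)
    suu = sum(u[k] * u[k] for k in steps)
    sy1y = sum(y[k + 1] * y[k] for k in steps)
    sy1u = sum(y[k + 1] * u[k] for k in steps)
    det = syy * suu - syu * syu
    if det == 0:
        raise ValueError("input/output data do not determine a and b uniquely")
    a = (sy1y * suu - sy1u * syu) / det
    b = (sy1u * syy - sy1y * syu) / det
    return a, b
```

The known model form (here, first order) is the prior knowledge; only the numerical values of `a` and `b` are taken from the measured input/output data.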
Mathew, Boby; Holand, Anna Marie; Koistinen, Petri; Léon, Jens; Sillanpää, Mikko J
2016-02-01
A novel reparametrization-based INLA approach as a fast alternative to MCMC for the Bayesian estimation of genetic parameters in multivariate animal model is presented. Multi-trait genetic parameter estimation is a relevant topic in animal and plant breeding programs because multi-trait analysis can take into account the genetic correlation between different traits and that significantly improves the accuracy of the genetic parameter estimates. Generally, multi-trait analysis is computationally demanding and requires initial estimates of genetic and residual correlations among the traits, while those are difficult to obtain. In this study, we illustrate how to reparametrize covariance matrices of a multivariate animal model/animal models using modified Cholesky decompositions. This reparametrization-based approach is used in the Integrated Nested Laplace Approximation (INLA) methodology to estimate genetic parameters of multivariate animal model. Immediate benefits are: (1) to avoid difficulties of finding good starting values for analysis which can be a problem, for example in Restricted Maximum Likelihood (REML); (2) Bayesian estimation of (co)variance components using INLA is faster to execute than using Markov Chain Monte Carlo (MCMC) especially when realized relationship matrices are dense. The slight drawback is that priors for covariance matrices are assigned for elements of the Cholesky factor but not directly to the covariance matrix elements as in MCMC. Additionally, we illustrate the concordance of the INLA results with the traditional methods like MCMC and REML approaches. We also present results obtained from simulated data sets with replicates and field data in rice.
NASA Astrophysics Data System (ADS)
Sadegh, Mojtaba; Ragno, Elisa; AghaKouchak, Amir
2017-06-01
We present a newly developed Multivariate Copula Analysis Toolbox (MvCAT) which includes a wide range of copula families with different levels of complexity. MvCAT employs a Bayesian framework with a residual-based Gaussian likelihood function for inferring copula parameters and estimating the underlying uncertainties. The contribution of this paper is threefold: (a) providing a Bayesian framework to approximate the predictive uncertainties of fitted copulas, (b) introducing a hybrid-evolution Markov Chain Monte Carlo (MCMC) approach designed for numerical estimation of the posterior distribution of copula parameters, and (c) enabling the community to explore a wide range of copulas and evaluate them relative to the fitting uncertainties. We show that the commonly used local optimization methods for copula parameter estimation often get trapped in local minima. The proposed method, however, addresses this limitation and improves describing the dependence structure. MvCAT also enables evaluation of uncertainties relative to the length of record, which is fundamental to a wide range of applications such as multivariate frequency analysis.
A simplified parsimonious higher order multivariate Markov chain model
NASA Astrophysics Data System (ADS)
Wang, Chao; Yang, Chuan-sheng
2017-09-01
In this paper, a simplified parsimonious higher-order multivariate Markov chain model (SPHOMMCM) is presented. Moreover, a parameter estimation method for SPHOMMCM is given. Numerical experiments show the effectiveness of SPHOMMCM.
ASCAL: A Microcomputer Program for Estimating Logistic IRT Item Parameters.
ERIC Educational Resources Information Center
Vale, C. David; Gialluca, Kathleen A.
ASCAL is a microcomputer-based program for calibrating items according to the three-parameter logistic model of item response theory. It uses a modified multivariate Newton-Raphson procedure for estimating item parameters. This study evaluated this procedure using Monte Carlo Simulation Techniques. The current version of ASCAL was then compared to…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDEs) with mixed-effects parameters and a multivariate normal copula density function was used to develop a tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside-bark diameter at breast height and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
A tridiagonal parsimonious higher order multivariate Markov chain model
NASA Astrophysics Data System (ADS)
Wang, Chao; Yang, Chuan-sheng
2017-09-01
In this paper, we present a tridiagonal parsimonious higher-order multivariate Markov chain model (TPHOMMCM). Moreover, an estimation method for the parameters in TPHOMMCM is given. Numerical experiments illustrate the effectiveness of TPHOMMCM.
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Multivariate Meta-Analysis of Genetic Association Studies: A Simulation Study
Neupane, Binod; Beyene, Joseph
2015-01-01
In a meta-analysis with multiple end points of interests that are correlated between or within studies, multivariate approach to meta-analysis has a potential to produce more precise estimates of effects by exploiting the correlation structure between end points. However, under random-effects assumption the multivariate estimation is more complex (as it involves estimation of more parameters simultaneously) than univariate estimation, and sometimes can produce unrealistic parameter estimates. Usefulness of multivariate approach to meta-analysis of the effects of a genetic variant on two or more correlated traits is not well understood in the area of genetic association studies. In such studies, genetic variants are expected to roughly maintain Hardy-Weinberg equilibrium within studies, and also their effects on complex traits are generally very small to modest and could be heterogeneous across studies for genuine reasons. We carried out extensive simulation to explore the comparative performance of multivariate approach with most commonly used univariate inverse-variance weighted approach under random-effects assumption in various realistic meta-analytic scenarios of genetic association studies of correlated end points. We evaluated the performance with respect to relative mean bias percentage, and root mean square error (RMSE) of the estimate and coverage probability of corresponding 95% confidence interval of the effect for each end point. Our simulation results suggest that multivariate approach performs similarly or better than univariate method when correlations between end points within or between studies are at least moderate and between-study variation is similar or larger than average within-study variation for meta-analyses of 10 or more genetic studies. 
Multivariate approach produces estimates with smaller bias and RMSE especially for the end point that has randomly or informatively missing summary data in some individual studies, when the missing data in the endpoint are imputed with null effects and quite large variance. PMID:26196398
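For reference, the univariate inverse-variance weighted random-effects approach used as the comparator can be sketched as below. The abstract does not say which between-study variance estimator was used; this sketch assumes the common DerSimonian-Laird moment estimator, and the function name is hypothetical.

```python
def random_effects_pool(effects, variances):
    """Univariate inverse-variance random-effects pooling.

    Assumes the DerSimonian-Laird estimate of the between-study
    variance tau2 (an assumption, not stated in the source).
    Returns (pooled effect, standard error, tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2
```

This treats each end point separately; the multivariate approach studied in the paper additionally uses the correlations between end points.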
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
NASA Astrophysics Data System (ADS)
Wang, Chao; Yang, Chuan-sheng
2017-09-01
In this paper, we present a simplified parsimonious higher-order multivariate Markov chain model with a new convergence condition (TPHOMMCM-NCC). Moreover, an estimation method for the parameters in TPHOMMCM-NCC is given. Numerical experiments illustrate the effectiveness of TPHOMMCM-NCC.
Optimized tuner selection for engine performance estimation
NASA Technical Reports Server (NTRS)
Simon, Donald L. (Inventor); Garg, Sanjay (Inventor)
2013-01-01
A methodology for minimizing the error in on-line Kalman filter-based aircraft engine performance estimation applications is presented. This technique specifically addresses the underdetermined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine which seeks to minimize the theoretical mean-squared estimation error. Theoretical Kalman filter estimation error bias and variance values are derived at steady-state operating conditions, and the tuner selection routine is applied to minimize these values. The new methodology yields an improvement in on-line engine performance estimation accuracy.
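The notion of a theoretical steady-state Kalman filter error variance can be illustrated with a scalar system. This is only a sketch of that one ingredient, not the tuner selection routine: it assumes a hypothetical model x[k+1] = a*x[k] + w, z[k] = h*x[k] + v with process noise variance `q` and measurement noise variance `r`, and iterates the Riccati recursion until it settles.

```python
def steady_state_error_variance(a, h, q, r, n_iter=500):
    """Steady-state posterior error variance of a scalar Kalman filter.

    Iterates the predict/update covariance recursion; the system and
    noise parameters are illustrative assumptions.
    """
    p = q  # initial posterior error variance
    for _ in range(n_iter):
        p_pred = a * a * p + q                     # predicted error variance
        gain = p_pred * h / (h * h * p_pred + r)   # Kalman gain
        p = (1.0 - gain * h) * p_pred              # updated error variance
    return p
```

A search routine of the kind described in the abstract would evaluate a quantity like this for each candidate tuning vector and keep the candidate with the smallest theoretical mean-squared error.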
Multivariate meta-analysis with an increasing number of parameters
Boca, Simina M.; Pfeiffer, Ruth M.; Sampson, Joshua N.
2017-01-01
Summary Meta-analysis can average estimates of multiple parameters, such as a treatment’s effect on multiple outcomes, across studies. Univariate meta-analysis (UVMA) considers each parameter individually, while multivariate meta-analysis (MVMA) considers the parameters jointly and accounts for the correlation between their estimates. The performance of MVMA and UVMA has been extensively compared in scenarios with two parameters. Our objective is to compare the performance of MVMA and UVMA as the number of parameters, p, increases. Specifically, we show that (i) for fixed-effect meta-analysis, the benefit from using MVMA can substantially increase as p increases; (ii) for random effects meta-analysis, the benefit from MVMA can increase as p increases, but the potential improvement is modest in the presence of high between-study variability and the actual improvement is further reduced by the need to estimate an increasingly large between study covariance matrix; and (iii) when there is little to no between study variability, the loss of efficiency due to choosing random effects MVMA over fixed-effect MVMA increases as p increases. We demonstrate these three features through theory, simulation, and a meta-analysis of risk factors for Non-Hodgkin Lymphoma. PMID:28195655
Multiple imputation for handling missing outcome data when estimating the relative risk.
Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-09-06
Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. 
However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
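Whatever imputation model is chosen, the estimates from the imputed datasets are usually combined with Rubin's rules. The sketch below assumes the log relative risk and its variance have already been estimated in each imputed dataset (for example, from a log binomial model); the function name and the numbers in the usage note are hypothetical.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B the between-imputation variance.
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)
    w = statistics.fmean(variances)
    b = statistics.variance(estimates)  # uses the m - 1 denominator
    return q_bar, w + (1.0 + 1.0 / m) * b


def pooled_relative_risk(log_rr_estimates, variances):
    """Pool on the log scale, then exponentiate to get the relative risk."""
    log_rr, _ = pool_rubin(log_rr_estimates, variances)
    return math.exp(log_rr)
```

For example, three imputations with log relative risks 0.1, 0.2 and 0.3 pool to a log relative risk of 0.2, that is, a relative risk of about 1.22.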
A mixed-effects regression model for longitudinal multivariate ordinal data.
Liu, Li C; Hedeker, Donald
2006-03-01
A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.
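Integrating a response probability over a normally distributed random effect by Gauss-Hermite quadrature can be sketched in one dimension. The paper uses multidimensional quadrature with more points; this illustration assumes a single random intercept and a fixed 3-point rule, and the function names are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule (physicists' convention, weight exp(-x^2)).
_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def expect_over_normal(f, sigma):
    """Approximate E[f(b)] for b ~ Normal(0, sigma^2) by 3-point quadrature."""
    total = sum(
        w * f(math.sqrt(2.0) * sigma * x) for x, w in zip(_NODES, _WEIGHTS)
    )
    return total / math.sqrt(math.pi)


def marginal_probability(eta, sigma):
    """Marginal P(y = 1) for a logistic response with a random intercept."""
    return expect_over_normal(
        lambda b: 1.0 / (1.0 + math.exp(-(eta + b))), sigma
    )
```

The marginal likelihood of a subject's responses is built from integrals of this kind, which is why the number of quadrature points and the number of random effects drive the computational cost.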
Estimating and Testing the Sources of Evoked Potentials in the Brain.
ERIC Educational Resources Information Center
Huizenga, Hilde M.; Molenaar, Peter C. M.
1994-01-01
The source of an event-related brain potential (ERP) is estimated from multivariate measures of ERP on the head under several mathematical and physical constraints on the parameters of the source model. Statistical aspects of estimation are discussed, and new tests are proposed. (SLD)
Ramdani, Sofiane; Bonnet, Vincent; Tallon, Guillaume; Lagarde, Julien; Bernard, Pierre Louis; Blain, Hubert
2016-08-01
Entropy measures are often used to quantify the regularity of postural sway time series. Recent methodological developments provided both multivariate and multiscale approaches allowing the extraction of complexity features from physiological signals; see "Dynamical complexity of human responses: A multivariate data-adaptive framework," in Bulletin of Polish Academy of Science and Technology, vol. 60, p. 433, 2012. The resulting entropy measures are good candidates for the analysis of bivariate postural sway signals exhibiting nonstationarity and multiscale properties. These methods are dependent on several input parameters such as embedding parameters. Using two data sets collected from institutionalized frail older adults, we numerically investigate the behavior of a recent multivariate and multiscale entropy estimator; see "Multivariate multiscale entropy: A tool for complexity analysis of multichannel data," Physical Review E, vol. 84, p. 061918, 2011. We propose criteria for the selection of the input parameters. Using these optimal parameters, we statistically compare the multivariate and multiscale entropy values of postural sway data of non-faller subjects to those of fallers. These two groups are discriminated by the resulting measures over multiple time scales. We also demonstrate that the typical parameter settings proposed in the literature lead to entropy measures that do not distinguish the two groups. This last result confirms the importance of the selection of appropriate input parameters.
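To show where the input parameters enter, the following is a simplified univariate sketch of multiscale sample entropy (coarse-graining followed by sample entropy). It is not the multivariate estimator studied in the paper; the embedding dimension `m`, the tolerance factor `r`, and the test signals are illustrative assumptions.

```python
import math
import random

def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n = len(x) // scale
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def sample_entropy(x, m=2, r=0.2):
    """SampEn = -ln(A / B) with tolerance r times the series standard deviation."""
    n = len(x)
    mu = sum(x) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    tol = r * sd  # simplification: tolerance is recomputed for each series passed in

    def matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is within tol
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= tol for k in range(length)):
                    count += 1
        return count

    b = matches(m)       # matches of length m
    a = matches(m + 1)   # matches of length m + 1
    return -math.log(a / b) if a > 0 and b > 0 else float("inf")

rng = random.Random(0)
regular = [math.sin(2 * math.pi * i / 25) for i in range(300)]  # very regular signal
noisy = [rng.gauss(0.0, 1.0) for _ in range(300)]               # irregular signal

mse_regular = [sample_entropy(coarse_grain(regular, s)) for s in (1, 2, 3)]
mse_noisy = [sample_entropy(coarse_grain(noisy, s)) for s in (1, 2, 3)]
```

Changing `m` or `r` changes how many template matches are counted, which is one reason the choice of input parameters can decide whether two groups are separated.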
Mino, H
2007-01-01
The parameters to be estimated are the impulse response (IR) functions of linear time-invariant systems that generate the intensity processes of Shot-Noise-Driven Doubly Stochastic Poisson Processes (SND-DSPPs), by which multivariate presynaptic spike trains and postsynaptic spike trains can be assumed to be modeled. An explicit formula for estimating the IR functions from observations of multivariate input processes of the linear systems and the corresponding counting process (output process) is derived utilizing the expectation maximization (EM) algorithm. The validity of the estimation formula was verified through Monte Carlo simulations in which two presynaptic spike trains and one postsynaptic spike train were assumed to be observable. The IR functions estimated on the basis of the proposed identification method were close to the true IR functions. The proposed method will play an important role in identifying the input-output relationship of pre- and postsynaptic neural spike trains in practical situations.
Boosted Multivariate Trees for Longitudinal Data
Pande, Amol; Li, Liang; Rajeswaran, Jeevanantham; Ehrlinger, John; Kogalur, Udaya B.; Blackstone, Eugene H.; Ishwaran, Hemant
2017-01-01
Machine learning methods provide a powerful approach for analyzing longitudinal data in which repeated measurements are observed for a subject over time. We boost multivariate trees to fit a novel flexible semi-nonparametric marginal model for longitudinal data. In this model, features are assumed to be nonparametric, while feature-time interactions are modeled semi-nonparametrically utilizing P-splines with estimated smoothing parameter. In order to avoid overfitting, we describe a relatively simple in sample cross-validation method which can be used to estimate the optimal boosting iteration and which has the surprising added benefit of stabilizing certain parameter estimates. Our new multivariate tree boosting method is shown to be highly flexible, robust to covariance misspecification and unbalanced designs, and resistant to overfitting in high dimensions. Feature selection can be used to identify important features and feature-time interactions. An application to longitudinal data of forced 1-second lung expiratory volume (FEV1) for lung transplant patients identifies an important feature-time interaction and illustrates the ease with which our method can find complex relationships in longitudinal data. PMID:29249866
NASA Astrophysics Data System (ADS)
Widodo, Edy; Kariyam
2017-03-01
Response Surface Methodology (RSM) is used to determine the input variable settings that create the optimal compromise in the response variables. There are three primary steps in an RSM problem: data collection, modelling, and optimization. This study focuses on establishing response surface models, assuming that the data produced are correct. Usually, the response surface model parameters are estimated by OLS. However, this method is highly sensitive to outliers. Outliers can generate substantial residuals and often affect the estimated model. The resulting estimates can be biased and could lead to errors in determining the optimal point, so that the main purpose of RSM is not achieved. Meanwhile, in real life, the collected data often contain several response variables and a set of independent variables. Treating each response separately and applying single-response procedures can result in wrong interpretations, so a model for the multi-response case is needed. Therefore, a multivariate response surface model that is resistant to outliers is required. As an alternative, this study discusses M-estimation as a parameter estimator in multivariate response surface models containing outliers. As an illustration, a case study is presented on experimental results for the enhancement of the surface layer of aluminium alloy air by shot peening.
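The contrast between OLS and M-estimation can be sketched with a single-response straight-line fit. The example below uses Huber weights with iteratively reweighted least squares, which is one common way to compute an M-estimate; the abstract does not state which weight function or algorithm the authors used, and the data, the tuning constant 1.345, and the function name `huber_irls` are assumptions of this sketch.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber M-estimate of regression coefficients via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]            # start from the OLS fit
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Hypothetical data: true line y = 2 + 0.5 x, small deterministic noise, one gross outlier
x = np.arange(10.0)
noise = np.array([0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, 0.05, -0.15, 0.1])
y = 2.0 + 0.5 * x + noise
y[9] += 20.0
X = np.column_stack([np.ones_like(x), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_m = huber_irls(X, y)
```

In this toy setting the OLS slope is pulled strongly towards the outlier, while the M-estimate stays near the slope of the remaining points.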
Multivariate meta-analysis with an increasing number of parameters.
Boca, Simina M; Pfeiffer, Ruth M; Sampson, Joshua N
2017-05-01
Meta-analysis can average estimates of multiple parameters, such as a treatment's effect on multiple outcomes, across studies. Univariate meta-analysis (UVMA) considers each parameter individually, while multivariate meta-analysis (MVMA) considers the parameters jointly and accounts for the correlation between their estimates. The performance of MVMA and UVMA has been extensively compared in scenarios with two parameters. Our objective is to compare the performance of MVMA and UVMA as the number of parameters, p, increases. Specifically, we show that (i) for fixed-effect (FE) meta-analysis, the benefit from using MVMA can substantially increase as p increases; (ii) for random effects (RE) meta-analysis, the benefit from MVMA can increase as p increases, but the potential improvement is modest in the presence of high between-study variability and the actual improvement is further reduced by the need to estimate an increasingly large between study covariance matrix; and (iii) when there is little to no between-study variability, the loss of efficiency due to choosing RE MVMA over FE MVMA increases as p increases. We demonstrate these three features through theory, simulation, and a meta-analysis of risk factors for non-Hodgkin lymphoma. © Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
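A minimal sketch of fixed-effect multivariate meta-analysis, assuming each study reports a vector of estimates with a known within-study covariance matrix: the pooled estimate is the precision-weighted (generalized least squares) average. The function name `fe_mvma` and the two hypothetical studies are illustrative only.

```python
import numpy as np

def fe_mvma(estimates, covariances):
    """Fixed-effect multivariate pooling: precision-weighted average of study estimates."""
    precisions = [np.linalg.inv(S) for S in covariances]
    total_precision = sum(precisions)
    pooled_cov = np.linalg.inv(total_precision)
    pooled = pooled_cov @ sum(P @ y for P, y in zip(precisions, estimates))
    return pooled, pooled_cov

# Two hypothetical studies, each estimating p = 2 correlated parameters
S = np.array([[0.04, 0.01],
              [0.01, 0.09]])
y1 = np.array([0.2, 0.5])
y2 = np.array([0.4, 0.1])

pooled, pooled_cov = fe_mvma([y1, y2], [S, S])
```

When the within-study covariance matrices differ across studies, the off-diagonal terms let a precisely estimated parameter inform the others, which is the source of the potential gain over univariate pooling.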
Decker, Anna L.; Hubbard, Alan; Crespi, Catherine M.; Seto, Edmund Y.W.; Wang, May C.
2015-01-01
While child and adolescent obesity is a serious public health concern, few studies have utilized parameters based on the causal inference literature to examine the potential impacts of early intervention. The purpose of this analysis was to estimate the causal effects of early interventions to improve physical activity and diet during adolescence on body mass index (BMI), a measure of adiposity, using improved techniques. The most widespread statistical method in studies of child and adolescent obesity is multi-variable regression, with the parameter of interest being the coefficient on the variable of interest. This approach does not appropriately adjust for time-dependent confounding, and the modeling assumptions may not always be met. An alternative parameter to estimate is one motivated by the causal inference literature, which can be interpreted as the mean change in the outcome under interventions to set the exposure of interest. The underlying data-generating distribution, upon which the estimator is based, can be estimated via a parametric or semi-parametric approach. Using data from the National Heart, Lung, and Blood Institute Growth and Health Study, a 10-year prospective cohort study of adolescent girls, we estimated the longitudinal impact of physical activity and diet interventions on 10-year BMI z-scores via a parameter motivated by the causal inference literature, using both parametric and semi-parametric estimation approaches. The parameters of interest were estimated with a recently released R package, ltmle, for estimating means based upon general longitudinal treatment regimes. We found that early, sustained intervention on total calories had a greater impact than a physical activity intervention or non-sustained interventions. Multivariable linear regression yielded inflated effect estimates compared to estimates based on targeted maximum-likelihood estimation and data-adaptive super learning. 
Our analysis demonstrates that sophisticated, optimal semiparametric estimation of longitudinal treatment-specific means via ltmle provides an incredibly powerful, yet easy-to-use tool, removing impediments for putting theory into practice. PMID:26046009
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used multivariate delta and bootstrapping methods (BMs) to construct CIs around relative changes in level and trend, and around absolute changes in outcome based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. Both the multivariate delta method (MDM) and the BM produced similar results. BM is preferred for calculating CIs of relative changes in outcomes of time series studies, because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when sample size is small.
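The delta-method idea can be sketched for a generic relative change written as a ratio of two estimated quantities, for example an absolute change divided by the predicted outcome without the intervention. The function name `ratio_delta_ci` and all input values are hypothetical; the covariance between numerator and denominator would come from the fitted segmented regression.

```python
import math

def ratio_delta_ci(num, den, var_num, var_den, cov, z=1.96):
    """First-order delta-method standard error and CI for the ratio num / den."""
    ratio = num / den
    # Partial derivatives of num / den with respect to num and den
    g_num = 1.0 / den
    g_den = -num / den ** 2
    var = g_num ** 2 * var_num + g_den ** 2 * var_den + 2.0 * g_num * g_den * cov
    se = math.sqrt(var)
    return ratio, se, (ratio - z * se, ratio + z * se)

# Hypothetical values: absolute change of -50 against a counterfactual level of 200
ratio, se, ci = ratio_delta_ci(num=-50.0, den=200.0, var_num=100.0, var_den=64.0, cov=0.0)
```

A bootstrap alternative would instead resample the series (or its residuals), refit the model each time, and take percentiles of the recomputed ratio.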
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
Falcaro, Milena; Pickles, Andrew
2007-02-10
We focus on the analysis of multivariate survival times with highly structured interdependency and subject to interval censoring. Such data are common in developmental genetics and genetic epidemiology. We propose a flexible mixed probit model that deals naturally with complex but uninformative censoring. The recorded ages of onset are treated as possibly censored ordinal outcomes with the interval censoring mechanism seen as arising from a coarsened measurement of a continuous variable observed as falling between subject-specific thresholds. This bypasses the requirement for the failure times to be observed as falling into non-overlapping intervals. The assumption of a normal age-of-onset distribution of the standard probit model is relaxed by embedding within it a multivariate Box-Cox transformation whose parameters are jointly estimated with the other parameters of the model. Complex decompositions of the underlying multivariate normal covariance matrix of the transformed ages of onset become possible. The new methodology is here applied to a multivariate study of the ages of first use of tobacco and first consumption of alcohol without parental permission in twins. The proposed model allows estimation of the genetic and environmental effects that are shared by both of these risk behaviours as well as those that are specific. 2006 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at peak productivity can provide crucial information regarding the functioning and productivity of rangelands. Hyperspectral remote sensing has proved to be valuable for estimating vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squares Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first derivative reflectance dataset, with R2cv = 0.88 and 0.86 and RMSEcv = 1.15 and 1.07, respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
Heggeseth, Brianna C; Jewell, Nicholas P
2013-07-20
Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence, that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.
PERIODIC AUTOREGRESSIVE-MOVING AVERAGE (PARMA) MODELING WITH APPLICATIONS TO WATER RESOURCES.
Vecchia, A.V.
1985-01-01
Results involving correlation properties and parameter estimation for autoregressive-moving average models with periodic parameters are presented. A multivariate representation of the PARMA model is used to derive parameter space restrictions and difference equations for the periodic autocorrelations. Close approximation to the likelihood function for Gaussian PARMA processes results in efficient maximum-likelihood estimation procedures. Terms in the Fourier expansion of the parameters are sequentially included, and a selection criterion is given for determining the optimal number of harmonics to be included. Application of the techniques is demonstrated through analysis of a monthly streamflow time series.
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Verdam, Mathilde G. E.; Oort, Frans J.
2014-01-01
Highlights: application of Kronecker product to construct parsimonious structural equation models for multivariate longitudinal data; a method for the investigation of measurement bias with Kronecker product restricted models; application of these methods to health-related quality of life data from bone metastasis patients, collected at 13 consecutive measurement occasions; the use of curves to facilitate substantive interpretation of apparent measurement bias; assessment of change in common factor means, after accounting for apparent measurement bias. Longitudinal measurement invariance is usually investigated with a longitudinal factor model (LFM). However, with multiple measurement occasions, the number of parameters to be estimated increases with a multiple of the number of measurement occasions. To guard against too low ratios of numbers of subjects and numbers of parameters, we can use Kronecker product restrictions to model the multivariate longitudinal structure of the data. These restrictions can be imposed on all parameter matrices, including measurement invariance restrictions on factor loadings and intercepts. The resulting models are parsimonious and have attractive interpretation, but require different methods for the investigation of measurement bias. Specifically, additional parameter matrices are introduced to accommodate possible violations of measurement invariance. These additional matrices consist of measurement bias parameters that are either fixed at zero or free to be estimated. In cases of measurement bias, it is also possible to model the bias over time, e.g., with linear or non-linear curves. Measurement bias detection with Kronecker product restricted models will be illustrated with multivariate longitudinal data from 682 bone metastasis patients whose health-related quality of life (HRQL) was measured at 13 consecutive weeks. PMID:25295016
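The parsimony argument can be illustrated with a generic covariance matrix built as a Kronecker product of an occasion-level matrix and a variable-level matrix. This is only a sketch of the restriction itself, not of the structural equation models in the paper; the 13 occasions follow the abstract, while the 4 variables and all matrix entries are hypothetical.

```python
import numpy as np

n_occasions, n_vars = 13, 4  # 13 occasions as in the abstract; 4 variables is a made-up choice

# Hypothetical occasion-level matrix: correlation decays with the lag between occasions
lags = np.abs(np.subtract.outer(np.arange(n_occasions), np.arange(n_occasions)))
occasion_cov = 0.8 ** lags

# Hypothetical variable-level matrix: unit variances, 0.5 covariance between variables
variable_cov = 0.5 * np.eye(n_vars) + 0.5 * np.ones((n_vars, n_vars))

# Kronecker product restriction: block (i, j) equals occasion_cov[i, j] * variable_cov
sigma = np.kron(occasion_cov, variable_cov)

def n_unique(n):
    """Number of distinct elements in a symmetric n x n matrix."""
    return n * (n + 1) // 2

unrestricted = n_unique(n_occasions * n_vars)          # free symmetric 52 x 52 matrix
restricted = n_unique(n_occasions) + n_unique(n_vars)  # upper bound; one scale constraint is still needed
```

Under these assumptions the restricted structure needs roughly a hundred parameters instead of well over a thousand.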
Multivariate Non-Symmetric Stochastic Models for Spatial Dependence Models
NASA Astrophysics Data System (ADS)
Haslauer, C. P.; Bárdossy, A.
2017-12-01
A copula-based multivariate framework allows more flexibility to describe different kinds of dependence than is possible using models relying on the confining assumption of symmetric Gaussian models: different quantiles can be modelled with a different degree of dependence, and it will be demonstrated how this can be expected given process understanding; maximum likelihood based multivariate quantitative parameter estimation yields stable and reliable results; not only improved results in cross-validation based measures of uncertainty are obtained but also a more realistic spatial structure of uncertainty compared to second order models of dependence; as much information as is available is included in the parameter estimation: incorporation of censored measurements (e.g., below detection limit, or ones that are above the sensitive range of the measurement device) leads to more realistic spatial models; the proportion of true zeros can be jointly estimated with and distinguished from censored measurements, which allows estimates about the age of a contaminant in the system; secondary information (categorical and on the rational scale) has been used to improve the estimation of the primary variable. These copula-based multivariate statistical techniques are demonstrated based on hydraulic conductivity observations at the Borden (Canada) site, the MADE site (USA), and a large regional groundwater quality data-set in south-west Germany. Fields of spatially distributed K were simulated with identical marginal simulation, identical second order spatial moments, yet substantially differing solute transport characteristics when numerical tracer tests were performed. A statistical methodology is shown that allows the delineation of a boundary layer separating homogeneous parts of a spatial data-set. The effects of this boundary layer (macro structure) and the spatial dependence of K (micro structure) on solute transport behaviour are shown.
Copula-based analysis of rhythm
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Viola, M. L. Lanfredi
2016-06-01
In this paper we establish stochastic profiles of the rhythm for three languages: English, Japanese and Spanish. We model the increase or decrease of the acoustical energy, collected into three bands coming from the acoustic signal. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the three marginal processes, one for each band of energy, and the partition coming from the multivariate Markov chain. Then, all the partitions are linked using a copula, in order to estimate the transition probabilities.
Multivariate Time Series Decomposition into Oscillation Components.
Matsuda, Takeru; Komaki, Fumiyasu
2017-08-01
Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Hagar, Yolanda C; Harvey, Danielle J; Beckett, Laurel A
2016-08-30
We develop a multivariate cure survival model to estimate lifetime patterns of colorectal cancer screening. Screening data cover long periods of time, with sparse observations for each person. Some events may occur before the study begins or after the study ends, so the data are both left-censored and right-censored, and some individuals are never screened (the 'cured' population). We propose a multivariate parametric cure model that can be used with left-censored and right-censored data. Our model allows for the estimation of the time to screening as well as the average number of times individuals will be screened. We calculate likelihood functions based on the observations for each subject using a distribution that accounts for within-subject correlation and estimate parameters using Markov chain Monte Carlo methods. We apply our methods to the estimation of lifetime colorectal cancer screening behavior in the SEER-Medicare data set. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.
2018-05-01
Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. 
The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy simulations using a dataset of a direct numerical simulation of a non-premixed sooting turbulent flame.
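The histogram technique mentioned above can be sketched for a single input parameter: the conditional mean of the target given the input is estimated per bin, and the irreducible error is the mean squared deviation from that conditional mean. The synthetic data, bin counts, and function name `irreducible_error_histogram` are assumptions of this sketch and are unrelated to the soot dataset.

```python
import math
import random

def irreducible_error_histogram(x, y, n_bins):
    """Estimate E[(y - E[y|x])^2] using bin-wise means as the conditional mean."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    bin_index = []
    for xi, yi in zip(x, y):
        k = min(int((xi - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        bin_index.append(k)
        sums[k] += yi
        counts[k] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return sum((yi - means[k]) ** 2 for yi, k in zip(y, bin_index)) / len(y)

# Synthetic example: y = sin(x) + noise, so the true irreducible error is 0.2**2 = 0.04
rng = random.Random(1)
x = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(20000)]
y = [math.sin(xi) + rng.gauss(0.0, 0.2) for xi in x]

err_fine = irreducible_error_histogram(x, y, n_bins=50)
err_coarse = irreducible_error_histogram(x, y, n_bins=2)
```

With too few bins the within-bin variation of the conditional mean is counted as irreducible error; with several inputs, keeping bins fine enough requires a sample size that grows rapidly with the number of inputs.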
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate to near-normal; (5) test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) test of fit for continuous distributions based upon the generalized minimum chi-square; (7) effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
Gain-scheduling multivariable LPV control of an irrigation canal system.
Bolea, Yolanda; Puig, Vicenç
2016-07-01
The purpose of this paper is to present a multivariable linear parameter varying (LPV) controller with a gain scheduling Smith Predictor (SP) scheme applicable to open-flow canal systems. This LPV controller based on SP is designed taking into account the uncertainty in the estimation of delay and the variation of plant parameters according to the operating point. This new methodology can be applied to a class of delay systems that can be represented by a set of models that can be factorized into a rational multivariable model in series with left/right diagonal (multiple) delays, such as, the case of irrigation canals. A multiple pool canal system is used to test and validate the proposed control approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation
Meyer, Karin
2016-01-01
Multivariate estimates of genetic parameters are subject to substantial sampling variation, especially for smaller data sets and more than a few traits. A simple modification of standard, maximum-likelihood procedures for multivariate analyses to estimate genetic covariances is described, which can improve estimates by substantially reducing their sampling variances. This is achieved by maximizing the likelihood subject to a penalty. Borrowing from Bayesian principles, we propose a mild, default penalty—derived assuming a Beta distribution of scale-free functions of the covariance components to be estimated—rather than laboriously attempting to determine the stringency of penalization from the data. An extensive simulation study is presented, demonstrating that such penalties can yield very worthwhile reductions in loss, i.e., the difference from population values, for a wide range of scenarios and without distorting estimates of phenotypic covariances. Moreover, mild default penalties tend not to increase loss in difficult cases and, on average, achieve reductions in loss of similar magnitude to computationally demanding schemes to optimize the degree of penalization. Pertinent details required for the adaptation of standard algorithms to locate the maximum of the likelihood function are outlined. PMID:27317681
Overview and benchmark analysis of fuel cell parameters estimation for energy management purposes
NASA Astrophysics Data System (ADS)
Kandidayeni, M.; Macias, A.; Amamou, A. A.; Boulon, L.; Kelouwani, S.; Chaoui, H.
2018-03-01
Proton exchange membrane fuel cells (PEMFCs) have become the center of attention for energy conversion in many areas such as the automotive industry, where they face highly dynamic operating conditions that cause their characteristics to vary. To ensure appropriate modeling of PEMFCs, accurate parameter estimation is required. However, parameter estimation of PEMFC models is highly challenging due to their multivariate, nonlinear, and complex nature. This paper comprehensively reviews parameter estimation methods for PEMFC models with a specific view to online identification algorithms, which are considered the basis of global energy management strategy design, to estimate the linear and nonlinear parameters of a PEMFC model in real time. In this respect, different PEMFC models with different categories and purposes are discussed first. Subsequently, a thorough investigation of PEMFC parameter estimation methods in the literature is conducted in terms of applicability. Three potential algorithms for online applications, Recursive Least Squares (RLS), Kalman filter, and extended Kalman filter (EKF), which has escaped attention in previous works, are then utilized to identify the parameters of two well-known semi-empirical models in the literature, Squadrito et al. and Amphlett et al. Finally, the achieved results and future challenges are discussed.
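As an illustration of the first online algorithm named in the abstract above, here is a minimal recursive least-squares sketch for a generic linear-in-parameters model y ≈ phi · theta. The model, the parameter values, and the function name are hypothetical; this is not one of the PEMFC models discussed in the paper.

```python
import numpy as np

def rls_step(theta, P, phi, y, lam=1.0):
    """One recursive least-squares update for y ≈ phi @ theta.

    theta: (n,) current estimate; P: (n, n) symmetric covariance-like matrix;
    phi: (n,) regressor; y: scalar measurement; lam: forgetting factor in (0, 1].
    """
    P_phi = P @ phi
    gain = P_phi / (lam + phi @ P_phi)         # gain vector
    theta = theta + gain * (y - phi @ theta)   # correct by the prediction error
    P = (P - np.outer(gain, P_phi)) / lam      # uses phi @ P == P_phi for symmetric P
    return theta, P

# Noise-free synthetic measurements from hypothetical "true" parameters.
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.7, 0.3])
theta, P = np.zeros(3), 1e6 * np.eye(3)
for _ in range(200):
    phi = rng.normal(size=3)
    theta, P = rls_step(theta, P, phi, phi @ theta_true)
```

Each step costs only a few matrix-vector products, which is why this family of methods suits real-time identification; a forgetting factor below 1 lets the estimate track slowly drifting parameters.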
NASA Astrophysics Data System (ADS)
Schaffrin, Burkhard; Felus, Yaron A.
2008-06-01
The multivariate total least-squares (MTLS) approach aims at estimating a matrix of parameters, Ξ, from a linear model (Y - E_Y = (X - E_X) · Ξ) that includes an observation matrix, Y, another observation matrix, X, and matrices of randomly distributed errors, E_Y and E_X. Two special cases of the MTLS approach include the standard multivariate least-squares approach where only the observation matrix, Y, is perturbed by random errors and, on the other hand, the data least-squares approach where only the coefficient matrix X is affected by random errors. In a previous contribution, the authors derived an iterative algorithm to solve the MTLS problem by using the nonlinear Euler-Lagrange conditions. In this contribution, new lemmas are developed to analyze the iterative algorithm, modify it, and compare it with a new ‘closed form’ solution that is based on the singular-value decomposition. For an application, the total least-squares approach is used to estimate the affine transformation parameters that convert cadastral data from the old to the new Israeli datum. Technical aspects of this approach, such as scaling the data and fixing the columns in the coefficient matrix are investigated. This case study illuminates the issue of “symmetry” in the treatment of two sets of coordinates for identical point fields, a topic that had already been emphasized by Teunissen (1989, Festschrift to Torben Krarup, Geodetic Institute Bull no. 58, Copenhagen, Denmark, pp 335-342). The differences between the standard least-squares and the TLS approach are analyzed in terms of the estimated variance component and a first-order approximation of the dispersion matrix of the estimated parameters.
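The following sketch shows the classical textbook SVD solution of the total least-squares problem Y ≈ X · Ξ with errors in both matrices. It assumes independent, equally weighted errors and may differ in detail from the closed-form solution derived in the paper; the function name and test data are illustrative.

```python
import numpy as np

def tls_svd(X, Y):
    """Classical SVD-based total least-squares estimate of Xi in Y ≈ X @ Xi."""
    m = X.shape[1]
    # Right singular vectors of the augmented matrix [X Y].
    _, _, Vt = np.linalg.svd(np.hstack([X, Y]), full_matrices=True)
    V = Vt.T
    V12 = V[:m, m:]   # top-right block, shape (m, d)
    V22 = V[m:, m:]   # bottom-right block, shape (d, d)
    return -V12 @ np.linalg.inv(V22)

# Consistency check on noise-free synthetic data (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Xi_true = np.array([[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]])
Y = X @ Xi_true
Xi_hat = tls_svd(X, Y)
```

The estimate is built from the right singular vectors belonging to the smallest singular values of [X Y], which treats both observation matrices symmetrically, in contrast to ordinary least squares where only Y is assumed to carry errors.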
Julien, Clavel; Leandro, Aristide; Hélène, Morlon
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performances as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performances are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA in the R package RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary models fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find a clear support for an Early-burst model suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
Optimal Tuner Selection for Kalman Filter-Based Aircraft Engine Performance Estimation
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Garg, Sanjay
2010-01-01
A linear point design methodology for minimizing the error in on-line Kalman filter-based aircraft engine performance estimation applications is presented. This technique specifically addresses the underdetermined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine which seeks to minimize the theoretical mean-squared estimation error. This paper derives theoretical Kalman filter estimation error bias and variance values at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared to the conventional approach of tuner selection. Experimental simulation results are found to be in agreement with theoretical predictions. The new methodology is shown to yield a significant improvement in on-line engine performance estimation accuracy.
Park, Eun Sug; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford
2015-06-01
A major difficulty with assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they need to be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project are (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source-apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in the multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design in this project. Each combination of number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models with extensive exploratory data analyses and with information from previous studies, and then computed posterior model probability to estimate model uncertainty. Parameter estimation and model uncertainty estimation were implemented simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data. We illustrated the methods using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas.
The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995-1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel in years 2002-2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while simultaneously dealing with the unknown number of sources and identifiability conditions, incorporated spatial correlations in the multipollutant data collected from multiple sites into the estimation of source profiles and contributions based on the discrete process convolution model for multivariate spatial processes. This new modeling approach was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, during years 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model and estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. For the Houston data, a model with five sources (that seemed to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion). 
The Bayesian spatial multivariate receptor modeling approach applied to the VOC data led to a highest posterior model probability for a model with five sources (that seemed to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust) among several candidate models, with the number of sources varying between three and seven and with different identifiability conditions. Our multipollutant approach assessing source-specific health effects is more advantageous than a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
ERIC Educational Resources Information Center
Nevitt, Johnathan; Hancock, Gregory R.
Though common structural equation modeling (SEM) methods are predicated upon the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to use distribution-free estimation methods. Fortunately, promising alternatives are being integrated into…
Multivariate space - time analysis of PRE-STORM precipitation
NASA Technical Reports Server (NTRS)
Polyak, Ilya; North, Gerald R.; Valdes, Juan B.
1994-01-01
This paper presents the methodologies and results of the multivariate modeling and two-dimensional spectral and correlation analysis of PRE-STORM rainfall gauge data. Estimated parameters of the models for the specific spatial averages clearly indicate the eastward and southeastward wave propagation of rainfall fluctuations. A relationship between the coefficients of the diffusion equation and the parameters of the stochastic model of rainfall fluctuations is derived that leads directly to the exclusive use of rainfall data to estimate advection speed (about 12 m/s) as well as other coefficients of the diffusion equation of the corresponding fields. The statistical methodology developed here can be used for confirmation of physical models by comparison of the corresponding second-moment statistics of the observed and simulated data, for generating multiple samples of any size, for solving the inverse problem of the hydrodynamic equations, and for application in some other areas of meteorological and climatological data analysis and modeling.
Enhancing e-waste estimates: Improving data quality by multivariate Input–Output Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Feng; Huisman, Jaco
2013-11-15
Highlights: • A multivariate Input–Output Analysis method for e-waste estimates is proposed. • Applying multivariate analysis to consolidate data can enhance e-waste estimates. • We examine the influence of model selection and data quality on e-waste estimates. • Datasets of all e-waste related variables in a Dutch case study have been provided. • Accurate modeling of time-variant lifespan distributions is critical for estimate. Abstract: Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters.
Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.
Rosen, Sophia; Davidov, Ori
2012-07-20
Multivariate outcomes are often measured longitudinally. For example, in hearing loss studies, hearing thresholds for each subject are measured repeatedly over time at several frequencies. Thus, each patient is associated with a multivariate longitudinal outcome. The multivariate mixed-effects model is a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, it is known that hearing thresholds, at every frequency, increase with age. Moreover, this age-related threshold elevation is monotone in frequency, that is, the higher the frequency, the higher, on average, is the rate of threshold elevation. This means that there is a natural ordering among the different frequencies in the rate of hearing loss. In practice, this amounts to imposing a set of constraints on the different frequencies' regression coefficients modeling the mean effect of time and age at entry to the study on hearing thresholds. The aforementioned constraints should be accounted for in the analysis. The result is a multivariate longitudinal model with restricted parameters. We propose estimation and testing procedures for such models. We show that ignoring the constraints may lead to misleading inferences regarding the direction and the magnitude of various effects. Moreover, simulations show that incorporating the constraints substantially improves the mean squared error of the estimates and the power of the tests. We used this methodology to analyze a real hearing loss study. Copyright © 2012 John Wiley & Sons, Ltd.
Rosa, Maria J; Mehta, Mitul A; Pich, Emilio M; Risterucci, Celine; Zelaya, Fernando; Reinders, Antje A T S; Williams, Steve C R; Dazzan, Paola; Doyle, Orla M; Marquand, Andre F
2015-01-01
An increasing number of neuroimaging studies are based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation, by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labeling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.
Direct calculation of modal parameters from matrix orthogonal polynomials
NASA Astrophysics Data System (ADS)
El-Kafafy, Mahmoud; Guillaume, Patrick
2011-10-01
The object of this paper is to introduce a new technique to derive the global modal parameters (i.e. system poles) directly from estimated matrix orthogonal polynomials. This contribution generalizes the results given in Rolain et al. (1994) [5] and Rolain et al. (1995) [6] for scalar orthogonal polynomials to multivariable (matrix) orthogonal polynomials for multiple input multiple output (MIMO) systems. Using orthogonal polynomials improves the numerical properties of the estimation process. However, the derivation of the modal parameters from the orthogonal polynomials is in general ill-conditioned if not handled properly. The transformation of the coefficients from the orthogonal polynomial basis to the power polynomial basis is known to be an ill-conditioned transformation. In this paper a new approach is proposed to compute the system poles directly from the multivariable orthogonal polynomials. High order models can be used without any numerical problems. The proposed method will be compared with existing methods (Van Der Auweraer and Leuridan (1987) [4]; Chen and Xu (2003) [7]). For this comparative study, simulated as well as experimental data will be used.
Ronald E. McRoberts; Grant M. Domke; Qi Chen; Erik Næsset; Terje Gobakken
2016-01-01
The relatively small sampling intensities used by national forest inventories are often insufficient to produce the desired precision for estimates of population parameters unless the estimation process is augmented with auxiliary information, usually in the form of remotely sensed data. The k-Nearest Neighbors (k-NN) technique is a non-parametric, multivariate approach...
Copula-based prediction of economic movements
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Hirsh, I. D.
2016-06-01
In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization corresponds to three categories: high losses, high profits and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the two marginal processes and the partition corresponding to the multivariate Markov chain. In order to estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in the movement predictions.
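To make the exponential growth mentioned in the abstract above concrete, the sketch below counts the free transition probabilities of an unrestricted discrete Markov chain. The function name is illustrative; the values 3 (categories) and 2 (series) follow the abstract, and the count applies to a full chain, not to the partition-based model proposed in the paper.

```python
def n_free_parameters(k, d, order):
    """Free transition probabilities of a full Markov chain.

    k: categories per series, d: number of series, order: memory length.
    """
    joint = k ** d                 # size of the joint alphabet
    contexts = joint ** order      # number of distinct pasts of length `order`
    return contexts * (joint - 1)  # each context has joint - 1 free probabilities

# Three categories and two paired series, as in the abstract.
counts = [n_free_parameters(3, 2, order) for order in (1, 2, 3)]
# counts == [72, 648, 5832]
```

Each additional unit of memory multiplies the count by the joint alphabet size (9 here), so a dataset of a few thousand observations cannot support consistent estimation of a full chain beyond a very low order.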
Up-scaling of multi-variable flood loss models from objects to land use units at the meso-scale
NASA Astrophysics Data System (ADS)
Kreibich, Heidi; Schröter, Kai; Merz, Bruno
2016-05-01
Flood risk management increasingly relies on risk analyses, including loss modelling. Most of the flood loss models usually applied in standard practice have in common that complex damaging processes are described by simple approaches like stage-damage functions. Novel multi-variable models significantly improve loss estimation on the micro-scale and may also be advantageous for large-scale applications. However, more input parameters also reveal additional uncertainty, even more in upscaling procedures for meso-scale applications, where the parameters need to be estimated on a regional area-wide basis. To gain more knowledge about challenges associated with the up-scaling of multi-variable flood loss models the following approach is applied: Single- and multi-variable micro-scale flood loss models are up-scaled and applied on the meso-scale, namely on the basis of ATKIS land-use units. Application and validation are undertaken in 19 municipalities, which were affected during the 2002 flood by the River Mulde in Saxony, Germany, by comparison to official loss data provided by the Saxon Relief Bank (SAB). In the meso-scale case study based model validation, most multi-variable models show smaller errors than the uni-variable stage-damage functions. The results show the suitability of the up-scaling approach, and, in accordance with micro-scale validation studies, that multi-variable models are an improvement in flood loss modelling also on the meso-scale. However, uncertainties remain high, stressing the importance of uncertainty quantification. Thus, the development of probabilistic loss models, like BT-FLEMO used in this study, which inherently provide uncertainty information, is the way forward.
Benoit, Julia S; Chan, Wenyaw; Doody, Rachelle S
2015-01-01
Parameter dependency within data sets in simulation studies is common, especially in models such as Continuous-Time Markov Chains (CTMC). Additionally, the literature lacks a comprehensive examination of estimation performance for the likelihood-based general multi-state CTMC. Among studies attempting to assess the estimation, none have accounted for dependency among parameter estimates. The purpose of this research is twofold: 1) to develop a multivariate approach for assessing accuracy and precision for simulation studies 2) to add to the literature a comprehensive examination of the estimation of a general 3-state CTMC model. Simulation studies are conducted to analyze longitudinal data with a trinomial outcome using a CTMC with and without covariates. Measures of performance including bias, component-wise coverage probabilities, and joint coverage probabilities are calculated. An application is presented using Alzheimer's disease caregiver stress levels. Comparisons of joint and component-wise parameter estimates yield conflicting inferential results in simulations from models with and without covariates. In conclusion, caution should be taken when conducting simulation studies aiming to assess performance and choice of inference should properly reflect the purpose of the simulation.
Effect of microwave radiation on Jayadhar cotton fibers: WAXS studies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niranjana, A. R.; Mahesh, S. S.; Divakara, S.
The thermal effect of microwave energy on Jayadhar cotton fiber has been investigated. Microstructural parameters have been estimated using wide angle x-ray scattering (WAXS) data and a line profile analysis program developed by us. Physical properties like tensile strength are correlated with X-ray results. We observe that microwave radiation significantly affects many parameters, and we suggest a multivariate analysis of these parameters to arrive at a significant result.
Aerodynamic parameter estimation via Fourier modulating function techniques
NASA Technical Reports Server (NTRS)
Pearson, A. E.
1995-01-01
Parameter estimation algorithms are developed in the frequency domain for systems modeled by input/output ordinary differential equations. The approach is based on Shinbrot's method of moment functionals utilizing Fourier based modulating functions. Assuming white measurement noises for linear multivariable system models, an adaptive weighted least squares algorithm is developed which approximates a maximum likelihood estimate and cannot be biased by unknown initial or boundary conditions in the data owing to a special property attending Shinbrot-type modulating functions. Application is made to perturbation equation modeling of the longitudinal and lateral dynamics of a high performance aircraft using flight-test data. Comparative studies are included which demonstrate potential advantages of the algorithm relative to some well established techniques for parameter identification. Deterministic least squares extensions of the approach are made to the frequency transfer function identification problem for linear systems and to the parameter identification problem for a class of nonlinear-time-varying differential system models.
Estimating brain connectivity when few data points are available: Perspectives and limitations.
Antonacci, Yuri; Toppi, Jlenia; Caschera, Stefano; Anzolin, Alessandra; Mattia, Donatella; Astolfi, Laura
2017-07-01
Methods based on the use of multivariate autoregressive modeling (MVAR) have proved to be an accurate and flexible tool for the estimation of brain functional connectivity. The multivariate approach, however, implies the use of a model whose complexity (in terms of number of parameters) increases quadratically with the number of signals included in the problem. This can often lead to an underdetermined problem and to the condition of multicollinearity. The aim of this paper is to introduce and test an approach based on Ridge Regression combined with a modified version of the statistics usually adopted for these methods, to broaden the estimation of brain connectivity to those conditions in which current methods fail, due to the lack of enough data points. We tested the performances of this new approach, in comparison with the classical approach based on ordinary least squares (OLS), by means of a simulation study implementing different ground-truth networks, under different network sizes and different levels of data points. Simulation results showed that the new approach provides better performances, in terms of accuracy of the parameters estimation and false positives/false negatives rates, in all conditions related to a low data points/model dimension ratio, and may thus be exploited to estimate and validate estimated patterns at single-trial level or when short time data segments are available.
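A minimal sketch of the estimation step described in the abstract above: fitting MVAR coefficients by ordinary least squares (ridge = 0) or ridge regression (ridge > 0). The function name, penalty value, and simulated network are illustrative assumptions, and the modified statistics the authors use for validation are not reproduced here.

```python
import numpy as np

def fit_mvar(data, order, ridge=0.0):
    """Fit x_t ≈ sum_k A[k] @ x_{t-k-1}; data has shape (T, M), A has shape (order, M, M)."""
    T, M = data.shape
    # Each design row stacks the `order` previous samples [x_{t-1}, ..., x_{t-order}].
    Z = np.hstack([data[order - k - 1 : T - k - 1] for k in range(order)])
    Y = data[order:]
    # Normal equations with an optional ridge penalty on all M * M * order coefficients.
    B = np.linalg.solve(Z.T @ Z + ridge * np.eye(M * order), Z.T @ Y)
    return B.T.reshape(M, order, M).transpose(1, 0, 2)

# Simulate a stable 3-signal MVAR(1) process with a hypothetical coefficient matrix.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.2, 0.0], [0.0, 0.4, 0.1], [0.1, 0.0, 0.3]])
x = np.zeros((2000, 3))
for t in range(1, 2000):
    x[t] = A_true @ x[t - 1] + rng.normal(size=3)

A_ols = fit_mvar(x, order=1)
A_ridge = fit_mvar(x, order=1, ridge=50.0)
```

The model has M · M · order coefficients, so with few samples the matrix Z.T @ Z becomes singular and the plain least-squares solve fails; the ridge term keeps the system solvable at the price of shrinking the coefficients toward zero.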
NASA Astrophysics Data System (ADS)
Åberg Lindell, M.; Andersson, P.; Grape, S.; Håkansson, A.; Thulin, M.
2018-07-01
In addition to verifying operator declared parameters of spent nuclear fuel, the ability to experimentally infer such parameters with a minimum of intrusiveness is of great interest and has been long-sought after in the nuclear safeguards community. It can also be anticipated that such ability would be of interest for quality assurance in e.g. recycling facilities in future Generation IV nuclear fuel cycles. One way to obtain information regarding spent nuclear fuel is to measure various gamma-ray intensities using high-resolution gamma-ray spectroscopy. While intensities from a few isotopes obtained from such measurements have traditionally been used pairwise, the approach in this work is to simultaneously analyze correlations between all available isotopes, using multivariate analysis techniques. Based on this approach, a methodology for inferring burnup, cooling time, and initial fissile content of PWR fuels using passive gamma-ray spectroscopy data has been investigated. PWR nuclear fuels, of UOX and MOX type, and their gamma-ray emissions, were simulated using the Monte Carlo code Serpent. Data comprising relative isotope activities was analyzed with decision trees and support vector machines, for predicting fuel parameters and their associated uncertainties. From this work it may be concluded that up to a cooling time of twenty years, the 95% prediction intervals of burnup, cooling time and initial fissile content could be inferred to within approximately 7 MWd/kgHM, 8 months, and 1.4 percentage points, respectively. An attempt aiming to estimate the plutonium content in spent UOX fuel, using the developed multivariate analysis model, is also presented. The results for Pu mass estimation are promising and call for further studies.
Inouye, David I.; Ravikumar, Pradeep; Dhillon, Inderjit S.
2016-01-01
We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models (Yang et al., 2015) did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with ℓ1 regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times. PMID:27563373
Jafari, Masoumeh; Salimifard, Maryam; Dehghani, Maryam
2014-07-01
This paper presents an efficient method for identification of nonlinear Multi-Input Multi-Output (MIMO) systems in the presence of colored noises. The method studies the multivariable nonlinear Hammerstein and Wiener models, in which the nonlinear memory-less block is approximated based on arbitrary vector-based basis functions. The linear time-invariant (LTI) block is modeled by an autoregressive moving average with exogenous (ARMAX) model which can effectively describe the moving average noises as well as the autoregressive and the exogenous dynamics. According to the multivariable nature of the system, a pseudo-linear-in-the-parameter model is obtained which includes two different kinds of unknown parameters, a vector and a matrix. Therefore, the standard least squares algorithm cannot be applied directly. To overcome this problem, a Hierarchical Least Squares Iterative (HLSI) algorithm is used to simultaneously estimate the vector and the matrix of unknown parameters as well as the noises. The efficiency of the proposed identification approaches is investigated through three nonlinear MIMO case studies. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Wynant, Willy; Abrahamowicz, Michal
2016-11-01
Standard optimization algorithms for maximizing likelihood may not be applicable to the estimation of those flexible multivariable models that are nonlinear in their parameters. For applications where the model's structure permits separating estimation of mutually exclusive subsets of parameters into distinct steps, we propose the alternating conditional estimation (ACE) algorithm. We validate the algorithm, in simulations, for estimation of two flexible extensions of Cox's proportional hazards model where the standard maximum partial likelihood estimation does not apply, with simultaneous modeling of (1) nonlinear and time-dependent effects of continuous covariates on the hazard, and (2) nonlinear interaction and main effects of the same variable. We also apply the algorithm in real-life analyses to estimate nonlinear and time-dependent effects of prognostic factors for mortality in colon cancer. Analyses of both simulated and real-life data illustrate good statistical properties of the ACE algorithm and its ability to yield new potentially useful insights about the data structure. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
2013-01-01
Background Peripheral artery disease (PAD) represents atherosclerotic disease and is a risk factor for death in peritoneal dialysis (PD) patients, who tend to show an atherogenic lipid profile. In this study, we investigated the relationship between lipid profile and ankle-brachial index (ABI) as an index of atherosclerosis in PD patients with controlled serum low-density lipoprotein (LDL) cholesterol level. Methods Thirty-five PD patients, whose serum LDL cholesterol level was controlled at less than 120mg/dl, were enrolled in this cross-sectional study in Japan. The proportions of cholesterol level to total cholesterol level (cholesterol proportion) in 20 lipoprotein fractions and the mean size of lipoprotein particles were measured using an improved method, namely, high-performance gel permeation chromatography. Multivariate linear regression analysis was adjusted for diabetes mellitus and cardiovascular and/or cerebrovascular diseases. Results The mean (standard deviation) age was 61.6 (10.5) years; PD vintage, 38.5 (28.1) months; ABI, 1.07 (0.22). A low ABI (0.9 or lower) was observed in 7 patients (low-ABI group). The low-ABI group showed significantly higher cholesterol proportions in the chylomicron fraction and large very-low-density lipoproteins (VLDLs) (Fractions 3–5) than the high-ABI group (ABI>0.9). Adjusted multivariate linear regression analysis showed that ABI was negatively associated with serum VLDL cholesterol level (parameter estimate=-0.00566, p=0.0074); the cholesterol proportions in large VLDLs (Fraction 4, parameter estimate=-3.82, p=0.038; Fraction 5, parameter estimate=-3.62, p=0.0039) and medium VLDL (Fraction 6, parameter estimate=-3.25, p=0.014); and the size of VLDL particles (parameter estimate=-0.0352, p=0.032). Conclusions This study showed that the characteristics of VLDL particles were associated with ABI among PD patients. 
Lowering serum VLDL level may be an effective therapy against atherosclerosis in PD patients after the control of serum LDL cholesterol level. PMID:24093487
A stepwise, multi-objective, multi-variable parameter optimization method for the APEX model
USDA-ARS's Scientific Manuscript database
Proper parameterization enables hydrological models to make reliable estimates of non-point source pollution for effective control measures. The automatic calibration of hydrologic models requires significant computational power limiting its application. The study objective was to develop and eval...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kanemoto, S.; Andoh, Y.; Sandoz, S.A.
1984-10-01
A method for evaluating reactor stability in boiling water reactors has been developed. The method is based on multivariate autoregressive (M-AR) modeling of steady-state neutron and process noise signals. In this method, two kinds of power spectral densities (PSDs) for the measured neutron signal and the corresponding noise source signal are separately identified by the M-AR modeling. The closed- and open-loop stability parameters are evaluated from these PSDs. The method is applied to actual plant noise data that were measured together with artificial perturbation test data. Stability parameters identified from noise data are compared to those from perturbation test data, and it is shown that both results are in good agreement. In addition to these stability estimations, driving noise sources for the neutron signal are evaluated by the M-AR modeling. Contributions from void, core flow, and pressure noise sources are quantitatively evaluated, and the void noise source is shown to be the most dominant.
Extracting galactic structure parameters from multivariated density estimation
NASA Technical Reports Server (NTRS)
Chen, B.; Creze, M.; Robin, A.; Bienayme, O.
1992-01-01
Multivariate statistical analysis, including cluster analysis (unsupervised classification), discriminant analysis (supervised classification) and principal component analysis (dimensionality reduction method), and nonparametric density estimation have been successfully used to search for meaningful associations in the 5-dimensional space of observables between observed points and the sets of simulated points generated from a synthetic approach of galaxy modelling. These methodologies can be applied as new tools to obtain information about hidden structure otherwise unrecognizable, and place important constraints on the space distribution of various stellar populations in the Milky Way. In this paper, we concentrate on illustrating how to use nonparametric density estimation to substitute for the true densities in both the simulated sample and the real sample in the five-dimensional space. In order to fit model predicted densities to reality, we derive a set of equations which include n lines (where n is the total number of observed points) and m unknown parameters (where m is the number of predefined groups). A least-squares estimation will allow us to determine the density law of different groups and components in the Galaxy. The output from our software, which can be used in many research fields, will also give the systematic error between the model and the observation by a Bayes rule.
Multidimensional stochastic approximation using locally contractive functions
NASA Technical Reports Server (NTRS)
Lawton, W. M.
1975-01-01
A Robbins-Monro type multidimensional stochastic approximation algorithm which converges in mean square and with probability one to the fixed point of a locally contractive regression function is developed. The algorithm is applied to obtain maximum likelihood estimates of the parameters for a mixture of multivariate normal distributions.
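A Robbins-Monro style iteration toward the fixed point of a contractive map can be sketched as below. This is a minimal illustration under stated assumptions, not Lawton's algorithm: the linear map `C @ theta + b`, the noise level, and the step size 1/n are hypothetical choices, and the mixture-of-normals application is not reproduced.

```python
import numpy as np

def robbins_monro(noisy_map, theta0, n_iter, rng):
    """Iterate theta <- theta + a_n * (Y_n - theta), with step a_n = 1/n.

    Y_n = noisy_map(theta, rng) is a noisy observation of the regression
    function M(theta); the iteration targets the fixed point M(theta) = theta.
    """
    theta = np.asarray(theta0, dtype=float)
    for n in range(1, n_iter + 1):
        theta = theta + (1.0 / n) * (noisy_map(theta, rng) - theta)
    return theta

# Hypothetical contractive regression function M(theta) = C theta + b.
C = np.array([[0.3, 0.1],
              [-0.1, 0.2]])
b = np.array([1.0, 2.0])

def noisy_map(theta, rng):
    # Only noisy evaluations of M are available to the algorithm.
    return C @ theta + b + 0.1 * rng.standard_normal(2)

rng = np.random.default_rng(1)
theta_hat = robbins_monro(noisy_map, np.zeros(2), n_iter=20000, rng=rng)
theta_star = np.linalg.solve(np.eye(2) - C, b)  # exact fixed point, for comparison
```

The decreasing step size averages out the observation noise while the contraction pulls the iterate toward the fixed point.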
Enhancing e-waste estimates: improving data quality by multivariate Input-Output Analysis.
Wang, Feng; Huisman, Jaco; Stevels, Ab; Baldé, Cornelis Peter
2013-11-01
Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data related to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input-Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from the available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying a multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Demidenko, Eugene
2017-09-01
The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.
State-space self-tuner for on-line adaptive control
NASA Technical Reports Server (NTRS)
Shieh, L. S.
1994-01-01
Dynamic systems, such as flight vehicles, satellites and space stations, operating in real environments, constantly face parameter and/or structural variations owing to nonlinear behavior of actuators, failure of sensors, changes in operating conditions, disturbances acting on the system, etc. In the past three decades, adaptive control has been shown to be effective in dealing with dynamic systems in the presence of parameter uncertainties, structural perturbations, random disturbances and environmental variations. Among the existing adaptive control methodologies, the state-space self-tuning control methods, initially proposed by us, are shown to be effective in designing advanced adaptive controllers for multivariable systems. In our approaches, we have embedded the standard Kalman state-estimation algorithm into an online parameter estimation algorithm. Thus, the advanced state-feedback controllers can be easily established for digital adaptive control of continuous-time stochastic multivariable systems. A state-space self-tuner for a general multivariable stochastic system has been developed and successfully applied to the space station for on-line adaptive control. Also, a technique for multistage design of an optimal momentum management controller for the space station has been developed and reported in. Moreover, we have successfully developed various digital redesign techniques which can convert a continuous-time controller to an equivalent digital controller. As a result, the expensive and unreliable continuous-time controller can be implemented using low-cost and high performance microprocessors. Recently, we have developed a new hybrid state-space self tuner using a new dual-rate sampling scheme for on-line adaptive control of continuous-time uncertain systems.
NASA Astrophysics Data System (ADS)
Han, X.; Li, X.; He, G.; Kumbhar, P.; Montzka, C.; Kollet, S.; Miyoshi, T.; Rosolem, R.; Zhang, Y.; Vereecken, H.; Franssen, H.-J. H.
2015-08-01
Data assimilation has become a popular method to integrate observations from multiple sources with land surface models to improve predictions of the water and energy cycles of the soil-vegetation-atmosphere continuum. Multivariate data assimilation refers to the simultaneous assimilation of observation data from multiple model state variables into a simulation model. In recent years, several land data assimilation systems have been developed in different research agencies. Because of the software availability or adaptability, these systems are not easy to apply for the purpose of multivariate land data assimilation research. We developed an open source multivariate land data assimilation framework (DasPy) which is implemented using the Python scripting language mixed with the C++ and Fortran programming languages. LETKF (Local Ensemble Transform Kalman Filter) is implemented as the main data assimilation algorithm, and uncertainties in the data assimilation can be introduced by perturbed atmospheric forcing data, and represented by perturbed soil and vegetation parameters and model initial conditions. The Community Land Model (CLM) was integrated as the model operator. The implementation also allows parameter estimation (soil properties and/or leaf area index) on the basis of the joint state and parameter estimation approach. The Community Microwave Emission Modelling platform (CMEM), COsmic-ray Soil Moisture Interaction Code (COSMIC) and the Two-Source Formulation (TSF) were integrated as observation operators for the assimilation of L-band passive microwave, cosmic-ray soil moisture probe and land surface temperature measurements, respectively. DasPy has been evaluated in several assimilation studies of neutron count intensity (soil moisture), L-band brightness temperature and land surface temperature. DasPy is parallelized using the hybrid Message Passing Interface and Open Multi-Processing techniques. 
All the input and output data flows are organized efficiently using the commonly used NetCDF file format. Online 1-D and 2-D visualization of data assimilation results is also implemented to facilitate the post simulation analysis. In summary, DasPy is a ready to use open source parallel multivariate land data assimilation framework.
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach
ERIC Educational Resources Information Center
Klauer, Karl Christoph
2010-01-01
Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…
A model for incomplete longitudinal multivariate ordinal data.
Liu, Li C
2008-12-30
In studies where multiple outcome items are repeatedly measured over time, missing data often occur. A longitudinal item response theory model is proposed for analysis of multivariate ordinal outcomes that are repeatedly measured. Under the MAR assumption, this model accommodates missing data at any level (missing item at any time point and/or missing time point). It allows for multiple random subject effects and the estimation of item discrimination parameters for the multiple outcome items. The covariates in the model can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is described utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher-scoring solution, which provides standard errors for all model parameters, is used. A data set from a longitudinal prevention study is used to motivate the application of the proposed model. In this study, multiple ordinal items of health behavior are repeatedly measured over time. Because of a planned missing design, subjects answered only two-thirds of all items at a given point. Copyright 2008 John Wiley & Sons, Ltd.
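The role of Gauss-Hermite quadrature in integrating out a normal random effect can be illustrated with a one-dimensional sketch. The function names and the single-intercept logistic model below are assumptions made for illustration; the model described above uses a multidimensional version with ordinal responses.

```python
import numpy as np

def normal_expectation(f, sigma, n_points=20):
    """Approximate E[f(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    hermgauss returns nodes and weights for the weight function exp(-x^2);
    the substitution b = sqrt(2) * sigma * x maps it onto the normal density.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    return np.sum(weights * f(np.sqrt(2.0) * sigma * nodes)) / np.sqrt(np.pi)

def marginal_prob(eta, sigma, n_points=20):
    """Marginal P(y = 1) for a logistic model with linear predictor eta + b,
    where the random subject effect b ~ N(0, sigma^2) is integrated out."""
    def logistic(b):
        return 1.0 / (1.0 + np.exp(-(eta + b)))
    return normal_expectation(logistic, sigma, n_points)

# Sanity check against a known closed form: E[exp(b)] = exp(sigma^2 / 2).
approx = normal_expectation(np.exp, sigma=1.0)
exact = np.exp(0.5)
# By symmetry, the marginal probability at eta = 0 is one half.
p_half = marginal_prob(0.0, sigma=1.0)
```

In a marginal likelihood, the quantity being averaged would be the conditional likelihood of a subject's whole response pattern rather than a single probability, but the quadrature step is the same.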
Hurtado Rúa, Sandra M; Mazumdar, Madhu; Strawderman, Robert L
2015-12-30
Bayesian meta-analysis is an increasingly important component of clinical research, with multivariate meta-analysis a promising tool for studies with multiple endpoints. Model assumptions, including the choice of priors, are crucial aspects of multivariate Bayesian meta-analysis (MBMA) models. In a given model, two different prior distributions can lead to different inferences about a particular parameter. A simulation study was performed in which the impact of families of prior distributions for the covariance matrix of a multivariate normal random effects MBMA model was analyzed. Inferences about effect sizes were not particularly sensitive to prior choice, but the related covariance estimates were. A few families of prior distributions with small relative biases, tight mean squared errors, and close to nominal coverage for the effect size estimates were identified. Our results demonstrate the need for sensitivity analysis and suggest some guidelines for choosing prior distributions in this class of problems. The MBMA models proposed here are illustrated in a small meta-analysis example from the periodontal field and a medium meta-analysis from the study of stroke. Copyright © 2015 John Wiley & Sons, Ltd.
Ayalew, Wondossen; Aliy, Mohammed; Negussie, Enyew
2017-11-01
This study estimated the genetic parameters for productive and reproductive traits. The data included production and reproduction records of animals that calved between 1979 and 2013. The genetic parameters were estimated using the multivariate mixed models (DMU) package, fitting univariate and multivariate mixed models with the average information restricted maximum likelihood algorithm. The estimates of heritability for milk production traits from the first three lactation records were 0.03±0.03 for lactation length (LL), 0.17±0.04 for lactation milk yield (LMY), and 0.15±0.04 for 305-day milk yield (305-d MY). For reproductive traits the heritability estimates were 0.09±0.03 for days open (DO), 0.11±0.04 for calving interval (CI), and 0.47±0.06 for age at first calving (AFC). The repeatability estimates for production traits were 0.12±0.02 for LL, 0.39±0.02 for LMY, and 0.25±0.02 for 305-d MY. For reproductive traits the estimates of repeatability were 0.19±0.02 for DO and 0.23±0.02 for CI. The phenotypic correlations between production and reproduction traits ranged from 0.08±0.04 for LL and AFC to 0.42±0.02 for LL and DO. The genetic correlations among production traits were generally high (>0.7) and between reproductive traits the estimates ranged from 0.06±0.13 for AFC and DO to 0.99±0.01 between CI and DO. Genetic correlations of productive traits with reproductive traits ranged from -0.02 to 0.99. The high heritability estimates observed for AFC indicated that reasonable genetic improvement for this trait might be possible through selection. The h2 and r estimates for reproductive traits differed slightly between single-trait and multi-trait analyses of reproductive traits with production traits. As the single-trait method is biased due to selection on milk yield, a multi-trait evaluation of fertility with milk yield is recommended.
Optimal Tuner Selection for Kalman-Filter-Based Aircraft Engine Performance Estimation
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Garg, Sanjay
2011-01-01
An emerging approach in the field of aircraft engine controls and system health management is the inclusion of real-time, onboard models for the inflight estimation of engine performance variations. This technology, typically based on Kalman-filter concepts, enables the estimation of unmeasured engine performance parameters that can be directly utilized by controls, prognostics, and health-management applications. A challenge that complicates this practice is the fact that an aircraft engine's performance is affected by its level of degradation, generally described in terms of unmeasurable health parameters such as efficiencies and flow capacities related to each major engine module. Through Kalman-filter-based estimation techniques, the level of engine performance degradation can be estimated, given that there are at least as many sensors as health parameters to be estimated. However, in an aircraft engine, the number of sensors available is typically less than the number of health parameters, presenting an under-determined estimation problem. A common approach to address this shortcoming is to estimate a subset of the health parameters, referred to as model tuning parameters. The problem/objective is to optimally select the model tuning parameters to minimize Kalman-filter-based estimation error. A tuner selection technique has been developed that specifically addresses the under-determined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine that seeks to minimize the theoretical mean-squared estimation error of the Kalman filter. 
This approach can significantly reduce the error in onboard aircraft engine parameter estimation applications such as model-based diagnostic, controls, and life usage calculations. The advantage of the innovation is the significant reduction in estimation errors that it can provide relative to the conventional approach of selecting a subset of health parameters to serve as the model tuning parameter vector. Because this technique needs only to be performed during the system design process, it places no additional computation burden on the onboard Kalman filter implementation. The technique has been developed for aircraft engine onboard estimation applications, as this application typically presents an under-determined estimation problem. However, this generic technique could be applied to other industries using gas turbine engine technology.
Thermal signature identification system (TheSIS): a spread spectrum temperature cycling method
NASA Astrophysics Data System (ADS)
Merritt, Scott
2015-03-01
NASA GSFC's Thermal Signature Identification System (TheSIS) 1) measures the high order dynamic responses of optoelectronic components to direct sequence spread-spectrum temperature cycling, 2) estimates the parameters of multiple autoregressive moving average (ARMA) or other models of the responses, and 3) selects the most appropriate model using the Akaike Information Criterion (AIC). Using the AIC-tested model and parameter vectors from TheSIS, one can 1) select high-performing components on a multivariate basis, i.e., with multivariate Figures of Merit (FOMs), 2) detect subtle reversible shifts in performance, and 3) investigate irreversible changes in component or subsystem performance, e.g. aging. We show examples of the TheSIS methodology for passive and active components and systems, e.g. fiber Bragg gratings (FBGs) and DFB lasers with coupled temperature control loops, respectively.
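The estimate-then-select step can be sketched with AIC applied to candidate models of increasing order. As a simplifying assumption, the sketch below uses pure autoregressive (AR) models fitted by least squares to a simulated zero-mean series instead of full ARMA models; the function name `ar_aic` and the simulated AR(2) response are hypothetical and are not taken from TheSIS.

```python
import numpy as np

def ar_aic(y, order, max_order):
    """AIC of a zero-mean AR(order) model fitted by least squares.

    All candidate orders are scored on the same samples y[max_order:], so
    their AIC values are directly comparable.
    """
    target = y[max_order:]
    n = target.size
    if order == 0:
        resid = target
    else:
        # Column k-1 holds the series lagged by k samples.
        X = np.column_stack([y[max_order - k:y.size - k]
                             for k in range(1, order + 1)])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ coef
    rss = float(resid @ resid)
    # Gaussian AIC up to an additive constant; parameters = order + noise variance.
    return n * np.log(rss / n) + 2 * (order + 1)

# Simulated response with true order 2 (illustrative coefficients).
rng = np.random.default_rng(2)
y = np.zeros(2000)
for t in range(2, y.size):
    y[t] = 0.6 * y[t - 1] - 0.5 * y[t - 2] + rng.standard_normal()

max_order = 6
aic = {p: ar_aic(y, p, max_order) for p in range(max_order + 1)}
best_order = min(aic, key=aic.get)
```

AIC rewards a lower residual variance but charges two units per extra parameter, so orders that are too low are penalized by poor fit and orders that are too high by the parameter count.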
NASA Astrophysics Data System (ADS)
Wahl, Thomas; Jensen, Jürgen; Mudersbach, Christoph
2010-05-01
Storm surges along the German North Sea coastline led to major damages in the past and the risk of inundation is expected to increase in the course of an ongoing climate change. The knowledge of the characteristics of possible storm surges is essential for the performance of integrated risk analyses, e.g. based on the source-pathway-receptor concept. The latter includes the storm surge simulation/analyses (source), modelling of dike/dune breach scenarios (pathway) and the quantification of potential losses (receptor). In subproject 1b of the German joint research project XtremRisK (www.xtremrisk.de), a stochastic storm surge generator for the south-eastern North Sea area is developed. The input data for the multivariate model are high resolution sea level observations from tide gauges during extreme events. Based on 25 parameters (19 sea level parameters and 6 time parameters) observed storm surge hydrographs consisting of three tides are parameterised. Followed by the adaptation of common parametric probability distributions and a large number of Monte Carlo simulations, the final reconstruction leads to a set of 100,000 (default) synthetic storm surge events with a one-minute resolution. Such a data set can potentially serve as the basis for a large number of applications. For risk analyses, storm surges with peak water levels exceeding the design water levels are of special interest. The occurrence probabilities of the simulated extreme events are estimated based on multivariate statistics, considering the parameters "peak water level" and "fullness/intensity". In the past, most studies considered only the peak water levels during extreme events, which might not be the most important parameter in all cases. Here, a 2D Archimedean copula model is used for the estimation of the joint probabilities of the selected parameters, accounting for the structures of dependence overlooking the margins. 
In coordination with subproject 1a, the results will be used as the input for the XtremRisK subprojects 2 to 4. The project is funded by the German Federal Ministry of Education and Research (BMBF) (Project No. 03 F 0483 B).
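How an Archimedean copula turns two marginal non-exceedance probabilities into a joint exceedance probability can be sketched as follows. The abstract does not say which Archimedean family is used, so the Gumbel copula, the dependence parameter values, and the probability levels below are assumptions chosen purely for illustration.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for 0 < u, v < 1 and theta >= 1.

    theta = 1 corresponds to independence, C(u, v) = u * v.
    """
    a = (-math.log(u)) ** theta
    b = (-math.log(v)) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))

def joint_exceedance(u, v, theta):
    """P(U > u, V > v) for uniform margins U, V coupled by the Gumbel copula."""
    return 1.0 - u - v + gumbel_copula(u, v, theta)

# Hypothetical marginal non-exceedance probabilities, e.g. for peak water
# level (0.99) and fullness (0.95) of a storm surge event.
p_indep = joint_exceedance(0.99, 0.95, theta=1.0)  # independent parameters
p_dep = joint_exceedance(0.99, 0.95, theta=2.0)    # positively dependent parameters
```

With positive dependence, the probability that both parameters are extreme at once is larger than the product of the two marginal exceedance probabilities, which is why ignoring the dependence can understate the joint risk.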
Park, Eun Sug; Hopke, Philip K; Oh, Man-Suk; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford H
2014-07-01
There has been increasing interest in assessing health effects associated with multiple air pollutants emitted by specific sources. A major difficulty with achieving this goal is that the pollution source profiles are unknown and source-specific exposures cannot be measured directly; rather, they need to be estimated by decomposing ambient measurements of multiple air pollutants. This estimation process, called multivariate receptor modeling, is challenging because of the unknown number of sources and unknown identifiability conditions (model uncertainty). The uncertainty in source-specific exposures (source contributions) as well as uncertainty in the number of major pollution sources and identifiability conditions have been largely ignored in previous studies. A multipollutant approach that can deal with model uncertainty in multivariate receptor models while simultaneously accounting for parameter uncertainty in estimated source-specific exposures in assessment of source-specific health effects is presented in this paper. The methods are applied to daily ambient air measurements of the chemical composition of fine particulate matter (PM2.5), weather data, and counts of cardiovascular deaths from 1995 to 1997 for Phoenix, AZ, USA. Our approach for evaluating source-specific health effects yields not only estimates of source contributions along with their uncertainties and associated health effects estimates but also estimates of model uncertainty (posterior model probabilities) that have been ignored in previous studies. The results from our methods agreed in general with those from the previously conducted workshop/studies on the source apportionment of PM health effects in terms of number of major contributing sources, estimated source profiles, and contributions. 
However, some of the adverse source-specific health effects identified in the previous studies were not statistically significant in our analysis, probably because we incorporated into the estimation of health effects parameters the parameter uncertainty in estimated source contributions that the previous studies ignored. © The Author 2014. Published by Oxford University Press.
Bansal, Ravi; Staib, Lawrence H.; Laine, Andrew F.; Xu, Dongrong; Liu, Jun; Posecion, Lainie F.; Peterson, Bradley S.
2010-01-01
Images from different individuals typically cannot be registered precisely because anatomical features within the images differ across the people imaged and because the current methods for image registration have inherent technological limitations that interfere with perfect registration. Quantifying the inevitable error in image registration is therefore of crucial importance in assessing the effects that image misregistration may have on subsequent analyses in an imaging study. We have developed a mathematical framework for quantifying errors in registration by computing the confidence intervals of the estimated parameters (3 translations, 3 rotations, and 1 global scale) for the similarity transformation. The presence of noise in images and the variability in anatomy across individuals ensures that estimated registration parameters are always random variables. We assume a functional relation among intensities across voxels in the images, and we use the theory of nonlinear, least-squares estimation to show that the parameters are multivariate Gaussian distributed. We then use the covariance matrix of this distribution to compute the confidence intervals of the transformation parameters. These confidence intervals provide a quantitative assessment of the registration error across the images. Because transformation parameters are nonlinearly related to the coordinates of landmark points in the brain, we subsequently show that the coordinates of those landmark points are also multivariate Gaussian distributed. Using these distributions, we then compute the confidence intervals of the coordinates for landmark points in the image. Each of these confidence intervals in turn provides a quantitative assessment of the registration error at a particular landmark point. Because our method is computationally intensive, however, its current implementation is limited to assessing the error of the parameters in the similarity transformation across images. 
We assessed the performance of our method in computing the error in estimated similarity parameters by applying that method to a real-world dataset. Our results showed that the size of the confidence intervals computed using our method decreased – i.e. our confidence in the registration of images from different individuals increased – for increasing amounts of blur in the images. Moreover, the size of the confidence intervals increased for increasing amounts of noise, misregistration, and differing anatomy. Thus, our method precisely quantified confidence in the registration of images that contain varying amounts of misregistration and varying anatomy across individuals. PMID:19138877
NASA Astrophysics Data System (ADS)
Bi, Yiming; Tang, Liang; Shan, Peng; Xie, Qiong; Hu, Yong; Peng, Silong; Tan, Jie; Li, Changwen
2014-08-01
Interference such as baseline drift and light scattering can degrade the model predictability in multivariate analysis of near-infrared (NIR) spectra. Usually interference can be represented by an additive and a multiplicative factor. To eliminate these interferences, correction parameters need to be estimated from the spectra. However, the spectra are often a mixture of physical light scattering effects and chemical light absorbance effects, which makes parameter estimation difficult. Herein, a novel algorithm is proposed to automatically find a spectral region in which the chemical absorbance of interest and the noise are low, that is, an interference dominant region (IDR). Based on the definition of IDR, a two-step method is proposed to find the optimal IDR and the corresponding correction parameters estimated from the IDR. Finally, the correction is applied to the full spectral range using the previously obtained parameters for the calibration set and test set, respectively. The method can be applied to multi-target systems with one IDR suitable for all targeted analytes. Tested on two benchmark data sets of near-infrared spectra, the proposed method provided considerable improvement over full spectral estimation methods and performance comparable with other state-of-the-art methods.
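The additive and multiplicative correction described in this abstract can be illustrated with a minimal sketch. It assumes the interference-dominant region is already known and that a reference spectrum is available; the region search itself, the function names, and the toy data are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def correct_spectrum(spectrum, reference, region):
    """Estimate the additive (a) and multiplicative (b) factors on
    `region` only, then apply the correction to the full spectrum."""
    lo, hi = region
    a, b = fit_line(reference[lo:hi], spectrum[lo:hi])
    return [(s - a) / b for s in spectrum]


# Hypothetical reference spectrum (arbitrary absorbance units).
reference = [0.10, 0.12, 0.15, 0.40, 0.90, 0.35]
# Simulated measurement: the reference distorted by offset 0.05 and scale 1.2.
measured = [0.05 + 1.2 * r for r in reference]
# Assumption: the first three channels form the interference-dominant region.
corrected = correct_spectrum(measured, reference, region=(0, 3))
```

Because the toy distortion is exactly linear, the correction fitted on three channels recovers the whole reference spectrum; with real spectra the quality of the result depends on how well the chosen region isolates the interference.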
An improved method for bivariate meta-analysis when within-study correlations are unknown.
Hong, Chuan; D Riley, Richard; Chen, Yong
2018-03-01
Multivariate meta-analysis, which jointly analyzes multiple and possibly correlated outcomes in a single analysis, is becoming increasingly popular in recent years. An attractive feature of the multivariate meta-analysis is its ability to account for the dependence between multiple estimates from the same study. However, standard inference procedures for multivariate meta-analysis require the knowledge of within-study correlations, which are usually unavailable. This limits standard inference approaches in practice. Riley et al proposed a working model and an overall synthesis correlation parameter to account for the marginal correlation between outcomes, where the only data needed are those required for a separate univariate random-effects meta-analysis. As within-study correlations are not required, the Riley method is applicable to a wide variety of evidence synthesis situations. However, the standard variance estimator of the Riley method is not entirely correct under many important settings. As a consequence, the coverage of a function of pooled estimates may not reach the nominal level even when the number of studies in the multivariate meta-analysis is large. In this paper, we improve the Riley method by proposing a robust variance estimator, which is asymptotically correct even when the model is misspecified (ie, when the likelihood function is incorrect). Simulation studies of a bivariate meta-analysis, in a variety of settings, show a function of pooled estimates has improved performance when using the proposed robust variance estimator. In terms of individual pooled estimates themselves, the standard variance estimator and robust variance estimator give similar results to the original method, with appropriate coverage. The proposed robust variance estimator performs well when the number of studies is relatively large. Therefore, we recommend the use of the robust method for meta-analyses with a relatively large number of studies (eg, m≥50). 
When the sample size is relatively small, we recommend the use of the robust method under the working independence assumption. We illustrate the proposed method through 2 meta-analyses. Copyright © 2017 John Wiley & Sons, Ltd.
PyDREAM: high-dimensional parameter inference for biological models in python.
Shockley, Erin M; Vrugt, Jasper A; Lopez, Carlos F; Valencia, Alfonso
2018-02-15
Biological models contain many parameters whose values are difficult to measure directly via experimentation and therefore require calibration against experimental data. Markov chain Monte Carlo (MCMC) methods are suitable to estimate multivariate posterior model parameter distributions, but these methods may exhibit slow or premature convergence in high-dimensional search spaces. Here, we present PyDREAM, a Python implementation of the (Multiple-Try) Differential Evolution Adaptive Metropolis [DREAM(ZS)] algorithm developed by Vrugt and ter Braak (2008) and Laloy and Vrugt (2012). PyDREAM achieves excellent performance for complex, parameter-rich models and takes full advantage of distributed computing resources, facilitating parameter inference and uncertainty estimation of CPU-intensive biological models. PyDREAM is freely available under the GNU GPLv3 license from the Lopez lab GitHub repository at http://github.com/LoLab-VU/PyDREAM. Contact: c.lopez@vanderbilt.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
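For readers unfamiliar with MCMC calibration, the following minimal sketch shows a plain random-walk Metropolis sampler. It is not the DREAM(ZS) algorithm and does not use the PyDREAM API; the target density, step size, and chain length are hypothetical choices made only for illustration.

```python
import math
import random


def metropolis(log_post, start, n_steps, step_size, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional parameter."""
    rng = random.Random(seed)
    current = start
    current_lp = log_post(current)
    samples = []
    for _ in range(n_steps):
        # Propose a Gaussian perturbation of the current value.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def log_post(theta):
    """Hypothetical target: unnormalized log density of N(2.0, 0.5**2)."""
    return -0.5 * ((theta - 2.0) / 0.5) ** 2


chain = metropolis(log_post, start=0.0, n_steps=20000, step_size=0.5)
kept = chain[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

A single chain with a fixed proposal scale is what tends to converge slowly in high dimensions; adaptive multi-chain schemes such as the one the abstract describes are designed to address that limitation.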
Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob
2016-08-01
The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks and mixtures of Gaussian time series data with known interactions. It is applied to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a non-parametric approach to estimation of directed interactions in multivariate neuronal recordings, and increased flexibility in dealing with both spike train and time series data. The framework offers a novel alternative non-parametric approach to estimate directed interactions in multivariate neuronal recordings, and is applicable to spike train and time series data. Copyright © 2016 Elsevier B.V. All rights reserved.
DasPy – Open Source Multivariate Land Data Assimilation Framework with High Performance Computing
NASA Astrophysics Data System (ADS)
Han, Xujun; Li, Xin; Montzka, Carsten; Kollet, Stefan; Vereecken, Harry; Hendricks Franssen, Harrie-Jan
2015-04-01
Data assimilation has become a popular method to integrate observations from multiple sources with land surface models to improve predictions of the water and energy cycles of the soil-vegetation-atmosphere continuum. In recent years, several land data assimilation systems have been developed in different research agencies. Because of limited software availability or adaptability, these systems are not easy to apply for the purpose of multivariate land data assimilation research. Multivariate data assimilation refers to the simultaneous assimilation of observation data for multiple model state variables into a simulation model. Our main motivation was to develop an open source multivariate land data assimilation framework (DasPy), which is implemented in Python mixed with C++ and Fortran. This system has been evaluated in several soil moisture, L-band brightness temperature and land surface temperature assimilation studies. The implementation also allows parameter estimation (soil properties and/or leaf area index) on the basis of the joint state and parameter estimation approach. LETKF (Local Ensemble Transform Kalman Filter) is implemented as the main data assimilation algorithm, and uncertainties in the data assimilation can be represented by perturbed atmospheric forcings, perturbed soil and vegetation properties and model initial conditions. The CLM4.5 (Community Land Model) was integrated as the model operator. The CMEM (Community Microwave Emission Modelling Platform), COSMIC (COsmic-ray Soil Moisture Interaction Code) and the two source formulation were integrated as observation operators for assimilation of L-band passive microwave, cosmic-ray soil moisture probe and land surface temperature measurements, respectively. DasPy is parallelized using the hybrid MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) techniques. 
All the input and output data flow is organized efficiently using the commonly used NetCDF file format. Online 1D and 2D visualization of data assimilation results is also implemented to facilitate the post simulation analysis. In summary, DasPy is a ready to use open source parallel multivariate land data assimilation framework.
Detecting event-related changes of multivariate phase coupling in dynamic brain networks.
Canolty, Ryan T; Cadieu, Charles F; Koepsell, Kilian; Ganguly, Karunesh; Knight, Robert T; Carmena, Jose M
2012-04-01
Oscillatory phase coupling within large-scale brain networks is a topic of increasing interest within systems, cognitive, and theoretical neuroscience. Evidence shows that brain rhythms play a role in controlling neuronal excitability and response modulation (Haider B, McCormick D. Neuron 62: 171-189, 2009) and regulate the efficacy of communication between cortical regions (Fries P. Trends Cogn Sci 9: 474-480, 2005) and distinct spatiotemporal scales (Canolty RT, Knight RT. Trends Cogn Sci 14: 506-515, 2010). In this view, anatomically connected brain areas form the scaffolding upon which neuronal oscillations rapidly create and dissolve transient functional networks (Lakatos P, Karmos G, Mehta A, Ulbert I, Schroeder C. Science 320: 110-113, 2008). Importantly, testing these hypotheses requires methods designed to accurately reflect dynamic changes in multivariate phase coupling within brain networks. Unfortunately, phase coupling between neurophysiological signals is commonly investigated using suboptimal techniques. Here we describe how a recently developed probabilistic model, phase coupling estimation (PCE; Cadieu C, Koepsell K Neural Comput 44: 3107-3126, 2010), can be used to investigate changes in multivariate phase coupling, and we detail the advantages of this model over the commonly employed phase-locking value (PLV; Lachaux JP, Rodriguez E, Martinerie J, Varela F. Human Brain Map 8: 194-208, 1999). We show that the N-dimensional PCE is a natural generalization of the inherently bivariate PLV. Using simulations, we show that PCE accurately captures both direct and indirect (network mediated) coupling between network elements in situations where PLV produces erroneous results. We present empirical results on recordings from humans and nonhuman primates and show that the PCE-estimated coupling values are different from those using the bivariate PLV. 
Critically on these empirical recordings, PCE output tends to be sparser than the PLVs, indicating fewer significant interactions and perhaps a more parsimonious description of the data. Finally, the physical interpretation of PCE parameters is straightforward: the PCE parameters correspond to interaction terms in a network of coupled oscillators. Forward modeling of a network of coupled oscillators with parameters estimated by PCE generates synthetic data with statistical characteristics identical to empirical signals. Given these advantages over the PLV, PCE is a useful tool for investigating multivariate phase coupling in distributed brain networks.
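As a point of reference for the comparison above, the bivariate phase-locking value is the magnitude of the mean unit phasor of the phase differences between two signals. The sketch below computes it for two phase series; the function name and the toy phase data are hypothetical, and the PCE model itself is not implemented here.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """PLV = | mean over samples of exp(i * (phase_a - phase_b)) |."""
    n = len(phases_a)
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / n


# Toy phases (radians) spaced evenly around the circle.
phases = [2 * math.pi * k / 8 for k in range(8)]

# A constant phase lag of 0.3 rad gives perfect locking.
locked = phase_locking_value(phases, [p - 0.3 for p in phases])

# Phase differences spread evenly around the circle cancel out.
unlocked = phase_locking_value(phases, [0.0] * 8)
```

Because the measure takes exactly two phase series, it cannot by itself separate direct coupling from coupling mediated by a third signal, which is the limitation the abstract attributes to the PLV.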
Cooley, Richard L.
1993-01-01
A new method is developed to efficiently compute exact Scheffé-type confidence intervals for output (or other function of parameters) g(β) derived from a groundwater flow model. The method is general in that parameter uncertainty can be specified by any statistical distribution having a log probability density function (log pdf) that can be expanded in a Taylor series. However, for this study parameter uncertainty is specified by a statistical multivariate beta distribution that incorporates hydrogeologic information in the form of the investigator's best estimates of parameters and a grouping of random variables representing possible parameter values so that each group is defined by maximum and minimum bounds and an ordering according to increasing value. The new method forms the confidence intervals from maximum and minimum limits of g(β) on a contour of a linear combination of (1) the quadratic form for the parameters used by Cooley and Vecchia (1987) and (2) the log pdf for the multivariate beta distribution. Three example problems are used to compare characteristics of the confidence intervals for hydraulic head obtained using different weights for the linear combination. Different weights generally produced similar confidence intervals, whereas the method of Cooley and Vecchia (1987) often produced much larger confidence intervals.
Walling, Craig A; Morrissey, Michael B; Foerster, Katharina; Clutton-Brock, Tim H; Pemberton, Josephine M; Kruuk, Loeske E B
2014-12-01
Evolutionary theory predicts that genetic constraints should be widespread, but empirical support for their existence is surprisingly rare. Commonly applied univariate and bivariate approaches to detecting genetic constraints can underestimate their prevalence, with important aspects potentially tractable only within a multivariate framework. However, multivariate genetic analyses of data from natural populations are challenging because of modest sample sizes, incomplete pedigrees, and missing data. Here we present results from a study of a comprehensive set of life history traits (juvenile survival, age at first breeding, annual fecundity, and longevity) for both males and females in a wild, pedigreed, population of red deer (Cervus elaphus). We use factor analytic modeling of the genetic variance-covariance matrix (G) to reduce the dimensionality of the problem and take a multivariate approach to estimating genetic constraints. We consider a range of metrics designed to assess the effect of G on the deflection of a predicted response to selection away from the direction of fastest adaptation and on the evolvability of the traits. We found limited support for genetic constraint through genetic covariances between traits, both within sex and between sexes. We discuss these results with respect to other recent findings and to the problems of estimating these parameters for natural populations. Copyright © 2014 Walling et al.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Qichun; Zhou, Jinglin; Wang, Hong
In this paper, stochastic coupling attenuation is investigated for a class of multi-variable bilinear stochastic systems and a novel output feedback m-block backstepping controller with linear estimator is designed, where gradient descent optimization is used to tune the design parameters of the controller. It has been shown that the trajectories of the closed-loop stochastic systems are bounded in probability sense and the stochastic coupling of the system outputs can be effectively attenuated by the proposed control algorithm. Moreover, the stability of the stochastic systems is analyzed and the effectiveness of the proposed method has been demonstrated using a simulated example.
Modeling a multivariable reactor and on-line model predictive control.
Yu, D W; Yu, D L
2005-10-01
A nonlinear first-principles model is developed for a laboratory-scale multivariable chemical reactor rig in this paper, and on-line model predictive control (MPC) is implemented on the rig. The reactor has three variables (temperature, pH, and dissolved oxygen) with nonlinear dynamics and is therefore used as a pilot system for the biochemical industry. A nonlinear discrete-time model is derived for each of the three output variables and their model parameters are estimated from the real data using an adaptive optimization method. The developed model is used in a nonlinear MPC scheme. An accurate multistep-ahead prediction is obtained for MPC, where the extended Kalman filter is used to estimate the unknown system states. The on-line control is implemented and a satisfactory tracking performance is achieved. The MPC is compared with three decentralized PID controllers and the advantage of the nonlinear MPC over the PID is clearly shown.
Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.
Liu, Han; Wang, Lie; Zhao, Tuo
2015-08-01
We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ϵ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Graphical Models for Ordinal Data
Guo, Jian; Levina, Elizaveta; Michailidis, George; Zhu, Ji
2014-01-01
A graphical model for ordinal variables is considered, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost. Numerical evidence based on simulation studies shows the strong performance of the algorithm, which is also illustrated on data sets on movie ratings and an educational survey. PMID:26120267
Bayesian statistics and Monte Carlo methods
NASA Astrophysics Data System (ADS)
Koch, K. R.
2018-03-01
The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to the point estimation by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables; in traditional statistics, which is not founded on Bayes' theorem, they are fixed quantities. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods, of course, can also be applied in traditional statistics. The unknown parameters are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived where the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value for an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by the Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of the linearization are avoided. The Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known. The Monte Carlo methods are then needed to obtain the covariance matrix and the expectation of the sum.
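The error propagation application mentioned in this abstract can be sketched as follows: draw random variates of the input vector, push each draw through the nonlinear transformation, and take the sample mean and sample covariance of the results. The polar-to-Cartesian transformation, the independent normal inputs, and all numerical values are assumptions chosen for illustration, not the paper's example.

```python
import math
import random


def propagate(transform, means, sigmas, n_draws=50000, seed=1):
    """Monte Carlo estimate of the expectation and covariance matrix of
    transform(x), where x has independent normal components."""
    rng = random.Random(seed)
    draws = [
        transform([rng.gauss(m, s) for m, s in zip(means, sigmas)])
        for _ in range(n_draws)
    ]
    dim = len(draws[0])
    mean = [sum(d[k] for d in draws) / n_draws for k in range(dim)]
    cov = [
        [
            sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in draws) / (n_draws - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    return mean, cov


def polar_to_cartesian(v):
    """Nonlinear transformation from (range, angle) to (x, y)."""
    r, angle = v
    return [r * math.cos(angle), r * math.sin(angle)]


# Hypothetical measurements: range 10.0 +/- 0.1, angle pi/4 +/- 0.01 rad.
mean, cov = propagate(
    polar_to_cartesian, means=[10.0, math.pi / 4], sigmas=[0.1, 0.01]
)
```

No partial derivatives of the transformation are needed, which is the saving the abstract refers to; the price is sampling noise that shrinks as the number of draws grows.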
NASA Astrophysics Data System (ADS)
Ghosh, Sandipan; Bhattacharya, Kamala
2012-12-01
Each geomorphic hazard involves a degree of risk which incorporates quantification of the probability that a hazard will be harmful. At present, the categorization of sub-watersheds by erosion risk is considered the fundamental step in soil conservation. The development of badlands over the laterites of Birbhum district is indicative of excessive soil loss in the monsoonal wet-dry type of climate. Slope erosion and channel erosion have generated huge amounts of sediment from the small watersheds during intense monsoonal rainfall (June-September). The adjoining areas of Rampurhat I Block, Birbhum (West Bengal) and Shikaripara Block, Dumka (Jharkhand) have lost the lateritic soil cover at a rate of 20-40 ton/ha/year (Sarkar et al. 2005). In order to estimate the progressive removal of soil particles from the gully-catchments of the above-mentioned area, different morphometric parameters, soil parameters, hydrologic parameters and empirical models are employed. In parallel, the study categorizes the gully-catchments into different magnitudes of erosion risk using several multivariate statistical techniques.
Alpha-canonical form representation of the open loop dynamics of the Space Shuttle main engine
NASA Technical Reports Server (NTRS)
Duyar, Almet; Eldem, Vasfi; Merrill, Walter C.; Guo, Ten-Huei
1991-01-01
A parameter and structure estimation technique for multivariable systems is used to obtain a state space representation of open loop dynamics of the space shuttle main engine in alpha-canonical form. The parameterization being used is both minimal and unique. The simplified linear model may be used for fault detection studies and control system design and development.
Smith, Jason F.; Chen, Kewei; Pillai, Ajay S.; Horwitz, Barry
2013-01-01
The number and variety of connectivity estimation methods is likely to continue to grow over the coming decade. Comparisons between methods are necessary to prune this growth to only the most accurate and robust methods. However, the nature of connectivity is elusive, with different methods potentially attempting to identify different aspects of connectivity. Commonalities of connectivity definitions across methods, upon which to base direct comparisons, can be difficult to derive. Here, we explicitly define “effective connectivity” using a common set of observation and state equations that are appropriate for three connectivity methods: dynamic causal modeling (DCM), multivariate autoregressive modeling (MAR), and switching linear dynamic systems for fMRI (sLDSf). In addition, while deriving this set, we show how many other popular functional and effective connectivity methods are actually simplifications of these equations. We discuss implications of these connections for the practice of using one method to simulate data for another method. After mathematically connecting the three effective connectivity methods, simulated fMRI data with varying numbers of regions and task conditions is generated from the common equation. This simulated data explicitly contains the type of the connectivity that the three models were intended to identify. Each method is applied to the simulated data sets and the accuracy of parameter identification is analyzed. All methods perform above chance levels at identifying correct connectivity parameters. The sLDSf method was superior in parameter estimation accuracy to both DCM and MAR for all types of comparisons. PMID:23717258
Toward DSM-V: mapping the alcohol use disorder continuum in college students.
Hagman, Brett T; Cohn, Amy M
2011-11-01
The present study examined the dimensionality of DSM-IV Alcohol Use Disorder (AUD) criteria using Item Response Theory (IRT) methods and tested the validity of the proposed DSM-V AUD guidelines in a sample of college students. Participants were 396 college students who reported any alcohol use in the past 90 days and were aged 18 years or older. We conducted factor analyses to determine whether a one- or two-factor model provided a better fit to the AUD criteria. IRT analyses estimated item severity and discrimination parameters for each criterion. Multivariate analyses examined differences among the DSM-V diagnostic cut-off (AUD vs. No AUD) and severity qualifiers (no diagnosis, moderate, severe) across several validating measures of alcohol use. A dominant single-factor model provided the best fit to the AUD criteria. IRT analyses indicated that abuse and dependence criteria were intermixed along the latent continuum. The "legal problems" criterion had the highest severity parameter and the tolerance criterion had the lowest severity parameter. The abuse criterion "social/interpersonal problems" and dependence criterion "activities to obtain alcohol" had the highest discrimination parameter estimates. Multivariate analysis indicated that the DSM-V cut-off point, and severity qualifier groups were distinguishable on several measures of alcohol consumption, drinking consequences, and drinking restraint. Findings suggest that the AUD criteria reflect a latent variable that represents a primary disorder and provide support for the proposed DSM-V AUD criteria in a sample of college students. Continued research in other high-risk samples of college students is needed. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
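To make the severity and discrimination parameters concrete, the sketch below evaluates a two-parameter logistic item response function. The abstract does not state which IRT model was fitted, so the two-parameter form is an assumption, and the parameter values are hypothetical rather than the study's estimates.

```python
import math


def item_response_probability(theta, discrimination, severity):
    """Two-parameter logistic item response function: probability that a
    person at latent level `theta` endorses the criterion."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - severity)))


# Hypothetical criteria: a low-severity item and a high-severity item.
low_severity = item_response_probability(0.0, discrimination=1.5, severity=-0.5)
high_severity = item_response_probability(0.0, discrimination=1.5, severity=2.0)

# At theta equal to the severity parameter, endorsement probability is 0.5.
at_threshold = item_response_probability(2.0, discrimination=1.5, severity=2.0)
```

Under this model a high-severity criterion is endorsed only by respondents far along the latent continuum, while a higher discrimination value makes the endorsement probability rise more steeply around the severity point.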
Fast clustering using adaptive density peak detection.
Wang, Xiao-Feng; Xu, Yifan
2017-12-01
Common limitations of clustering methods include slow algorithm convergence, instability with respect to the pre-specification of a number of intrinsic parameters, and lack of robustness to outliers. A recent clustering approach proposed a fast search algorithm of cluster centers based on their local densities. However, the selection of the key intrinsic parameters in the algorithm was not systematically investigated. It is relatively difficult to estimate the "optimal" parameters since the original definition of the local density in the algorithm is based on a truncated counting measure. In this paper, we propose a clustering procedure with adaptive density peak detection, where the local density is estimated through nonparametric multivariate kernel estimation. The model parameter can then be calculated from equations with statistical theoretical justification. We also develop an automatic cluster centroid selection method through maximizing an average silhouette index. The advantage and flexibility of the proposed method are demonstrated through simulation studies and the analysis of a few benchmark gene expression data sets. The method runs in a single step without iteration and is therefore fast, with great potential for big data analysis. A user-friendly R package, ADPclust, has been developed for public use.
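The density peak idea can be sketched in a few lines: estimate a local density for every point with a Gaussian kernel, compute each point's distance to the nearest point of higher density, and treat points that score high on both as candidate centers. This is a simplified illustration, not the ADPclust implementation: the bandwidth is fixed by hand rather than derived from the data, centers are picked by the product of density and distance instead of a silhouette index, and the toy points are hypothetical.

```python
import math


def density_peaks(points, bandwidth, n_centers):
    """Return indices of the `n_centers` points with the largest
    product of kernel density and distance to a denser point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Gaussian kernel density estimate (unnormalized) at each point.
    density = [
        sum(math.exp(-0.5 * (d / bandwidth) ** 2) for d in row) for row in dist
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if density[j] > density[i]]
        # The densest point has no denser neighbor; use its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    order = sorted(range(n), key=lambda i: density[i] * delta[i], reverse=True)
    return order[:n_centers]


# Two hypothetical groups: indices 0-4 near (0, 0), indices 5-8 near (5, 5).
points = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.05), (0.0, 0.12), (0.05, -0.1),
    (5.0, 5.0), (5.1, 5.0), (4.9, 5.08), (5.0, 4.9),
]
centers = density_peaks(points, bandwidth=0.5, n_centers=2)
```

Each remaining point would then be assigned to the cluster of its nearest denser neighbor in one pass, which is why the approach needs no iteration.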
Marginally specified priors for non-parametric Bayesian estimation
Kessler, David C.; Hoff, Peter D.; Dunson, David B.
2014-01-01
Summary Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
NASA Astrophysics Data System (ADS)
Mizukami, N.; Clark, M. P.; Newman, A. J.; Wood, A.; Gutmann, E. D.
2017-12-01
Estimating spatially distributed model parameters is a grand challenge for large domain hydrologic modeling, especially in the context of hydrologic model applications such as streamflow forecasting. Multi-scale Parameter Regionalization (MPR) is a promising technique that accounts for the effects of fine-scale geophysical attributes (e.g., soil texture, land cover, topography, climate) on model parameters and nonlinear scaling effects on model parameters. MPR computes model parameters with transfer functions (TFs) that relate geophysical attributes to model parameters at the native input data resolution and then scales them using scaling functions to the spatial resolution of the model implementation. One of the biggest challenges in the use of MPR is identification of TFs for each model parameter: both functional forms and geophysical predictors. TFs used to estimate the parameters of hydrologic models typically rely on previous studies or were derived in an ad-hoc, heuristic manner, potentially not utilizing maximum information content contained in the geophysical attributes for optimal parameter identification. Thus, it is necessary to first uncover relationships among geophysical attributes, model parameters, and hydrologic processes (i.e., hydrologic signatures) to obtain insight into which and to what extent geophysical attributes are related to model parameters. We perform multivariate statistical analysis on a large-sample catchment data set including various geophysical attributes as well as constrained VIC model parameters at 671 unimpaired basins over the CONUS. We first calibrate VIC model at each catchment to obtain constrained parameter sets. Additionally, parameter sets sampled during the calibration process are used for sensitivity analysis using various hydrologic signatures as objectives to understand the relationships among geophysical attributes, parameters, and hydrologic processes.
Bayesian experimental design for models with intractable likelihoods.
Drovandi, Christopher C; Pettitt, Anthony N
2013-12-01
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
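As a rough illustration of the ABC rejection step and the precision-based utility mentioned above, the sketch below uses a toy model (a normal mean with a uniform prior) that is assumed here purely for demonstration. The paper's epidemic and macroparasite models, its pre-computed simulation bank, and its design search are not reproduced; all names are hypothetical.

```python
import random
import statistics

def abc_rejection(observed, prior_draw, simulate, tol, n_draws, rng):
    """Keep prior draws whose simulated summary is within tol of the observed one.
    No likelihood is evaluated at any point."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        if abs(simulate(theta, rng) - observed) <= tol:
            accepted.append(theta)
    return accepted

def prior_draw(rng):
    # Assumed prior for the toy example: Uniform(0, 10).
    return rng.uniform(0.0, 10.0)

def simulate(theta, rng, n=20):
    # Toy model: summary statistic is the mean of n Normal(theta, 1) draws.
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n

rng = random.Random(0)
posterior = abc_rejection(4.0, prior_draw, simulate, tol=0.2, n_draws=5000, rng=rng)
# A precision-type utility: the inverse of the ABC posterior variance.
precision = 1.0 / statistics.variance(posterior)
print(len(posterior), statistics.mean(posterior), precision)
```

In a design setting, a utility like `precision` would be evaluated for each candidate design and the designs compared; that outer loop is omitted here.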
Zhang, Yongsheng; Wei, Heng; Zheng, Kangning
2017-01-01
Because metro network expansion provides more alternative routes, it is attractive to integrate the impacts of the route set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; and the error component follows a multivariate normal distribution whose covariance is structured into three parts, representing the correlation among routes, the transfer variance of the route, and the unobserved variance, respectively. Because the model involves multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in a hierarchical Bayes formulation, and a Markov chain Monte Carlo approach based on the Metropolis-Hastings (M-H) sampling algorithm is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model also shows good forecasting performance for the calculation of route choice probabilities and good application performance for transfer flow volume prediction. PMID:28591188
Dominguez-Rodriguez, Alberto; Thibodeau, Jennifer T; Ayers, Colby R; Jimenez-Sosa, Alejandro; Garrido, Pilar; Montoto, Javier; Prada-Arrondo, Pablo C; Abreu-Gonzalez, Pedro; Drazner, Mark H
2018-06-02
Bendopnea is a recently described symptom of advanced heart failure. Its prevalence and prognostic utility in other cardiac conditions are unknown. We prospectively enrolled 108 consecutive patients (75 ± 3 years, 68% men) with severe symptomatic aortic stenosis referred for surgical aortic valve replacement (SAVR). Preoperatively, patients were tested for bendopnea, which was considered to be present when dyspnoea occurred within 30 s of bending forward. Univariable and stepwise multivariable analyses tested the association of bendopnea with preoperative echocardiographic parameters and postoperative clinical outcomes. Bendopnea was present in 46 of 108 (42%) patients. The mean time of onset was 10.5 ± 3.4 s. Bendopnea was associated with higher estimated pulmonary artery systolic pressures [51 (11) mmHg vs 40 (11) mmHg, P < 0.0001], smaller aortic valve area [0.66 (0.16) cm² vs 0.76 (0.13) cm², P = 0.0006] and longer duration of mechanical ventilation (P = 0.002) and length of stay in the hospital (P = 0.007). Following SAVR, in-hospital mortality in those with bendopnea versus those without bendopnea was 13% vs 3% (P = 0.07). In multivariable analysis, bendopnea was associated with duration of mechanical ventilation (parameter estimate 2.4, P < 0.0001) and length of stay in the hospital (parameter estimate 10.2, P ≤ 0.0001). Bendopnea was present in a sizeable minority of patients (42%) with severe aortic stenosis referred for SAVR. Bendopnea was associated with higher pulmonary artery systolic pressure and smaller aortic valve area preoperatively and with longer duration of mechanical ventilation and length of hospitalization postoperatively. These data suggest that bendopnea provides prognostic information in patients with severe aortic stenosis undergoing SAVR.
Enhanced ID Pit Sizing Using Multivariate Regression Algorithm
NASA Astrophysics Data System (ADS)
Krzywosz, Kenji
2007-03-01
EPRI is funding a program to enhance and improve the reliability of inside diameter (ID) pit sizing for balance-of plant heat exchangers, such as condensers and component cooling water heat exchangers. More traditional approaches to ID pit sizing involve the use of frequency-specific amplitude or phase angles. The enhanced multivariate regression algorithm for ID pit depth sizing incorporates three simultaneous input parameters of frequency, amplitude, and phase angle. A set of calibration data sets consisting of machined pits of various rounded and elongated shapes and depths was acquired in the frequency range of 100 kHz to 1 MHz for stainless steel tubing having nominal wall thickness of 0.028 inch. To add noise to the acquired data set, each test sample was rotated and test data acquired at 3, 6, 9, and 12 o'clock positions. The ID pit depths were estimated using a second order and fourth order regression functions by relying on normalized amplitude and phase angle information from multiple frequencies. Due to unique damage morphology associated with the microbiologically-influenced ID pits, it was necessary to modify the elongated calibration standard-based algorithms by relying on the algorithm developed solely from the destructive sectioning results. This paper presents the use of transformed multivariate regression algorithm to estimate ID pit depths and compare the results with the traditional univariate phase angle analysis. Both estimates were then compared with the destructive sectioning results.
Irano, Natalia; Bignardi, Annaiza Braga; El Faro, Lenira; Santana, Mário Luiz; Cardoso, Vera Lúcia; Albuquerque, Lucia Galvão
2014-03-01
The objective of this study was to estimate genetic parameters for milk yield, stayability, and the occurrence of clinical mastitis in Holstein cows, as well as studying the genetic relationship between them, in order to provide subsidies for the genetic evaluation of these traits. Records from 5,090 Holstein cows with calving varying from 1991 to 2010, were used in the analysis. Two standard multivariate analyses were carried out, one containing the trait of accumulated 305-day milk yields in the first lactation (MY1), stayability (STAY) until the third lactation, and clinical mastitis (CM), as well as the other traits, considering accumulated 305-day milk yields (Y305), STAY, and CM, including the first three lactations as repeated measures for Y305 and CM. The covariance components were obtained by a Bayesian approach. The heritability estimates obtained by multivariate analysis with MY1 were 0.19, 0.28, and 0.13 for MY1, STAY, and CM, respectively, whereas using the multivariate analysis with the Y305, the estimates were 0.19, 0.31, and 0.14, respectively. The genetic correlations between MY1 and STAY, MY1 and CM, and STAY and CM, respectively, were 0.38, 0.12, and -0.49. The genetic correlations between Y305 and STAY, Y305 and CM, and STAY and CM, respectively, were 0.66, -0.25, and -0.52.
NASA Technical Reports Server (NTRS)
Murphy, Patrick Charles
1985-01-01
An algorithm for maximum likelihood (ML) estimation is developed with an efficient method for approximating the sensitivities. The algorithm was developed for airplane parameter estimation problems but is well suited for most nonlinear, multivariable, dynamic systems. The ML algorithm relies on a new optimization method referred to as a modified Newton-Raphson with estimated sensitivities (MNRES). MNRES determines sensitivities by using slope information from local surface approximations of each output variable in parameter space. The fitted surface allows sensitivity information to be updated at each iteration with a significant reduction in computational effort. MNRES determines the sensitivities with less computational effort than using either a finite-difference method or integrating the analytically determined sensitivity equations. MNRES eliminates the need to derive sensitivity equations for each new model, thus eliminating algorithm reformulation with each new model and providing flexibility to use model equations in any format that is convenient. A random search technique for determining the confidence limits of ML parameter estimates is applied to nonlinear estimation problems for airplanes. The confidence intervals obtained by the search are compared with Cramer-Rao (CR) bounds at the same confidence level. It is observed that the degree of nonlinearity in the estimation problem is an important factor in the relationship between CR bounds and the error bounds determined by the search technique. The CR bounds were found to be close to the bounds determined by the search when the degree of nonlinearity was small. Beale's measure of nonlinearity is developed in this study for airplane identification problems; it is used to empirically correct confidence levels for the parameter confidence limits. The primary utility of the measure, however, was found to be in predicting the degree of agreement between Cramer-Rao bounds and search estimates.
R. L. Czaplewski
2009-01-01
The minimum variance multivariate composite estimator is a relatively simple sequential estimator for complex sampling designs (Czaplewski 2009). Such designs combine a probability sample of expensive field data with multiple censuses and/or samples of relatively inexpensive multi-sensor, multi-resolution remotely sensed data. Unfortunately, the multivariate composite...
Model-Based Clustering and Data Transformations for Gene Expression Data
2001-04-30
transformation parameters, e.g. Andrews, Gnanadesikan, and Warner (1973). Aitchison tests: Aitchison (1986) tested three aspects of the data for...N in the Box-Cox transformation in Equation (5) is estimated by maximum likelihood using the observations (Andrews, Gnanadesikan, and Warner 1973...Compositional Data. Chapman and Hall. Andrews, D. F., R. Gnanadesikan, and J. L. Warner (1973). Methods for assessing multivariate normality. In P. R
Multivariate stochastic analysis for Monthly hydrological time series at Cuyahoga River Basin
NASA Astrophysics Data System (ADS)
Zhang, L.
2011-12-01
Copulas have become a very powerful statistical and stochastic methodology for multivariate analysis in environmental and water resources engineering. In recent years, the popular one-parameter Archimedean copulas (e.g. the Gumbel-Hougaard, Cook-Johnson, and Frank copulas) and the meta-elliptical copulas (e.g. the Gaussian and Student-t copulas) have been applied in multivariate hydrological analyses, e.g. multivariate rainfall (rainfall intensity, duration and depth), flood (peak discharge, duration and volume), and drought analyses (drought length, mean and minimum SPI values, and drought mean areal extent). Copulas have also been applied in flood frequency analysis at the confluences of river systems by taking into account the dependence among upstream gauge stations rather than by using the hydrological routing technique. In most of the studies above, the annual time series have been treated as stationary signals, with the observations assumed to be independent and identically distributed (i.i.d.) random variables. In reality, however, hydrological time series, especially daily and monthly series, cannot be considered i.i.d. random variables because of the periodicity in the data structure. The stationarity assumption is also in question because of climate change and land use and land cover (LULC) change in the past years. It is therefore necessary to re-evaluate the classic approach to the study of hydrological time series by relaxing the stationarity assumption through a nonstationary approach. Likewise, in studying the dependence structure of hydrological time series, the assumption that all variables share the same type of univariate distribution needs to be relaxed by adopting copula theory. In this paper, the univariate monthly hydrological time series will be studied through the nonstationary time series analysis approach. The dependence structure of the multivariate monthly hydrological time series will be studied through copula theory. For parameter estimation, maximum likelihood estimation (MLE) will be applied. To illustrate the method, the univariate time series model and the dependence structure will be determined and tested using the monthly discharge time series of the Cuyahoga River Basin.
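For readers unfamiliar with the one-parameter Archimedean family named above, the following is a small sketch of the bivariate Gumbel-Hougaard copula. Note the swap: the abstract states that parameters are estimated by maximum likelihood, whereas this sketch uses the simpler Kendall's tau inversion (tau = 1 - 1/theta) only to show how a dependence parameter maps to a copula. The function names are illustrative.

```python
import math

def gumbel_hougaard_cdf(u, v, theta):
    """C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta)), for theta >= 1
    and u, v in (0, 1]. theta = 1 gives the independence copula u * v."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def theta_from_kendall_tau(tau):
    """Moment-type estimate of theta from Kendall's tau (0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)

theta = theta_from_kendall_tau(0.5)          # tau = 0.5 corresponds to theta = 2
joint = gumbel_hougaard_cdf(0.3, 0.7, theta)  # P(U <= 0.3, V <= 0.7) under this copula
print(theta, joint)
```

The marginal distributions of the two variables can differ; the copula only describes how their probability-integral transforms `u` and `v` depend on each other.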
Punzo, Antonio; Ingrassia, Salvatore; Maruotti, Antonello
2018-04-22
A time-varying latent variable model is proposed to jointly analyze multivariate mixed-support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariates distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state-specific distributions for the covariates, with the aim of improving the recovering of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation-maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Durmaz, Murat; Karslioglu, Mahmut Onur
2015-04-01
There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
Estimation of actual evapotranspiration in the Nagqu river basin of the Tibetan Plateau
NASA Astrophysics Data System (ADS)
Zou, Mijun; Zhong, Lei; Ma, Yaoming; Hu, Yuanyuan; Feng, Lu
2018-05-01
As a critical component of the energy and water cycle, terrestrial actual evapotranspiration (ET) can be influenced by many factors. This study was mainly devoted to providing accurate and continuous estimations of actual ET for the Tibetan Plateau (TP) and analyzing the effects of its impact factors. In this study, summer observational data from the Coordinated Enhanced Observing Period (CEOP) Asia-Australia Monsoon Project (CAMP) on the Tibetan Plateau (CAMP/Tibet) for 2003 to 2004 was selected to determine actual ET and investigate its relationship with energy, hydrological, and dynamical parameters. Multiple-layer air temperature, relative humidity, net radiation flux, wind speed, precipitation, and soil moisture were used to estimate actual ET. The regression model simulation results were validated with independent data retrieved using the combinatory method. The results suggested that significant correlations exist between actual ET and hydro-meteorological parameters in the surface layer of the Nagqu river basin, among which the most important factors are energy-related elements (net radiation flux and air temperature). The results also suggested that how ET is eventually affected by precipitation and two-layer wind speed difference depends on whether their positive or negative feedback processes have a more important role. The multivariate linear regression method provided reliable estimations of actual ET; thus, 6-parameter simplified schemes and 14-parameter regular schemes were established.
2014-01-01
This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in prediction of two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD) as well as indirect indicators of organic matters are representative parameters for sewer water quality. Performance of the ANN models was evaluated using coefficient of correlation (r), root mean square error (RMSE) and bias values. The values of BOD and COD computed by the ANN method and by regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. The comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) were RMSE = 25.1 mg/L and r = 0.83 for prediction of BOD, and RMSE = 49.4 mg/L and r = 0.81 for prediction of COD. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis results showed that pH has more effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD. PMID:24456676
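The three evaluation indices named above (r, RMSE, bias) can be computed as in this short sketch. The measured and predicted values are made-up placeholders, not data from the study, and bias is taken here as the mean of predicted minus measured, which is an assumption about the sign convention.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)

def bias(measured, predicted):
    """Mean signed error (predicted minus measured)."""
    return sum(p - m for m, p in zip(measured, predicted)) / len(measured)

def pearson_r(x, y):
    """Pearson coefficient of correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values in mg/L, for illustration only.
measured = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 160.0, 210.0, 260.0]
print(rmse(measured, predicted), bias(measured, predicted), pearson_r(measured, predicted))
```

The placeholder series shows why the indices are reported together: a constant offset leaves r at 1 while RMSE and bias are both non-zero.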
Nonstationary multivariate modeling of cerebral autoregulation during hypercapnia.
Kostoglou, Kyriaki; Debert, Chantel T; Poulin, Marc J; Mitsis, Georgios D
2014-05-01
We examined the time-varying characteristics of cerebral autoregulation and hemodynamics during a step hypercapnic stimulus by using recursively estimated multivariate (two-input) models which quantify the dynamic effects of mean arterial blood pressure (ABP) and end-tidal CO2 tension (PETCO2) on middle cerebral artery blood flow velocity (CBFV). Beat-to-beat values of ABP and CBFV, as well as breath-to-breath values of PETCO2 during baseline and sustained euoxic hypercapnia were obtained in 8 female subjects. The multiple-input, single-output models used were based on the Laguerre expansion technique, and their parameters were updated using recursive least squares with multiple forgetting factors. The results reveal the presence of nonstationarities that confirm previously reported effects of hypercapnia on autoregulation, i.e. a decrease in the MABP phase lead, and suggest that the incorporation of PETCO2 as an additional model input yields less time-varying estimates of dynamic pressure autoregulation obtained from single-input (ABP-CBFV) models. Copyright © 2013 IPEM. Published by Elsevier Ltd. All rights reserved.
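A simplified sketch of recursive least squares with a forgetting factor follows, for a two-input, single-output linear model. It assumes a single forgetting factor `lam` and plain instantaneous inputs; the study itself uses Laguerre expansions of the inputs and multiple forgetting factors, neither of which is reproduced. The synthetic signals and the mid-record gain change are invented for illustration.

```python
import math

def rls_step(theta, P, x, y, lam):
    """One recursive least-squares update with forgetting factor lam (0 < lam <= 1)."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    err = y - sum(theta[i] * x[i] for i in range(n))   # a priori prediction error
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T.
    P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P

theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]   # large initial covariance: weak prior on theta
for t in range(500):
    x = [math.sin(0.1 * t), math.cos(0.37 * t)]   # two synthetic inputs
    first_gain = 2.0 if t < 200 else 0.5          # the system changes at t = 200
    y = first_gain * x[0] - 1.0 * x[1]
    theta, P = rls_step(theta, P, x, y, lam=0.98)
print(theta)
```

With `lam` below 1, older samples are discounted geometrically, so the estimate follows the change in the first coefficient; with `lam = 1` it would average over the whole record.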
Iterative Importance Sampling Algorithms for Parameter Estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grout, Ray W; Morzfeld, Matthias; Day, Marcus S.
In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is a challenging task. Several sampling algorithms have been proposed over the past years that take an iterative approach to constructing a proposal distribution. We investigate the applicability of such algorithms by applying them to two realistic and challenging test problems, one in subsurface flow, and one in combustion modeling. More specifically, we implement importance sampling algorithms that iterate over the mean and covariance matrix of Gaussian or multivariate t-proposal distributions. Our implementation leverages massively parallel computers, and we present strategies to initialize the iterations using 'coarse' MCMC runs or Gaussian mixture models.
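The iteration over the proposal's mean and covariance can be illustrated in one dimension as below. This is a sketch under strong assumptions: a Gaussian proposal, a toy unnormalized target chosen for the example, and a fixed number of iterations. It is not the authors' implementation and omits the parallelism, the t-proposals, and the MCMC or mixture-based initialization.

```python
import math
import random

def log_target(x):
    # Toy unnormalized log-posterior (assumed): Gaussian with mean 3, sd 0.5.
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def iterate_proposal(log_target, mean, sd, n, n_iter, rng):
    """Repeatedly draw from N(mean, sd^2), weight by target/proposal, and
    refit the proposal's mean and sd to the weighted samples."""
    for _ in range(n_iter):
        xs = [rng.gauss(mean, sd) for _ in range(n)]   # independent draws
        # log weight = log target - log proposal (constants dropped)
        log_w = [log_target(x) + 0.5 * ((x - mean) / sd) ** 2 + math.log(sd)
                 for x in xs]
        top = max(log_w)                               # stabilize the exponentials
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        new_mean = sum(wi * x for wi, x in zip(w, xs)) / total
        new_var = sum(wi * (x - new_mean) ** 2 for wi, x in zip(w, xs)) / total
        mean, sd = new_mean, math.sqrt(new_var)
    return mean, sd

rng = random.Random(1)
mean, sd = iterate_proposal(log_target, mean=0.0, sd=3.0, n=2000, n_iter=5, rng=rng)
print(mean, sd)
```

Because the draws within each iteration are independent, that inner step is the part that parallelizes across cores.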
Risk-adjusted outcome measurement in pediatric allogeneic stem cell transplantation.
Matthes-Martin, Susanne; Pötschger, Ulrike; Bergmann, Kirsten; Frommlet, Florian; Brannath, Werner; Bauer, Peter; Klingebiel, Thomas
2008-03-01
The purpose of the study was to define a risk score for 1-year treatment-related mortality (TRM) in children undergoing allogeneic stem cell transplantation as a basis for risk-adjusted outcome assessment. We analyzed 1364 consecutive stem cell transplants performed in 24 German and Austrian centers between 1998 and 2003. Five well-established risk factors were tested by multivariate logistic regression for predictive power: patient age, disease status, donor other than matched sibling donor, T cell depletion (TCD), and preceding stem cell transplantation. The risk score was defined by rounding the parameter estimates of the significant risk factors to the nearest integer. Crossvalidation was performed on the basis of 5 randomly extracted equal-sized parts from the database. Additionally, the score was validated for different disease entities and for single centers. Multivariate analysis revealed a significant correlation of TRM with 3 risk factors: age >10 years, advanced disease, and alternative donor. The parameter estimates were 0.76 for age, 0.73 for disease status, and 0.97 for donor type. Rounding the estimates resulted in a score with 1 point for each risk factor. One-year TRM (overall survival [OS]) were 5% (89%) with a score of 0, 18% (74%) with 1, 28% (54%) with 2, and 53% (27%) with 3 points. Crossvalidation showed stable results with a good correlation between predicted and observed mortality but moderate discrimination. The score seems to be a simple instrument to estimate the expected mortality for each risk group and for each center. Measuring TRM risk-adjusted and the comparison between expected and observed mortality may be an additional tool for outcome assessment in pediatric stem cell transplantation.
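The scoring rule described above (round each parameter estimate to the nearest integer, then sum the points for the factors present) can be written out directly. The estimates and the 1-year TRM percentages are the figures quoted in the abstract; the dictionary keys and the function name are illustrative.

```python
# Parameter estimates from the multivariate logistic regression, as quoted above.
estimates = {"age_over_10": 0.76, "advanced_disease": 0.73, "alternative_donor": 0.97}

# Rounding each estimate to the nearest integer gives the points per risk factor.
points = {name: round(value) for name, value in estimates.items()}

# Reported 1-year treatment-related mortality (%) by total score.
trm_percent_by_score = {0: 5, 1: 18, 2: 28, 3: 53}

def risk_score(patient):
    """Sum the points of the risk factors that are present for this patient."""
    return sum(points[name] for name, present in patient.items() if present)

# Hypothetical patient: older than 10 years, early disease, alternative donor.
patient = {"age_over_10": True, "advanced_disease": False, "alternative_donor": True}
score = risk_score(patient)
print(score, trm_percent_by_score[score])
```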
A method for analyzing clustered interval-censored data based on Cox's model.
Kor, Chew-Teng; Cheng, Kuang-Fu; Chen, Yi-Hau
2013-02-28
Methods for analyzing interval-censored data are well established. Unfortunately, these methods are inappropriate for the studies with correlated data. In this paper, we focus on developing a method for analyzing clustered interval-censored data. Our method is based on Cox's proportional hazard model with piecewise-constant baseline hazard function. The correlation structure of the data can be modeled by using Clayton's copula or independence model with proper adjustment in the covariance estimation. We establish estimating equations for the regression parameters and baseline hazards (and a parameter in copula) simultaneously. Simulation results confirm that the point estimators follow a multivariate normal distribution, and our proposed variance estimations are reliable. In particular, we found that the approach with independence model worked well even when the true correlation model was derived from Clayton's copula. We applied our method to a family-based cohort study of pandemic H1N1 influenza in Taiwan during 2009-2010. Using the proposed method, we investigate the impact of vaccination and family contacts on the incidence of pH1N1 influenza. Copyright © 2012 John Wiley & Sons, Ltd.
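To make the modelling ingredients above concrete, here is a small sketch of a Cox model with a piecewise-constant baseline hazard and the likelihood contribution of one interval-censored observation, S(left) - S(right). The cut points, hazard rates, and function names are assumptions for illustration; the copula-based clustering adjustment and the estimating equations of the paper are not shown.

```python
import math

def cumulative_hazard(t, cuts, rates):
    """Baseline cumulative hazard H0(t) when rates[i] applies on
    [cuts[i], cuts[i+1]); the last rate applies from cuts[-1] onwards."""
    total = 0.0
    for i, rate in enumerate(rates):
        start = cuts[i]
        end = cuts[i + 1] if i + 1 < len(cuts) else math.inf
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total

def survival(t, cuts, rates, linear_predictor):
    """S(t | x) = exp(-H0(t) * exp(x' beta)) under proportional hazards."""
    return math.exp(-cumulative_hazard(t, cuts, rates) * math.exp(linear_predictor))

def interval_likelihood(left, right, cuts, rates, linear_predictor):
    """Probability that the event time falls in (left, right]."""
    return (survival(left, cuts, rates, linear_predictor)
            - survival(right, cuts, rates, linear_predictor))

cuts = [0.0, 1.0, 2.0]     # illustrative interval boundaries
rates = [0.1, 0.2, 0.3]    # illustrative baseline hazards per interval
print(cumulative_hazard(1.5, cuts, rates))
print(interval_likelihood(1.0, 2.0, cuts, rates, linear_predictor=0.0))
```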
Genetic parameter estimation for long endurance trials in the Uruguayan Criollo horse.
López-Correa, R D; Peñagaricano, F; Rovere, G; Urioste, J I
2018-06-01
The aim of this study was to estimate the genetic parameters of performance in a 750-km, 15-day ride in Criollo horses. Heritability (h²) and maternal lineage effects (mt²) were obtained for rank, a relative placing measure of performance. Additive genetic and maternal lineage (rmt) correlations among five medium-to-high intensity phase ranks (pRK) and final rank (RK) were also estimated. Individual records from 1,236 Criollo horses from 1979 to 2012 were used. A multivariate threshold animal model was applied to the pRK and RK. Heritability was moderate to low (0.156-0.275). Estimates of mt² were consistently low (0.04-0.06). Additive genetic correlations between individual pRK and RK were high (0.801-0.924), and the genetic correlations between individual pRKs ranged from 0.763 to 0.847. The pRK heritabilities revealed that some phases were explained by a greater additive component, whereas others showed stronger genetic relationships with RK. Thus, not all pRK may be considered as similar measures of performance in competition. © 2018 Blackwell Verlag GmbH.
Anomaly Monitoring Method for Key Components of Satellite
Fan, Linjun; Xiao, Weidong; Tang, Jun
2014-01-01
This paper presented a fault diagnosis method for key components of satellites, called Anomaly Monitoring Method (AMM), which is made up of state estimation based on Multivariate State Estimation Techniques (MSET) and anomaly detection based on Sequential Probability Ratio Test (SPRT). On the basis of a failure analysis of lithium-ion batteries (LIBs), we divided the failure of LIBs into internal failure, external failure, and thermal runaway and selected electrolyte resistance (R_e) and the charge transfer resistance (R_ct) as the key parameters of state estimation. Then, through the actual in-orbit telemetry data of the key parameters of LIBs, we obtained the actual residual value (R_X) and healthy residual value (R_L) of LIBs based on the state estimation of MSET, and then, through the residual values (R_X and R_L) of LIBs, we detected the anomaly states based on the anomaly detection of SPRT. Lastly, we conducted an example of AMM for LIBs, and, according to the results of AMM, we validated the feasibility and effectiveness of AMM by comparing it with the results of threshold detective method (TDM). PMID:24587703
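The anomaly detection stage can be illustrated with Wald's sequential probability ratio test applied to a residual sequence. The sketch assumes Gaussian residuals with known standard deviation, a healthy mean of zero, and a hypothesized anomalous mean `shift`; the MSET state estimation that would produce the residuals is not included, and the residual values are invented.

```python
import math

def sprt_first_alarm(residuals, shift, sigma, alpha=0.01, beta=0.01):
    """Return the index at which the test first decides for the anomalous
    hypothesis (mean = shift), or None if it never does.
    alpha: false alarm probability, beta: missed alarm probability."""
    upper = math.log((1.0 - beta) / alpha)   # decide "anomalous" at or above this
    lower = math.log(beta / (1.0 - alpha))   # decide "healthy" at or below this
    llr = 0.0
    for i, r in enumerate(residuals):
        # Log-likelihood ratio increment for N(shift, sigma^2) vs N(0, sigma^2).
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0   # healthy decision reached; restart the test
    return None

# Invented residuals: healthy for 10 samples, then a persistent offset of 1.0.
residuals = [0.0] * 10 + [1.0] * 10
alarm_index = sprt_first_alarm(residuals, shift=1.0, sigma=0.5)
print(alarm_index)
```

Unlike a fixed threshold on a single sample, the test accumulates evidence over consecutive residuals, which is the contrast the abstract draws with the threshold method.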
Truu, Jaak; Heinaru, Eeva; Talpsep, Ene; Heinaru, Ain
2002-01-01
The oil-shale industry has created serious pollution problems in northeastern Estonia. Untreated, phenol-rich leachate from semi-coke mounds formed as a by-product of oil-shale processing is discharged into the Baltic Sea via channels and rivers. An exploratory analysis of water chemical and microbiological data sets from the low-flow period was carried out using different multivariate analysis techniques. Principal component analysis allowed us to distinguish different locations in the river system. The riverine microbial community response to water chemical parameters was assessed by co-inertia analysis. Water pH, COD and total nitrogen were negatively related to the number of biodegradative bacteria, while oxygen concentration promoted the abundance of these bacteria. The results demonstrate the utility of multivariate statistical techniques as tools for estimating the magnitude and extent of pollution based on river water chemical and microbiological parameters. An evaluation of river chemical and microbiological data suggests that the ambient natural attenuation mechanisms only partly eliminate pollutants from river water, and that a sufficient reduction of more recalcitrant compounds could be achieved through the reduction of wastewater discharge from the oil-shale chemical industry into the rivers.
Tangen, C M; Koch, G G
1999-03-01
In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is an (unconditional) population average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population average treatment parameters and conditional subpopulation average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.
A new approach to estimating trends in chlamydia incidence.
Ali, Hammad; Cameron, Ewan; Drovandi, Christopher C; McCaw, James M; Guy, Rebecca J; Middleton, Melanie; El-Hayek, Carol; Hocking, Jane S; Kaldor, John M; Donovan, Basil; Wilson, David P
2015-11-01
Directly measuring disease incidence in a population is difficult and not feasible to do routinely. We describe the development and application of a new method for estimating at a population level the number of incident genital chlamydia infections, and the corresponding incidence rates, by age and sex using routine surveillance data. A Bayesian statistical approach was developed to calibrate the parameters of a decision-pathway tree against national data on numbers of notifications and tests conducted (2001-2013). Independent beta probability density functions were adopted for priors on the time-independent parameters; the shapes of these beta parameters were chosen to match prior estimates sourced from peer-reviewed literature or expert opinion. To best facilitate the calibration, multivariate Gaussian priors on (the logistic transforms of) the time-dependent parameters were adopted, using the Matérn covariance function to favour small changes over consecutive years and across adjacent age cohorts. The model outcomes were validated by comparing them with other independent empirical epidemiological measures, that is, prevalence and incidence as reported by other studies. Model-based estimates suggest that the total number of people acquiring chlamydia per year in Australia has increased by ∼120% over 12 years. Nationally, an estimated 356 000 people acquired chlamydia in 2013, which is 4.3 times the number of reported diagnoses. This corresponded to a chlamydia annual incidence estimate of 1.54% in 2013, increased from 0.81% in 2001 (∼90% increase). We developed a statistical method which uses routine surveillance (notifications and testing) data to produce estimates of the extent and trends in chlamydia incidence. Published by the BMJ Publishing Group Limited.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Application of Multivariate Modeling for Radiation Injury Assessment: A Proof of Concept
Bolduc, David L.; Villa, Vilmar; Sandgren, David J.; Ledney, G. David; Blakely, William F.; Bünger, Rolf
2014-01-01
Multivariate radiation injury estimation algorithms were formulated for estimating severe hematopoietic acute radiation syndrome (H-ARS) injury (i.e., response category three or RC3) in a rhesus monkey total-body irradiation (TBI) model. Classical CBC and serum chemistry blood parameters were examined prior to irradiation (d 0) and on d 7, 10, 14, 21, and 25 after irradiation involving 24 nonhuman primates (NHP) (Macaca mulatta) given 6.5-Gy 60Co γ-rays (0.4 Gy min−1) TBI. A correlation matrix was formulated with the RC3 severity level designated as the “dependent variable” and independent variables down-selected based on their radioresponsiveness and relatively low multicollinearity using stepwise-linear regression analyses. Final candidate independent variables included CBC counts (absolute number of neutrophils, lymphocytes, and platelets) in formulating the “CBC” RC3 estimation algorithm. Additionally, the formulation of a diagnostic CBC and serum chemistry “CBC-SCHEM” RC3 algorithm expanded upon the CBC algorithm model with the addition of hematocrit and the serum enzyme levels of aspartate aminotransferase, creatine kinase, and lactate dehydrogenase. Both algorithms estimated RC3 with over 90% predictive power. Only the CBC-SCHEM RC3 algorithm, however, met the three critical assumptions of linear least squares, demonstrating slightly greater precision for radiation injury estimation, but with significantly decreased prediction error indicating increased statistical robustness. PMID:25165485
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
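The two parameter counts quoted in this abstract are easy to compare numerically. The short sketch below evaluates both formulas; the function names are illustrative, and the choice of m = 3 components for k = 8 effects is an assumption made only to show the size of the saving, not a figure reported in the abstract.

```python
def full_rank_params(k):
    """Number of distinct elements in an unstructured k x k covariance matrix."""
    return k * (k + 1) // 2


def reduced_rank_params(k, m):
    """Parameters needed when only the m leading principal components are fitted."""
    return m * (2 * k - m + 1) // 2


if __name__ == "__main__":
    k = 8  # eight traits, as in the beef cattle application
    for m in range(1, k + 1):
        print(m, reduced_rank_params(k, m), full_rank_params(k))
```

With k = 8 the unstructured matrix needs 36 parameters, a rank-3 fit needs 21, and the two counts coincide when m = k.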
Differentially Private Synthesization of Multi-Dimensional Data using Copula Functions
Li, Haoran; Xiong, Li; Jiang, Xiaoqian
2014-01-01
Differential privacy has recently emerged in private statistical data release as one of the strongest privacy guarantees. Most of the existing techniques that generate differentially private histograms or synthetic data only work well for single dimensional or low-dimensional histograms. They become problematic for high dimensional and large domain data due to increased perturbation error and computation complexity. In this paper, we propose DPCopula, a differentially private data synthesization technique using Copula functions for multi-dimensional data. The core of our method is to compute a differentially private copula function from which we can sample synthetic data. Copula functions are used to describe the dependence between multivariate random vectors and allow us to build the multivariate joint distribution using one-dimensional marginal distributions. We present two methods for estimating the parameters of the copula functions with differential privacy: maximum likelihood estimation and Kendall’s τ estimation. We present formal proofs for the privacy guarantee as well as the convergence property of our methods. Extensive experiments using both real datasets and synthetic datasets demonstrate that DPCopula generates highly accurate synthetic multi-dimensional data with significantly better utility than state-of-the-art techniques. PMID:25405241
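The Kendall's τ route mentioned in this abstract rests on a standard identity for the Gaussian copula: the copula correlation equals sin(πτ/2). The sketch below shows only that non-private estimation step, assuming a Gaussian copula and data without ties; it adds no privacy noise, so it is not the DPCopula algorithm, and both function names are illustrative.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (assumes no ties)."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def gaussian_copula_corr(tau):
    """Map Kendall's tau to the correlation parameter of a Gaussian copula."""
    return math.sin(math.pi * tau / 2.0)
```

Because τ depends only on ranks, the estimate is unaffected by the marginal distributions, which is what lets the marginals and the dependence structure be handled separately.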
Evolutionary rates for multivariate traits: the role of selection and genetic variation
Pitchers, William; Wolf, Jason B.; Tregenza, Tom; Hunt, John; Dworkin, Ian
2014-01-01
A fundamental question in evolutionary biology is the relative importance of selection and genetic architecture in determining evolutionary rates. Adaptive evolution can be described by the multivariate breeders' equation (Δz̄ = Gβ), which predicts evolutionary change for a suite of phenotypic traits (Δz̄) as a product of directional selection acting on them (β) and the genetic variance–covariance matrix for those traits (G). Despite being empirically challenging to estimate, there are enough published estimates of G and β to allow for synthesis of general patterns across species. We use published estimates to test the hypotheses that there are systematic differences in the rate of evolution among trait types, and that these differences are, in part, due to genetic architecture. We find some evidence that sexually selected traits exhibit faster rates of evolution compared with life-history or morphological traits. This difference does not appear to be related to stronger selection on sexually selected traits. Using numerous proposed approaches to quantifying the shape, size and structure of G, we examine how these parameters relate to one another, and how they vary among taxonomic and trait groupings. Despite considerable variation, they do not explain the observed differences in evolutionary rates. PMID:25002697
Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela
2018-01-01
This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators yield better estimates of the regression model parameters than the classical models. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) is also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure was employed to select relevant information to predict the influenza rate from a mix of multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.
On Patarin's Attack against the lIC Scheme
NASA Astrophysics Data System (ADS)
Ogura, Naoki; Uchiyama, Shigenori
In 2007, Ding et al. proposed an attractive scheme called the l-Invertible Cycles (lIC) scheme. lIC is one of the most efficient multivariate public-key cryptosystems (MPKC); these schemes are suitable for use under limited computational resources. In 2008, an efficient attack against lIC using Gröbner basis algorithms was proposed by Fouque et al. However, they only estimated the complexity of their attack based on their experimental results. On the other hand, Patarin had proposed an efficient attack against some multivariate public-key cryptosystems. We call this attack Patarin's attack. The complexity of Patarin's attack can be estimated by finding relations corresponding to each scheme. In this paper, we propose another practical attack against the lIC encryption/signature scheme. We estimate the complexity of our attack (not experimentally) by adapting Patarin's attack. The attack can also be applied to the lIC- scheme. Moreover, we show some experimental results of a practical attack against the lIC/lIC- schemes. This is the first implementation of both our proposed attack and an attack based on a Gröbner basis algorithm for the even case, that is, when the parameter l is even.
Sensitivity analysis of pulse pileup model parameter in photon counting detectors
NASA Astrophysics Data System (ADS)
Shunhavanich, Picha; Pelc, Norbert J.
2017-03-01
Photon counting detectors (PCDs) may provide several benefits over energy-integrating detectors (EIDs), including spectral information for tissue characterization and the elimination of electronic noise. PCDs, however, suffer from pulse pileup, which distorts the detected spectrum and degrades the accuracy of material decomposition. Several analytical models have been proposed to address this problem. The performance of these models depends on the assumptions used, including the estimated pulse shape, whose parameter values could differ from the actual physical ones. As the incident flux increases and the corrections become more significant, the accuracy of the parameter values may become more crucial. In this work, the sensitivity to model parameter accuracy is analyzed for the pileup model of Taguchi et al. The spectra distorted by pileup at different count rates are simulated using either the model or Monte Carlo simulations, and the basis material thicknesses are estimated by minimizing the negative log-likelihood with Poisson or multivariate Gaussian distributions. From simulation results, we find that the accuracy of the deadtime, the height of the negative tail of the pulse, and the timing to the end of the pulse are more important than most other parameters, and they matter more with increasing count rate. This result can help facilitate further work on parameter calibrations.
Adaptive MCMC in Bayesian phylogenetics: an application to analyzing partitioned data in BEAST.
Baele, Guy; Lemey, Philippe; Rambaut, Andrew; Suchard, Marc A
2017-06-15
Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, in both the central processing unit (CPU) and graphics processing unit (GPU) markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. We here propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves by > 14-fold. Our implementation is part of the BEAST code base, a widely used open source software package to perform Bayesian phylogenetic inference. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Yiu, Sean; Tom, Brian Dm
2017-01-01
Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicate model fitting. Thus, only non-standard computationally intensive procedures based on simulating the marginal likelihood have so far been proposed. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and for the case when it is of interest to directly model the overall marginal mean. The methodology is applied to a psoriatic arthritis data set concerning functional disability.
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...
2015-02-05
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $\eta(\theta)$, where θ denotes the uncertain, best input setting. Hence the statistical model is of the form $y = \eta(\theta) + \epsilon$, where $\epsilon$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $\eta(\cdot)$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $\eta(\cdot)$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.
Measuring multiple spike train synchrony.
Kreuz, Thomas; Chicharro, Daniel; Andrzejak, Ralph G; Haas, Julie S; Abarbanel, Henry D I
2009-10-15
Measures of multiple spike train synchrony are essential in order to study issues such as spike timing reliability, network synchronization, and neuronal coding. These measures can broadly be divided into multivariate measures and averages over bivariate measures. One of the most recent bivariate approaches, the ISI-distance, employs the ratio of instantaneous interspike intervals (ISIs). In this study we propose two extensions of the ISI-distance, the straightforward averaged bivariate ISI-distance and the multivariate ISI-diversity based on the coefficient of variation. Like the original measure, these extensions combine many properties desirable in applications to real data. In particular, they are parameter-free, time scale independent, and easy to visualize in a time-resolved manner, as we illustrate with in vitro recordings from a cortical neuron. Using a simulated network of Hindmarsh-Rose neurons as a controlled configuration, we compare the performance of our methods in distinguishing different levels of multi-neuron spike train synchrony to the performance of six other previously published measures. We show and explain why the averaged bivariate measures perform better than the multivariate ones and why the multivariate ISI-diversity is the best performer among the multivariate methods. Finally, in a comparison against standard methods that rely on moving window estimates, we use single-unit monkey data to demonstrate the advantages of the instantaneous nature of our methods.
NASA Astrophysics Data System (ADS)
Xing, Wanqiu; Wang, Weiguang; Shao, Quanxi; Yong, Bin
2018-01-01
Quantifying the partition of precipitation (P) into evapotranspiration (E) and runoff (Q) is of great importance for global and regional water availability assessment. The Budyko framework serves as a powerful tool for making simple and transparent estimates of this partition, using a single parameter to characterize the shape of the Budyko curve for a "specific basin", where the single parameter reflects the overall effect of not only climatic seasonality and catchment characteristics (e.g., soil, topography and vegetation) but also agricultural activities (e.g., cultivation and irrigation). At the regional scale, these influencing factors are interconnected, and the interactions between them can also affect the estimation of the single parameter of Budyko-type equations. Here we employ the multivariate adaptive regression splines (MARS) model to estimate the Budyko curve shape parameter (n in Choudhury's equation, one form of the Budyko framework) of 96 selected catchments across China using a data set of long-term averages for climatic seasonality, catchment characteristics and agricultural activities. Results show that average storm depth (ASD), vegetation coverage (M), and seasonality index of precipitation (SI) are three statistically significant factors affecting the Budyko parameter. More importantly, four pairs of interactions are recognized by the MARS model. The interaction between CA (percentage of cultivated land area to total catchment area) and ASD shows that cultivation can weaken the reducing effect of high ASD (>46.78 mm) on the estimated Budyko parameter. Drought (represented by a Palmer drought severity index < -0.74) and uneven distribution of annual rainfall (represented by a coefficient of variation of precipitation > 0.23) tend to enhance the reduction of the Budyko parameter by large SI (>0.797). Low vegetation coverage (34.56%) is likely to intensify the rising effect of IA (percentage of irrigation area to total catchment area) on the evapotranspiration ratio. The Budyko n values estimated by the MARS model reproduce those calculated from observations well for the 96 selected catchments (R = 0.817, MAE = 4.09). Compared with a multiple stepwise regression model that estimates the parameter n by treating the influencing factors as independent inputs, the MARS model enhances the capability of the Budyko framework for assessing water availability at the regional scale using readily available data.
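To make the role of the shape parameter n concrete, the sketch below evaluates one commonly cited form of Choudhury's equation, E = P·E0 / (P^n + E0^n)^(1/n), with E0 the potential evapotranspiration. The abstract does not write the equation out, so this exact form, the function names, and the input values used in the usage note are assumptions for illustration.

```python
def choudhury_evaporation(p, e0, n):
    """Long-term mean evapotranspiration from a commonly cited form of
    Choudhury's equation (assumed here, not quoted from the abstract).

    p  : mean annual precipitation (e.g. mm/yr)
    e0 : mean annual potential evapotranspiration (same units as p)
    n  : Budyko curve shape parameter (larger n gives more evaporation)
    """
    return p * e0 / (p**n + e0**n) ** (1.0 / n)


def runoff(p, e0, n):
    """Long-term mean runoff as the residual of the water balance, Q = P - E."""
    return p - choudhury_evaporation(p, e0, n)
```

For example, with P = E0 = 1000 mm/yr, n = 1 gives E = 500 mm/yr, while n = 2 gives about 707 mm/yr, so a larger n shifts the partition from runoff toward evapotranspiration.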
Prentice, Ross L; Zhao, Shanshan
2018-01-01
The Dabrowska (Ann Stat 16:1475-1489, 1988) product integral representation of the multivariate survivor function is extended, leading to a nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation. Empirical process methods are used to sketch proofs for this estimator's strong consistency and weak convergence properties. Summary measures of pairwise and higher-order dependencies are also defined and nonparametrically estimated. Simulation evaluation is given for the special case of three failure time variates.
Spatial estimation from remotely sensed data via empirical Bayes models
NASA Technical Reports Server (NTRS)
Hill, J. R.; Hinkley, D. V.; Kostal, H.; Morris, C. N.
1984-01-01
Multichannel satellite image data, available as LANDSAT imagery, are recorded as a multivariate time series (four channels, multiple passovers) in two spatial dimensions. The application of parametric empirical Bayes theory to classification of, and estimating the probability of, each crop type at each of a large number of pixels is considered. This theory involves both the probability distribution of imagery data, conditional on crop types, and the prior spatial distribution of crop types. For the latter Markov models indexed by estimable parameters are used. A broad outline of the general theory reveals several questions for further research. Some detailed results are given for the special case of two crop types when only a line transect is analyzed. Finally, the estimation of an underlying continuous process on the lattice is discussed which would be applicable to such quantities as crop yield.
Prognostic impact of intestinal wall thickening in hospitalized patients with heart failure.
Ikeda, Yuki; Ishii, Shunsuke; Fujita, Teppei; Iida, Yuichiro; Kaida, Toyoji; Nabeta, Takeru; Maekawa, Emi; Yanagisawa, Tomoyoshi; Koitabashi, Toshimi; Takeuchi, Ichiro; Inomata, Takayuki; Ako, Junya
2017-03-01
The intestine-cardiovascular relationship has been increasingly recognized as a key factor in patients with heart disease. We aimed to identify the relationships among intestinal wall edema, cardiac function, and adverse clinical events in hospitalized heart failure (HF) patients. Abdominal computed tomographic images of 168 hospitalized HF patients were retrospectively investigated for identification of average colon wall thickness (CWT) from the ascending to sigmoid colon. Relationships between average CWT and echocardiographic parameters, blood sampling data, and primary outcomes including readmission for deteriorated HF and all-cause mortality were evaluated. Among the echocardiographic parameters, lower left ventricular diastolic function was correlated with higher average CWT. In multivariate analysis, higher logarithmic C-reactive protein level, lower estimated glomerular filtration rate, lower peripheral blood lymphocyte count, higher E/E' ratio, and extremely high or low defecation frequency were independently correlated with higher average CWT. Multivariate Cox-hazard analysis demonstrated that higher average CWT was independently related to higher incidence of primary outcomes. In hospitalized HF patients, increased CWT was associated with lower cardiac performance, and predicted poorer long-term clinical outcomes. Copyright © 2016. Published by Elsevier B.V.
Riley, Richard D; Ensor, Joie; Jackson, Dan; Burke, Danielle L
2017-01-01
Many meta-analysis models contain multiple parameters, for example due to multiple outcomes, multiple treatments or multiple regression coefficients. In particular, meta-regression models may contain multiple study-level covariates, and one-stage individual participant data meta-analysis models may contain multiple patient-level covariates and interactions. Here, we propose how to derive percentage study weights for such situations, in order to reveal the (otherwise hidden) contribution of each study toward the parameter estimates of interest. We assume that studies are independent, and utilise a decomposition of Fisher's information matrix to decompose the total variance matrix of parameter estimates into study-specific contributions, from which percentage weights are derived. This approach generalises how percentage weights are calculated in a traditional, single parameter meta-analysis model. Application is made to one- and two-stage individual participant data meta-analyses, meta-regression and network (multivariate) meta-analysis of multiple treatments. These reveal percentage study weights toward clinically important estimates, such as summary treatment effects and treatment-covariate interactions, and are especially useful when some studies are potential outliers or at high risk of bias. We also derive percentage study weights toward methodologically interesting measures, such as the magnitude of ecological bias (difference between within-study and across-study associations) and the amount of inconsistency (difference between direct and indirect evidence in a network meta-analysis).
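The decomposition described in this abstract can be sketched for a two-parameter model. Under the stated independence assumption the total Fisher information is the sum of the study informations, I = Σ I_i, so the variance matrix V = I⁻¹ splits as V = Σ V I_i V, and the weight of study i toward parameter j is the share of V's j-th diagonal element contributed by V I_i V. This is one reading of the abstract, restricted to 2×2 matrices to stay dependency-free; the helper names and the toy inputs are assumptions.

```python
def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def percentage_weights(study_infos):
    """Percentage weight of each study toward each of two parameters.

    study_infos is a list of 2x2 Fisher information matrices, one per
    independent study. Returns one [weight_param0, weight_param1] pair
    per study; each column sums to 100.
    """
    total = [[sum(info[i][j] for info in study_infos) for j in range(2)]
             for i in range(2)]
    v = inv2(total)  # variance matrix of the parameter estimates
    weights = []
    for info in study_infos:
        contrib = matmul(matmul(v, info), v)  # this study's share of v
        weights.append([100.0 * contrib[j][j] / v[j][j] for j in range(2)])
    return weights
```

In the single-parameter case this reduces to the familiar inverse-variance percentage weights, which is the sense in which the approach generalises the traditional calculation.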
Wang, Yuxin; Lai, Adelene; Latino, Diogo; Fenner, Kathrin; Helbling, Damian E
2018-06-14
Aerobic biodegradation half-lives (half-lives) are key parameters used to evaluate pesticide persistence in soil. However, half-life estimates for individual pesticides often span several orders of magnitude, reflecting the impact that various environmental or experimental parameters have on half-lives in soil. In this work, we collected literature-reported half-lives for eleven pesticides along with associated metadata describing the environmental or experimental conditions under which they were derived. We then developed a multivariable framework to discover relationships between the half-lives and associated metadata. We first compared data for the herbicide atrazine collected from 95 laboratory and 65 field studies. We discovered that atrazine application history and soil texture were the parameters that have the largest influence on the observed half-lives in both types of studies. We then extended the analysis to include ten additional pesticides with data collected exclusively from laboratory studies. We found that, when data were available, pesticide application history and biomass concentrations were always positively associated with half-lives. The relevance of other parameters varied among the pesticides, but in some cases the variability could be explained by the physicochemical properties of the pesticides. For example, we found that the relative significance of the organic carbon content of soil for determining half-lives depends on the relative solubility of the pesticide. Altogether, our analyses highlight the reciprocal influence of both environmental parameters and intrinsic physicochemical properties for determining half-lives in soil. Copyright © 2018 Elsevier Ltd. All rights reserved.
Analysis of signal-dependent sensor noise on JPEG 2000-compressed Sentinel-2 multi-spectral images
NASA Astrophysics Data System (ADS)
Uss, M.; Vozel, B.; Lukin, V.; Chehdi, K.
2017-10-01
The processing chain of Sentinel-2 MultiSpectral Instrument (MSI) data involves filtering and compression stages that modify MSI sensor noise. As a result, the noise in Sentinel-2 Level-1C data distributed to users has been processed. We demonstrate that the processed noise variance model is bivariate: noise variance depends on image intensity (caused by the signal dependency of photon-counting detectors) and on signal-to-noise ratio (SNR; caused by filtering/compression). To provide information on processed noise parameters, which is missing in Sentinel-2 metadata, we propose to use a blind noise parameter estimation approach. Existing methods are restricted to a univariate noise model. Therefore, we propose an extension of the existing vcNI+fBm blind noise parameter estimation method to a multivariate noise model, mvcNI+fBm, and apply it to each band of Sentinel-2A data. The results clearly demonstrate that noise variance is affected by filtering/compression for SNR less than about 15. Processed noise variance is reduced by a factor of 2 to 5 in homogeneous areas as compared to noise variance for high SNR values. Estimates of the noise variance model parameters are provided for each Sentinel-2A band. The Sentinel-2A MSI Level-1C noise models obtained in this paper could be useful for end users and researchers working in a variety of remote sensing applications.
Golightly, Andrew; Wilkinson, Darren J.
2011-01-01
Computational systems biology is concerned with the development of detailed mechanistic models of biological processes. Such models are often stochastic and analytically intractable, containing uncertain parameters that must be estimated from time course data. In this article, we consider the task of inferring the parameters of a stochastic kinetic model defined as a Markov (jump) process. Inference for the parameters of complex nonlinear multivariate stochastic process models is a challenging problem, but we find here that algorithms based on particle Markov chain Monte Carlo turn out to be a very effective computationally intensive approach to the problem. Approximations to the inferential model based on stochastic differential equations (SDEs) are considered, as well as improvements to the inference scheme that exploit the SDE structure. We apply the methodology to a Lotka–Volterra system and a prokaryotic auto-regulatory network. PMID:23226583
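To make the "Markov (jump) process" concrete, the following is a minimal forward simulation of a Lotka–Volterra system using Gillespie's direct method. It is only a sketch of the kind of model whose parameters are being inferred, not the particle MCMC scheme itself; the function name, rate constants and initial counts are illustrative assumptions.

```python
import random

def gillespie_lotka_volterra(prey, predators, rates, t_end, seed=0):
    """Simulate a Lotka-Volterra Markov jump process (Gillespie direct method).

    Reactions and hazards (rates = (c1, c2, c3), all assumed values):
      prey birth       prey -> 2 prey              c1 * prey
      predation        prey + pred -> 2 pred       c2 * prey * predators
      predator death   pred -> nothing             c3 * predators
    """
    rng = random.Random(seed)
    c1, c2, c3 = rates
    t = 0.0
    path = [(t, prey, predators)]
    while True:
        hazards = [c1 * prey, c2 * prey * predators, c3 * predators]
        total = sum(hazards)
        if total == 0.0:          # both species extinct: nothing can happen
            break
        t += rng.expovariate(total)   # exponential waiting time to next event
        if t > t_end:
            break
        u = rng.random() * total      # pick a reaction with prob. hazard/total
        if u < hazards[0]:
            prey += 1
        elif u < hazards[0] + hazards[1]:
            prey -= 1
            predators += 1
        else:
            predators -= 1
        path.append((t, prey, predators))
    return path

path = gillespie_lotka_volterra(prey=50, predators=100,
                                rates=(1.0, 0.005, 0.6), t_end=5.0)
print(len(path), "states simulated; final state:", path[-1])
```

Inference then amounts to asking which rate constants make observed time-course data plausible under trajectories like this one.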
Conlon, Anna S C; Taylor, Jeremy M G; Elliott, Michael R
2014-04-01
In clinical trials, a surrogate outcome variable (S) can be measured before the outcome of interest (T) and may provide early information regarding the treatment (Z) effect on T. Using the principal surrogacy framework introduced by Frangakis and Rubin (2002. Principal stratification in causal inference. Biometrics 58, 21-29), we consider an approach that has a causal interpretation and develop a Bayesian estimation strategy for surrogate validation when the joint distribution of potential surrogate and outcome measures is multivariate normal. From the joint conditional distribution of the potential outcomes of T, given the potential outcomes of S, we propose surrogacy validation measures from this model. As the model is not fully identifiable from the data, we propose some reasonable prior distributions and assumptions that can be placed on weakly identified parameters to aid in estimation. We explore the relationship between our surrogacy measures and the surrogacy measures proposed by Prentice (1989. Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine 8, 431-440). The method is applied to data from a macular degeneration study and an ovarian cancer study.
Conlon, Anna S. C.; Taylor, Jeremy M. G.; Elliott, Michael R.
2014-01-01
In clinical trials, a surrogate outcome variable (S) can be measured before the outcome of interest (T) and may provide early information regarding the treatment (Z) effect on T. Using the principal surrogacy framework introduced by Frangakis and Rubin (2002. Principal stratification in causal inference. Biometrics 58, 21–29), we consider an approach that has a causal interpretation and develop a Bayesian estimation strategy for surrogate validation when the joint distribution of potential surrogate and outcome measures is multivariate normal. From the joint conditional distribution of the potential outcomes of T, given the potential outcomes of S, we propose surrogacy validation measures from this model. As the model is not fully identifiable from the data, we propose some reasonable prior distributions and assumptions that can be placed on weakly identified parameters to aid in estimation. We explore the relationship between our surrogacy measures and the surrogacy measures proposed by Prentice (1989. Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine 8, 431–440). The method is applied to data from a macular degeneration study and an ovarian cancer study. PMID:24285772
Fink, Howard A; Langsetmo, Lisa; Vo, Tien N; Orwoll, Eric S; Schousboe, John T; Ensrud, Kristine E
2018-05-08
High-resolution peripheral quantitative computed tomography (HR-pQCT) assesses both volumetric bone mineral density (vBMD) and trabecular and cortical microarchitecture. However, studies of the association of HR-pQCT parameters with fracture history have been small, predominantly limited to postmenopausal women, often performed limited adjustment for potential confounders including for BMD, and infrequently assessed strength or failure measures. We used data from the Osteoporotic Fractures in Men (MrOS) study, a prospective cohort study of community-dwelling men aged ≥65 years, to evaluate the association of distal radius, proximal (diaphyseal) tibia and distal tibia HR-pQCT parameters measured at the Year 14 (Y14) study visit with prior clinical fracture. The primary HR-pQCT exposure variables were finite element analysis estimated failure loads (EFL) for each skeletal site; secondary exposure variables were total vBMD, total bone area, trabecular vBMD, trabecular bone area, trabecular thickness, trabecular number, cortical vBMD, cortical bone area, cortical thickness, and cortical porosity. Clinical fractures were ascertained from questionnaires administered every 4 months between MrOS study baseline and the Y14 visit and centrally adjudicated by masked review of radiographic reports. We used multivariate-adjusted logistic regression to estimate the odds of prior clinical fracture per 1 SD decrement for each Y14 HR-pQCT parameter. Three hundred forty-four (19.2%) of the 1794 men with available HR-pQCT measures had a confirmed clinical fracture between baseline and Y14. After multivariable adjustment, including for total hip areal BMD, decreased HR-pQCT finite element analysis EFL for each site was associated with significantly greater odds of prior confirmed clinical fracture and major osteoporotic fracture. Among other HR-pQCT parameters, decreased cortical area appeared to have the strongest independent association with prior clinical fracture. 
Future studies should explore associations of HR-pQCT parameters with specific fracture types and risk of incident fractures and the impact of age and sex on these relationships. Published by Elsevier Inc.
Modeling absolute differences in life expectancy with a censored skew-normal regression approach
Clough-Gorr, Kerri; Zwahlen, Marcel
2015-01-01
Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate in modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use the skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
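A small sketch may help show why skew-normal coefficients read as differences in years. It uses the standard skew-normal density and mean formula; it does not handle censoring or left truncation, which are the paper's actual contribution. The parameter names (`location`, `scale`, `shape`) and the numeric values are illustrative assumptions.

```python
import math

def skew_normal_pdf(x, location, scale, shape):
    """Skew-normal density: (2/scale) * phi(z) * Phi(shape * z)."""
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi

def skew_normal_mean(location, scale, shape):
    """Mean = location + scale * delta * sqrt(2/pi), delta = shape/sqrt(1+shape^2)."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    return location + scale * delta * math.sqrt(2.0 / math.pi)

# Assumed values: a negative shape gives the left-skewed age-at-death pattern.
reference = skew_normal_mean(location=90.0, scale=15.0, shape=-4.0)
# A covariate that shifts the location by 2 years shifts the mean by 2 years.
shifted = skew_normal_mean(location=92.0, scale=15.0, shape=-4.0)
print(f"mean age at death: {reference:.2f}; difference: {shifted - reference:.2f}")
```

When scale and shape are held fixed, a regression coefficient on the location parameter is therefore directly a difference in mean survival time.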
The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.
Multivariate Statistical Analysis of Cigarette Design Feature Influence on ISO TNCO Yields.
Agnew-Heard, Kimberly A; Lancaster, Vicki A; Bravo, Roberto; Watson, Clifford; Walters, Matthew J; Holman, Matthew R
2016-06-20
The aim of this study is to explore how differences in cigarette physical design parameters influence tar, nicotine, and carbon monoxide (TNCO) yields in mainstream smoke (MSS) using the International Organization for Standardization (ISO) smoking regimen. Standardized smoking methods were used to evaluate 50 U.S. domestic brand cigarettes and a reference cigarette representing a range of TNCO yields in MSS collected from linear smoking machines using a nonintense smoking regimen. Multivariate statistical methods were used to form clusters of cigarettes based on their ISO TNCO yields and then to explore the relationship between the ISO generated TNCO yields and the nine cigarette physical design parameters between and within each cluster simultaneously. The ISO generated TNCO yields in MSS are 1.1-17.0 mg tar/cigarette, 0.1-2.2 mg nicotine/cigarette, and 1.6-17.3 mg CO/cigarette. Cluster analysis divided the 51 cigarettes into five discrete clusters based on their ISO TNCO yields. No one physical parameter dominated across all clusters. Predicting ISO machine generated TNCO yields based on these nine physical design parameters is complex due to the correlation among and between the nine physical design parameters and TNCO yields. From these analyses, it is estimated that approximately 20% of the variability in the ISO generated TNCO yields comes from other parameters (e.g., filter material, filter type, inclusion of expanded or reconstituted tobacco, and tobacco blend composition, along with differences in tobacco leaf origin and stalk positions and added ingredients). A future article will examine the influence of these physical design parameters on TNCO yields under a Canadian Intense (CI) smoking regimen. Together, these papers will provide a more robust picture of the design features that contribute to TNCO exposure across the range of real world smoking patterns.
Bostanmaneshrad, Farshid; Partani, Sadegh; Noori, Roohollah; Nachtnebel, Hans-Peter; Berndtsson, Ronny; Adamowski, Jan Franklin
2018-10-15
To date, few studies have investigated the simultaneous effects of macro-scale parameters (MSPs) such as land use, population density, geology, and erosion layers on micro-scale water quality variables (MSWQVs). This research focused on an evaluation of the relationship between MSPs and MSWQVs in the Siminehrood River Basin, Iran. In addition, we investigated the importance of water particle travel time (hydrological distance) on this relationship. The MSWQVs included 13 physicochemical and biochemical parameters observed at 15 stations during three seasons. Primary screening was performed by utilizing three multivariate statistical analyses (Pearson's correlation, cluster and discriminant analyses) in seven series of observed data. These series included three separate seasonal data, three two-season data, and aggregated three-season data for investigation of relationships between MSPs and MSWQVs. Coupled data (pairs of MSWQVs and MSPs) repeated in at least two out of three statistical analyses were selected for final screening. The primary screening results demonstrated significant relationships between land use and phosphorus, total solids and turbidity, erosion levels and electrical conductivity, and erosion and total solids. Furthermore, water particle travel time effects were considered through three geographical pattern definitions of distance for each MSP by using two weighting methods. To find effective MSP factors on MSWQVs, a multivariate linear regression analysis was employed. Then, preliminary equations that estimated MSWQVs were developed. The preliminary equations were modified to adaptive equations to obtain the final models. The final models indicated that a new metric, referred to as hydrological distance, provided better MSWQV estimation and water quality prediction compared to the National Sanitation Foundation Water Quality Index. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Developing population models with data from marked individuals
Ryu, Hae Yeong; Shoemaker, Kevin T.; Kneip, Eva; Pidgeon, Anna; Heglund, Patricia; Bateman, Brooke; Thogmartin, Wayne E.; Akçakaya, Reşit
2016-01-01
Population viability analysis (PVA) is a powerful tool for biodiversity assessments, but its use has been limited because of the requirements for fully specified population models such as demographic structure, density-dependence, environmental stochasticity, and specification of uncertainties. Developing a fully specified population model from commonly available data sources – notably, mark–recapture studies – remains complicated due to lack of practical methods for estimating fecundity, true survival (as opposed to apparent survival), natural temporal variability in both survival and fecundity, density-dependence in the demographic parameters, and uncertainty in model parameters. We present a general method that estimates all the key parameters required to specify a stochastic, matrix-based population model, constructed using a long-term mark–recapture dataset. Unlike standard mark–recapture analyses, our approach provides estimates of true survival rates and fecundities, their respective natural temporal variabilities, and density-dependence functions, making it possible to construct a population model for long-term projection of population dynamics. Furthermore, our method includes a formal quantification of parameter uncertainty for global (multivariate) sensitivity analysis. We apply this approach to 9 bird species and demonstrate the feasibility of using data from the Monitoring Avian Productivity and Survivorship (MAPS) program. Bias-correction factors for raw estimates of survival and fecundity derived from mark–recapture data (apparent survival and juvenile:adult ratio, respectively) were non-negligible, and corrected parameters were generally more biologically reasonable than their uncorrected counterparts. Our method allows the development of fully specified stochastic population models using a single, widely available data source, substantially reducing the barriers that have until now limited the widespread application of PVA. 
This method is expected to greatly enhance our understanding of the processes underlying population dynamics and our ability to analyze viability and project trends for species of conservation concern.
Penalized spline estimation for functional coefficient regression models.
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan
2010-04-01
The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.
Taylor, Jeremy M G; Conlon, Anna S C; Elliott, Michael R
2015-08-01
The validation of intermediate markers as surrogate markers (S) for the true outcome of interest (T) in clinical trials offers the possibility for trials to be run more quickly and cheaply by using the surrogate endpoint in place of the true endpoint. Working within a principal stratification framework, we propose causal quantities to evaluate surrogacy using a Gaussian copula model for an ordinal surrogate and time-to-event final outcome. The methods are applied to data from four colorectal cancer clinical trials, where S is tumor response and T is overall survival. For the Gaussian copula model, a Bayesian estimation strategy is used and, as some parameters are not identifiable from the data, we explore the use of informative priors that are consistent with reasonable assumptions in the surrogate marker setting to aid in estimation. While there is some bias in the estimation of the surrogacy quantities of interest, the estimation procedure does reasonably well at distinguishing between poor and good surrogate markers. Some of the parameters of the proposed model are not identifiable from the data, and therefore, assumptions must be made in order to aid in their estimation. The proposed quantities can be used in combination to provide evidence about the validity of S as a surrogate marker for T. © The Author(s) 2014.
Ghumare, Eshwar; Schrooten, Maarten; Vandenberghe, Rik; Dupont, Patrick
2015-08-01
Kalman filter approaches are widely applied to derive time varying effective connectivity from electroencephalographic (EEG) data. For multi-trial data, a classical Kalman filter (CKF) designed for the estimation of single trial data can be implemented by trial-averaging the data or by averaging single trial estimates. A general linear Kalman filter (GLKF) provides an extension for multi-trial data. In this work, we studied the performance of the different Kalman filtering approaches for different values of signal-to-noise ratio (SNR), number of trials and number of EEG channels. We used a simulated model from which we calculated scalp recordings. From these recordings, we estimated cortical sources. Multivariate autoregressive model parameters and partial directed coherence were calculated for these estimated sources and compared with the ground-truth. The results showed an overall superior performance of GLKF except for low levels of SNR and number of trials.
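The idea of estimating autoregressive coefficients with a Kalman filter can be illustrated in its simplest form: one channel, one trial, and a single AR(1) coefficient treated as a slowly drifting hidden state. This is a hedged sketch only; the CKF and GLKF compared in the study are multivariate and multi-trial, and the function name, noise settings and simulated data below are assumptions.

```python
import random

def track_ar1_coefficient(signal, state_noise=1e-6, obs_noise=1.0):
    """Scalar Kalman filter for y[t] = a[t] * y[t-1] + e[t].

    The coefficient a[t] is modelled as a random walk with variance
    state_noise; e[t] has variance obs_noise. Returns the filtered
    estimate of a[t] at every time step.
    """
    a_hat, p = 0.0, 1.0            # initial state estimate and its variance
    estimates = []
    for t in range(1, len(signal)):
        h = signal[t - 1]          # time-varying observation coefficient
        p_pred = p + state_noise   # predict step (random-walk state)
        innovation = signal[t] - a_hat * h
        s = h * p_pred * h + obs_noise
        gain = p_pred * h / s
        a_hat = a_hat + gain * innovation   # update step
        p = (1.0 - gain * h) * p_pred
        estimates.append(a_hat)
    return estimates

# Simulated single-channel data with a known coefficient (illustrative).
rng = random.Random(1)
true_a = 0.5
y = [0.0]
for _ in range(3000):
    y.append(true_a * y[-1] + rng.gauss(0.0, 1.0))

estimates = track_ar1_coefficient(y)
print(f"final estimate of the AR coefficient: {estimates[-1]:.3f}")
```

Raising `state_noise` lets the estimate follow faster changes at the cost of more variance, which is the trade-off that becomes harder at low SNR or with few trials.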
Evolutionary rates for multivariate traits: the role of selection and genetic variation.
Pitchers, William; Wolf, Jason B; Tregenza, Tom; Hunt, John; Dworkin, Ian
2014-08-19
A fundamental question in evolutionary biology is the relative importance of selection and genetic architecture in determining evolutionary rates. Adaptive evolution can be described by the multivariate breeders' equation (Δz̄ = Gβ), which predicts evolutionary change for a suite of phenotypic traits (Δz̄) as a product of directional selection acting on them (β) and the genetic variance-covariance matrix for those traits (G). Despite being empirically challenging to estimate, there are enough published estimates of G and β to allow for synthesis of general patterns across species. We use published estimates to test the hypotheses that there are systematic differences in the rate of evolution among trait types, and that these differences are, in part, due to genetic architecture. We find some evidence that sexually selected traits exhibit faster rates of evolution compared with life-history or morphological traits. This difference does not appear to be related to stronger selection on sexually selected traits. Using numerous proposed approaches to quantifying the shape, size and structure of G, we examine how these parameters relate to one another, and how they vary among taxonomic and trait groupings. Despite considerable variation, they do not explain the observed differences in evolutionary rates. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
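As a worked illustration of the multivariate breeders' equation, the sketch below multiplies an assumed two-trait G matrix by an assumed selection gradient β. The numbers are invented for illustration and do not come from the published estimates the study synthesises.

```python
def predicted_response(G, beta):
    """Multivariate breeders' equation: response = G @ beta (per generation)."""
    return [sum(g_ij * b_j for g_ij, b_j in zip(row, beta)) for row in G]

# Assumed genetic variance-covariance matrix for two traits
# (variances on the diagonal, genetic covariance off the diagonal).
G = [[1.0, 0.5],
     [0.5, 1.0]]
# Assumed directional selection gradients: trait 1 favoured, trait 2 disfavoured.
beta = [0.2, -0.1]

delta_z = predicted_response(G, beta)
print("predicted change in trait means:", delta_z)
```

In this example the second trait is under direct negative selection yet its predicted change is essentially zero, because the positive genetic covariance with the first trait offsets it. This is one way genetic architecture, and not only selection strength, can shape evolutionary rates.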
Measuring firm size distribution with semi-nonparametric densities
NASA Astrophysics Data System (ADS)
Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier
2017-11-01
In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.
Urbain, P; Birlinger, J; Lambert, C; Finke, J; Bertz, H; Biesalski, H-K
2013-03-01
There are few longitudinal data on nutritional status and body composition of patients undergoing allogeneic hematopoietic cell transplantation (alloHCT). We assessed nutritional status of 105 patients before alloHCT and its course during the early post-transplant period to day +30 and day +100 via weight history, body mass index (BMI) normalized for gender and age, Subjective Global Assessment, phase angle normalized for gender, age, and BMI, and fat-free and body fat masses. Furthermore, we present a multivariate regression model investigating the impact of factors on body weight. At admission, 23.8% reported significant weight losses (>5%) in the previous 6 months, and we noted 31.5% with abnormal age- and sex-adjusted BMI values (10th, 90th percentiles). BMI decreased significantly (P<0.0001) in both periods by 11% in total, meaning a weight loss of 8.6±5.7 kg. Simultaneously, the patients experienced significant losses (P<0.0001) of both fat-free and body fat masses. Multivariate regression model revealed clinically relevant acute GVHD (parameter estimate 1.43; P=0.02) and moderate/severe anorexia (parameter estimate 1.07; P=0.058) as independent factors influencing early weight loss. In conclusion, our results show a significant deterioration in nutritional status during the early post-transplant period. Predominant alloHCT-associated complications such as anorexia and acute GVHD became evident as significant factors influencing nutritional status.
Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo
2016-01-01
An outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known examples of such designs are the case-control design for a binary response, the case-cohort design for failure time data and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and establish its asymptotic normality. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260
Zammit-Mangion, Andrew; Rougier, Jonathan; Schön, Nana; Lindgren, Finn; Bamber, Jonathan
2015-01-01
Antarctica is the world's largest fresh-water reservoir, with the potential to raise sea levels by about 60 m. An ice sheet contributes to sea-level rise (SLR) when its rate of ice discharge and/or surface melting exceeds accumulation through snowfall. Constraining the contribution of the ice sheets to present-day SLR is vital both for coastal development and planning, and climate projections. Information on various ice sheet processes is available from several remote sensing data sets, as well as in situ data such as global positioning system data. These data have differing coverage, spatial support, temporal sampling and sensing characteristics, and thus, it is advantageous to combine them all in a single framework for estimation of the SLR contribution and the assessment of processes controlling mass exchange with the ocean. In this paper, we predict the rate of height change due to salient geophysical processes in Antarctica and use these to provide estimates of SLR contribution with associated uncertainties. We employ a multivariate spatio-temporal model, approximated as a Gaussian Markov random field, to take advantage of differing spatio-temporal properties of the processes to separate the causes of the observed change. The process parameters are estimated from geophysical models, while the remaining parameters are estimated using a Markov chain Monte Carlo scheme, designed to operate in a high-performance computing environment across multiple nodes. We validate our methods against a separate data set and compare the results to those from studies that invariably employ numerical model outputs directly. We conclude that it is possible, and insightful, to assess Antarctica's contribution without explicit use of numerical models. Further, the results obtained here can be used to test the geophysical numerical models for which in situ data are hard to obtain. © 2015 The Authors. Environmetrics published by John Wiley & Sons Ltd. PMID:25937792
NASA Astrophysics Data System (ADS)
Xiao, B.; Haslauer, C. P.; Bohling, G. C.; Bárdossy, A.
2017-12-01
The spatial arrangement of hydraulic conductivity (K) determines water flow and solute transport behaviour in groundwater systems. This presentation demonstrates three advances over commonly used geostatistical methods by integrating measurements from novel measurement techniques and novel multivariate non-Gaussian dependence models: The spatial dependence structure of K was analysed using both data sets of K. Previously encountered similarities were confirmed in low-dimensional dependence. These similarities become less stringent and deviate more from symmetric Gaussian dependence in dimensions larger than two. Measurements of small and large K values are more uncertain than medium K values due to decreased sensitivity of the measurement devices at both ends of the K scale. Nevertheless, these measurements contain useful information that we include in the estimation of the marginal distribution and the spatial dependence structure as "censored measurements" that are estimated jointly without the common assumption of independence. The spatial dependence structure of the two data sets and their cross-covariances are used to infer the spatial dependence and the amount of the bias between the two data sets. By doing so, one spatial model for K is constructed that is used for simulation and that reflects the characteristics of both measurement techniques. The concept of the presented methodology is to use all available information for the estimation of a stochastic model of the primary parameter (K) at the highly heterogeneous Macrodispersion Experiment (MADE) site. The primary parameter has been measured by two independent measurement techniques whose sets of locations do not overlap. This site offers the unique opportunity of large quantities of measurements of K (31123 direct push injection logging based measurements and 2611 flowmeter based measurements). 
This improved dependence structure of K will be included into the estimated non-Gaussian dependence models and is expected to reproduce observed solute concentrations at the site better than existing dependence models of K.
Truccolo, Wilson
2016-11-01
This review presents a perspective on capturing collective dynamics in recorded neuronal ensembles based on multivariate point process models, inference of low-dimensional dynamics and coarse graining of spatiotemporal measurements. A general probabilistic framework for continuous time point processes is reviewed, with an emphasis on multivariate nonlinear Hawkes processes with exogenous inputs. A point process generalized linear model (PP-GLM) framework for the estimation of discrete time multivariate nonlinear Hawkes processes is described. The approach is illustrated with the modeling of collective dynamics in neocortical neuronal ensembles recorded in human and non-human primates, and prediction of single-neuron spiking. A complementary approach to capture collective dynamics based on low-dimensional dynamics ("order parameters") inferred via latent state-space models with point process observations is presented. The approach is illustrated by inferring and decoding low-dimensional dynamics in primate motor cortex during naturalistic reach and grasp movements. Finally, we briefly review hypothesis tests based on conditional inference and spatiotemporal coarse graining for assessing collective dynamics in recorded neuronal ensembles. Published by Elsevier Ltd.
The effectiveness of robust RMCD control chart as outliers’ detector
NASA Astrophysics Data System (ADS)
Darmanto; Astutik, Suci
2017-12-01
A well-known control chart for monitoring a multivariate process is Hotelling’s T², whose parameters are estimated classically; it is very sensitive to outliers and is marred by their masking and swamping effects. To overcome this situation, robust estimators are strongly recommended. One such robust estimator is the re-weighted minimum covariance determinant (RMCD), which has the same robustness characteristics as MCD. In this paper, effectiveness means the accuracy of the RMCD control chart in detecting outliers as real outliers; in other words, how effectively this control chart can identify and remove the masking and swamping effects of outliers. We assessed the effectiveness of the robust control chart by simulation, considering different scenarios: sample size n, proportion of outliers, and number of quality characteristics p. We found that in some scenarios, this RMCD robust control chart works effectively.
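The masking effect that motivates a robust chart can be shown with a small two-variable example. The sketch below is not an RMCD implementation: it simply compares the T² statistic computed from classical estimates on all points with the same statistic computed from the mean and covariance of a clean subset, standing in for what a robust estimator aims to recover. All data values and function names are hypothetical.

```python
def mean_and_covariance(points):
    """Classical sample mean and covariance for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def t2_statistics(points, mean, cov):
    """Hotelling-type T^2 for each point: d^T S^{-1} d, using a 2x2 inverse."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    stats = []
    for x, y in points:
        dx, dy = x - mean[0], y - mean[1]
        stats.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return stats

# Illustrative in-control observations plus two clustered outliers.
clean = [(9.8, 19.9), (10.1, 20.2), (10.0, 19.7), (9.9, 20.1),
         (10.2, 20.0), (10.3, 20.4), (9.7, 19.8), (10.0, 20.3)]
outliers = [(14.0, 26.0), (14.0, 26.0)]
data = clean + outliers

# Classical chart: estimates are contaminated by the outliers themselves.
classical = t2_statistics(data, *mean_and_covariance(data))
# Stand-in for a robust chart: estimates taken from the clean subset only.
reference = t2_statistics(data, *mean_and_covariance(clean))

for label, stats in (("classical", classical), ("clean-subset", reference)):
    print(label, "T^2 of the outliers:", [round(s, 1) for s in stats[-2:]])
```

With classical estimates the outliers inflate the covariance and pull the mean toward themselves, so their own T² values stay small. Measured against the clean-subset estimates they stand out clearly, which is the behaviour a robust estimator such as RMCD is intended to deliver without knowing in advance which points are clean.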
NASA Technical Reports Server (NTRS)
Temporelli, P. L.; Scapellato, F.; Corra, U.; Eleuteri, E.; Firstenberg, M. S.; Thomas, J. D.; Giannuzzi, P.
2001-01-01
Previous studies relating Doppler parameters and pulmonary capillary wedge pressures (PCWP) typically exclude patients with severe mitral regurgitation (MR). We evaluated the effects of varying degrees of chronic MR on the Doppler estimation of PCWP. PCWP and mitral Doppler profiles were obtained in 88 patients (mean age 55 +/- 8 years) with severe left ventricular (LV) dysfunction (mean ejection fraction 23% +/- 5%). Patients were classified by severity of MR. Patients with severe MR had greater left atrial areas, LV end-diastolic volumes, and mean PCWPs and lower ejection fractions (each P <.01). In patients with mild MR, multiple echocardiographic parameters correlated with PCWP; however, with worsening MR, only deceleration time strongly related to PCWP. From stepwise multivariate analysis, deceleration time was the best independent predictor of PCWP overall, and it was the only predictor in patients with moderate or severe MR. Doppler-derived early mitral deceleration time reliably predicts PCWP in patients with severe LV dysfunction irrespective of degree of MR.
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.
ERIC Educational Resources Information Center
Wood, Terry M.; Safrit, Margaret J.
1987-01-01
A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
Structural Equation Model Trees
Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman
2015-01-01
In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree structures that separate a data set recursively into subsets with significantly different parameter estimates in a SEM. SEM Trees provide means for finding covariates and covariate interactions that predict differences in structural parameters in observed as well as in latent space and facilitate theory-guided exploration of empirical data. We describe the methodology, discuss theoretical and practical implications, and demonstrate applications to a factor model and a linear growth curve model. PMID:22984789
Electromagnetic wave scattering from rough terrain
NASA Astrophysics Data System (ADS)
Papa, R. J.; Lennon, J. F.; Taylor, R. L.
1980-09-01
This report presents two aspects of a program designed to calculate electromagnetic scattering from rough terrain: (1) the use of statistical estimation techniques to determine topographic parameters and (2) the results of a single-roughness-scale scattering calculation based on those parameters, including comparison with experimental data. In the statistical part of the present calculation, digitized topographic maps are used to generate data bases for the required scattering cells. The application of estimation theory to the data leads to the specification of statistical parameters for each cell. The estimated parameters are then used in a hypothesis test to decide on a probability density function (PDF) that represents the height distribution in the cell. Initially, the formulation uses a single observation of the multivariate data. A subsequent approach involves multiple observations of the heights on a bivariate basis, and further refinements are being considered. The electromagnetic scattering analysis, the second topic, calculates the amount of specular and diffuse multipath power reaching a monopulse receiver from a pulsed beacon positioned over a rough Earth. The program allows for spatial inhomogeneities and multiple specular reflection points. The analysis of shadowing by the rough surface has been extended to the case where the surface heights are distributed exponentially. The calculated loss of boresight pointing accuracy attributable to diffuse multipath is then compared with the experimental results. The extent of the specular region, the use of localized height variations, and the effect of the azimuthal variation in power pattern are all assessed.
NASA Technical Reports Server (NTRS)
Litt, Jonathan; Kurtkaya, Mehmet; Duyar, Ahmet
1994-01-01
This paper presents an application of a fault detection and diagnosis scheme for the sensor faults of a helicopter engine. The scheme utilizes a model-based approach with real time identification and hypothesis testing which can provide early detection, isolation, and diagnosis of failures. It is an integral part of a proposed intelligent control system with health monitoring capabilities. The intelligent control system will allow for accommodation of faults, reduce maintenance cost, and increase system availability. The scheme compares the measured outputs of the engine with the expected outputs of an engine whose sensor suite is functioning normally. If the differences between the real and expected outputs exceed threshold values, a fault is detected. The isolation of sensor failures is accomplished through a fault parameter isolation technique where parameters which model the faulty process are calculated on-line with a real-time multivariable parameter estimation algorithm. The fault parameters and their patterns can then be analyzed for diagnostic and accommodation purposes. The scheme is applied to the detection and diagnosis of sensor faults of a T700 turboshaft engine. Sensor failures are induced in a T700 nonlinear performance simulation and data obtained are used with the scheme to detect, isolate, and estimate the magnitude of the faults.
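The detection step described above, comparing measured outputs with expected outputs and flagging a fault when the difference exceeds a threshold, might look like the sketch below. The sensor names, values, and thresholds are hypothetical and chosen only for illustration; the isolation and parameter-estimation stages of the scheme are not shown.

```python
from typing import Dict, List


def detect_sensor_faults(measured: Dict[str, float],
                         expected: Dict[str, float],
                         thresholds: Dict[str, float]) -> List[str]:
    """Return the names of sensors whose residual exceeds its threshold."""
    faulty = []
    for name, value in measured.items():
        # Residual between the real output and the model's expected output.
        residual = abs(value - expected[name])
        if residual > thresholds[name]:
            faulty.append(name)
    return faulty


# Hypothetical readings: only the first sensor deviates beyond its threshold.
measured = {"gas_generator_speed": 101.5, "turbine_temperature": 812.0}
expected = {"gas_generator_speed": 100.0, "turbine_temperature": 810.0}
thresholds = {"gas_generator_speed": 1.0, "turbine_temperature": 5.0}
flagged = detect_sensor_faults(measured, expected, thresholds)
```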
Wen, Xiaotong; Rangarajan, Govindan; Ding, Mingzhou
2013-01-01
Granger causality is increasingly being applied to multi-electrode neurophysiological and functional imaging data to characterize directional interactions between neurons and brain regions. For a multivariate dataset, one might be interested in different subsets of the recorded neurons or brain regions. According to the current estimation framework, for each subset, one conducts a separate autoregressive model fitting process, introducing the potential for unwanted variability and uncertainty. In this paper, we propose a multivariate framework for estimating Granger causality. It is based on spectral density matrix factorization and offers the advantage that the estimation of such a matrix needs to be done only once for the entire multivariate dataset. For any subset of recorded data, Granger causality can be calculated through factorizing the appropriate submatrix of the overall spectral density matrix. PMID:23858479
Multivariate Density Estimation and Remote Sensing
NASA Technical Reports Server (NTRS)
Scott, D. W.
1983-01-01
Current efforts to develop methods and computer algorithms to effectively represent multivariate data commonly encountered in remote sensing applications are described. While this may involve scatter diagrams, multivariate representations of nonparametric probability density estimates are emphasized. The density function provides a useful graphical tool for looking at data and a useful theoretical tool for classification. This approach is called a thunderstorm data analysis.
Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo
2017-03-15
Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known examples of such designs are the case-control design for binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric, with all the underlying distributions of covariates modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of polychlorinated biphenyl exposure to hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Joshi, S. M.
1984-01-01
Closed-loop stability is investigated for multivariable linear time-invariant systems controlled by optimal full state feedback linear quadratic (LQ) regulators, with nonlinear gains present in the feedback channels. Estimates are obtained for the region of attraction when the nonlinearities escape the (0.5, infinity) sector in regions away from the origin and for the region of ultimate boundedness when the nonlinearities escape the sector near the origin. The expressions for these regions also provide methods for selecting the performance function parameters in order to obtain LQ designs with better tolerance for nonlinearities. The analytical results are illustrated by applying them to the problem of controlling the rigid-body pitch angle and elastic motion of a large, flexible space antenna.
Functional Generalized Structured Component Analysis.
Suk, Hye Won; Hwang, Heungsun
2016-12-01
An extension of Generalized Structured Component Analysis (GSCA), called Functional GSCA, is proposed to analyze functional data that are considered to arise from an underlying smooth curve varying over time or other continua. GSCA has been geared for the analysis of multivariate data. Accordingly, it cannot deal with functional data that often involve different measurement occasions across participants and a large number of measurement occasions that exceed the number of participants. Functional GSCA addresses these issues by integrating GSCA with spline basis function expansions that represent infinite-dimensional curves onto a finite-dimensional space. For parameter estimation, functional GSCA minimizes a penalized least squares criterion by using an alternating penalized least squares estimation algorithm. The usefulness of functional GSCA is illustrated with gait data.
Multiscale entropy analysis of biological signals: a fundamental bi-scaling law
Gao, Jianbo; Hu, Jing; Liu, Feiyan; Cao, Yinhe
2015-01-01
Since its introduction in the early 2000s, multiscale entropy (MSE) has found many applications in biosignal analysis and has been extended to multivariate MSE. So far, however, no analytic results for MSE or multivariate MSE have been reported. This has severely limited our basic understanding of MSE. For example, it has not been studied whether MSE estimated using default parameter values and short data sets is meaningful or not. Nor is it known whether MSE has any relation with other complexity measures, such as the Hurst parameter, which characterizes the correlation structure of the data. To overcome this limitation, and more importantly, to guide more fruitful applications of MSE in various areas of life sciences, we derive a fundamental bi-scaling law for fractal time series, one for the scale in phase space, the other for the block size used for smoothing. We illustrate the usefulness of the approach by examining two types of physiological data. One is heart rate variability (HRV) data, for the purpose of distinguishing healthy subjects from patients with congestive heart failure, a life-threatening condition. The other is electroencephalogram (EEG) data, for the purpose of distinguishing epileptic seizure EEG from normal healthy EEG. PMID:26082711
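The two scales mentioned above, the tolerance in phase space and the block size used for smoothing, correspond to the two parameters of the usual MSE computation: coarse-grain the series by averaging non-overlapping blocks, then compute sample entropy at each block size. The sketch below is a plain, unoptimized illustration; it assumes an absolute tolerance `r` supplied by the caller and counts a match when the Chebyshev distance is at most `r`, which are choices made for this example rather than details from the paper.

```python
import math
from typing import List, Sequence


def coarse_grain(x: Sequence[float], scale: int) -> List[float]:
    """Average non-overlapping blocks of length `scale` (the smoothing step)."""
    n_blocks = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n_blocks)]


def sample_entropy(x: Sequence[float], m: int = 2, r: float = 0.2) -> float:
    """SampEn(m, r): -log of the ratio of (m+1)-length to m-length template matches."""
    n = len(x)

    def count_matches(length: int) -> int:
        # The same n - m starting points are used for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for this series length and tolerance
    return -math.log(a / b)


def multiscale_entropy(x: Sequence[float], scales: Sequence[int],
                       m: int = 2, r: float = 0.2) -> List[float]:
    """Sample entropy of the coarse-grained series at each block size."""
    return [sample_entropy(coarse_grain(x, s), m, r) for s in scales]
```

The infinite return value for short series is one way to see the concern raised in the abstract: with few data points the match counts can drop to zero, and the estimate stops being meaningful.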
NASA Astrophysics Data System (ADS)
Kisi, Ozgur; Parmar, Kulwinder Singh
2016-03-01
This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH), monitored at Nizamuddin, Delhi, on the Yamuna River in India, were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost the same accuracy and performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). Using the MARS model decreased the average root mean square error (RMSE) relative to the LSSVM and M5Tree models by 1.47% and 19.1%, respectively. Adding the TC input to the models did not increase their accuracy in modeling COD, while adding the FC and pH inputs generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
Sex estimation of the tibia in modern Turkish: A computed tomography study.
Ekizoglu, Oguzhan; Er, Ali; Bozdag, Mustafa; Akcaoglu, Mustafa; Can, Ismail Ozgur; García-Donas, Julieta G; Kranioti, Elena F
2016-11-01
The utilization of computed tomography is beneficial for the analysis of skeletal remains and it has important advantages for anthropometric studies. The present study investigated the morphometry of the left tibia using CT images of a contemporary Turkish population. Seven parameters were measured on 203 individuals (124 males and 79 females) within the 19-92-years age group. The first objective of this study was to provide population-specific sex estimation equations for the contemporary Turkish population based on CT images. A second objective was to test the sex estimation formulae for Southern Europeans by Kranioti and Apostol (2015). Univariate discriminant functions resulted in classification accuracy that ranged from 66 to 86%. The best single variable was found to be upper epiphyseal breadth (86%) followed by lower epiphyseal breadth (85%). Multivariate discriminant functions resulted in classification accuracy for cross-validated data ranging from 79 to 86%. Applying the multivariate sex estimation formulae for Southern Europeans (SE) by Kranioti and Apostol to our sample resulted in very high classification accuracy ranging from 81 to 88%. In addition, 35.5-47% of the total Turkish sample was correctly classified with over 95% posterior probability, which is actually higher than the proportion reported for the original sample (25-43%). We conclude that the tibia is a very useful bone for sex estimation in the contemporary Turkish population. Moreover, our test results support the hypothesis that the SE formulae are sufficient for the contemporary Turkish population and they can be used safely for criminal investigations when posterior probabilities are over 95%. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Morales, R; Menéndez-Buxadera, A; Avilés, C; Molina, A
2013-12-01
The direct and maternal genetic effects were estimated for the preweaning growth of Retinta calves with a multitrait model across parities, using a longitudinal approach with random regression models (RRM). The weights at 120 (P120) and 180 days (P180) (5972 calves) were considered as different traits in each calving. The heritability of the direct effect across parities was on average 0.37 for P120 and 0.58 for P180, slightly higher than the estimates by univariate (0.30 and 0.56) and bivariate models (0.30 and 0.51, respectively). The heritability for maternal effects was 0.16 for P120 and 0.26 for P180, very similar to the estimates by uni- (0.16 and 0.23) and multivariate models (0.16 and 0.22, respectively). The correlation between direct and maternal effects by RRM showed a pronounced antagonism (-0.64 for P120 and -0.78 for P180), as did the uni- (-0.62 and -0.72) and multivariate cases (-0.64 and -0.74, respectively). The preweaning weights should be considered as different traits across parities, because the genetic correlations were different from unity. The RRM also allowed us to estimate all the parameters throughout the calving trajectory of the cow. The use of multiple-trait RRM across parities can provide very useful information for breeding programmes. © 2013 Blackwell Verlag GmbH.
Multivariable frequency domain identification via 2-norm minimization
NASA Technical Reports Server (NTRS)
Bayard, David S.
1992-01-01
The author develops a computational approach to multivariable frequency domain identification, based on 2-norm minimization. In particular, a Gauss-Newton (GN) iteration is developed to minimize the 2-norm of the error between frequency domain data and a matrix fraction transfer function estimate. To improve the global performance of the optimization algorithm, the GN iteration is initialized using the solution to a particular sequentially reweighted least squares problem, denoted as the SK iteration. The least squares problems which arise from both the SK and GN iterations are shown to involve sparse matrices with identical block structure. A sparse matrix QR factorization method is developed to exploit the special block structure, and to efficiently compute the least squares solution. A numerical example involving the identification of a multiple-input multiple-output (MIMO) plant having 286 unknown parameters is given to illustrate the effectiveness of the algorithm.
Modelling lifetime data with multivariate Tweedie distribution
NASA Astrophysics Data System (ADS)
Nor, Siti Rohani Mohd; Yusof, Fadhilah; Bahar, Arifah
2017-05-01
This study aims to measure the dependence between individual lifetimes by applying the multivariate Tweedie distribution to lifetime data. Incorporating dependence between lifetimes in the mortality model is a new idea that has a significant impact on the risk of the annuity portfolio, and it runs against standard actuarial methods, which assume independence between lifetimes. Hence, this paper applies the Tweedie family of distributions to a portfolio of lifetimes to induce dependence between lives. The Tweedie distribution is chosen since it contains symmetric and non-symmetric, as well as light-tailed and heavy-tailed distributions. Parameter estimation is modified in order to fit the Tweedie distribution to the data; this procedure is developed using the method of moments. In addition, a comparison stage checks the adequacy between the observed mortality and expected mortality. Finally, the importance of including systematic mortality risk in the model is justified by Pearson's chi-squared test.
Learning multivariate distributions by competitive assembly of marginals.
Sánchez-Vega, Francisco; Younes, Laurent; Geman, Donald
2013-02-01
We present a new framework for learning high-dimensional multivariate probability distributions from estimated marginals. The approach is motivated by compositional models and Bayesian networks, and designed to adapt to small sample sizes. We start with a large, overlapping set of elementary statistical building blocks, or "primitives," which are low-dimensional marginal distributions learned from data. Each variable may appear in many primitives. Subsets of primitives are combined in a Lego-like fashion to construct a probabilistic graphical model; only a small fraction of the primitives will participate in any valid construction. Since primitives can be precomputed, parameter estimation and structure search are separated. Model complexity is controlled by strong biases; we adapt the primitives to the amount of training data and impose rules which restrict the merging of them into allowable compositions. The likelihood of the data decomposes into a sum of local gains, one for each primitive in the final structure. We focus on a specific subclass of networks which are binary forests. Structure optimization corresponds to an integer linear program and the maximizing composition can be computed for reasonably large numbers of variables. Performance is evaluated using both synthetic data and real datasets from natural language processing and computational biology.
Zafar, Raheel; Kamel, Nidal; Naufal, Mohamad; Malik, Aamir Saeed; Dass, Sarat C; Ahmad, Rana Fayyaz; Abdullah, Jafri M; Reza, Faruque
2017-01-01
Decoding of human brain activity has always been a primary goal in neuroscience especially with functional magnetic resonance imaging (fMRI) data. In recent years, Convolutional neural network (CNN) has become a popular method for the extraction of features due to its higher accuracy, however it needs a lot of computation and training data. In this study, an algorithm is developed using Multivariate pattern analysis (MVPA) and modified CNN to decode the behavior of brain for different images with limited data set. Selection of significant features is an important part of fMRI data analysis, since it reduces the computational burden and improves the prediction performance; significant features are selected using t-test. MVPA uses machine learning algorithms to classify different brain states and helps in prediction during the task. General linear model (GLM) is used to find the unknown parameters of every individual voxel and the classification is done using multi-class support vector machine (SVM). MVPA-CNN based proposed algorithm is compared with region of interest (ROI) based method and MVPA based estimated values. The proposed method showed better overall accuracy (68.6%) compared to ROI (61.88%) and estimation values (64.17%).
Multivariate analysis of ATR-FTIR spectra for assessment of oil shale organic geochemical properties
Washburn, Kathryn E.; Birdwell, Justin E.
2013-01-01
In this study, attenuated total reflectance (ATR) Fourier transform infrared spectroscopy (FTIR) was coupled with partial least squares regression (PLSR) analysis to relate spectral data to parameters from total organic carbon (TOC) analysis and programmed pyrolysis to assess the feasibility of developing predictive models to estimate important organic geochemical parameters. The advantage of ATR-FTIR over traditional analytical methods is that source rocks can be analyzed in the laboratory or field in seconds, facilitating more rapid and thorough screening than would be possible using other tools. ATR-FTIR spectra, TOC concentrations and Rock–Eval parameters were measured for a set of oil shales from deposits around the world and several pyrolyzed oil shale samples. PLSR models were developed to predict the measured geochemical parameters from infrared spectra. Application of the resulting models to a set of test spectra excluded from the training set generated accurate predictions of TOC and most Rock–Eval parameters. The critical region of the infrared spectrum for assessing S1, S2, Hydrogen Index and TOC consisted of aliphatic organic moieties (2800–3000 cm−1) and the models generated a better correlation with measured values of TOC and S2 than did integrated aliphatic peak areas. The results suggest that combining ATR-FTIR with PLSR is a reliable approach for estimating useful geochemical parameters of oil shales that is faster and requires less sample preparation than current screening methods.
Tsai, Jason S-H; Hsu, Wen-Teng; Lin, Long-Guei; Guo, Shu-Mei; Tann, Joseph W
2014-01-01
A modified nonlinear autoregressive moving average with exogenous inputs (NARMAX) model-based state-space self-tuner with fault tolerance is proposed in this paper for the unknown nonlinear stochastic hybrid system with a direct transmission matrix from input to output. Through the off-line observer/Kalman filter identification method, one has a good initial guess of the modified NARMAX model to reduce the on-line system identification process time. Then, based on the modified NARMAX-based system identification, a corresponding adaptive digital control scheme is presented for the unknown continuous-time nonlinear system, with an input-output direct transmission term, which also has measurement and system noises and inaccessible system states. In addition, an effective state-space self-tuner with fault tolerance scheme is presented for the unknown multivariable stochastic system. A quantitative criterion is suggested by comparing the innovation process error estimated by the Kalman filter estimation algorithm, so that a weighting matrix resetting technique, which adjusts and resets the covariance matrices of the parameter estimates obtained by the Kalman filter estimation algorithm, is utilized to achieve the parameter estimation for faulty system recovery. Consequently, the proposed method can effectively cope with partially abrupt and/or gradual system faults and input failures by the fault detection. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Baldwin, D.; Manfreda, S.; Keller, K.; Smithwick, E. A. H.
2017-03-01
Satellite-based near-surface (0-2 cm) soil moisture estimates have global coverage, but do not capture variations of soil moisture in the root zone (up to 100 cm depth) and may be biased with respect to ground-based soil moisture measurements. Here, we present an ensemble Kalman filter (EnKF) hydrologic data assimilation system that predicts bias in satellite soil moisture data to support the physically based Soil Moisture Analytical Relationship (SMAR) infiltration model, which estimates root zone soil moisture with satellite soil moisture data. The SMAR-EnKF model estimates a regional-scale bias parameter using available in situ data. The regional bias parameter is added to satellite soil moisture retrievals before their use in the SMAR model, and the bias parameter is updated continuously over time with the EnKF algorithm. In this study, the SMAR-EnKF assimilates in situ soil moisture at 43 Soil Climate Analysis Network (SCAN) monitoring locations across the conterminous U.S. Multivariate regression models are developed to estimate SMAR parameters using soil physical properties and the moderate resolution imaging spectroradiometer (MODIS) evapotranspiration data product as covariates. SMAR-EnKF root zone soil moisture predictions are in relatively close agreement with in situ observations when using optimal model parameters, with root mean square errors averaging 0.051 [cm³ cm⁻³] (standard error, s.e. = 0.005). The average root mean square error associated with a 20-fold cross-validation analysis with permuted SMAR parameter regression models increases moderately (0.082 [cm³ cm⁻³], s.e. = 0.004). The expected regional-scale satellite correction bias is negative in four out of six ecoregions studied (mean = -0.12 [-], s.e. = 0.002), excluding the Great Plains and Eastern Temperate Forests (0.053 [-], s.e. = 0.001). With its capability of estimating regional-scale satellite bias, the SMAR-EnKF system can predict root zone soil moisture over broad extents and has applications in drought predictions and other operational hydrologic modeling purposes.
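The idea of an additive bias parameter that is updated over time as in situ data arrive can be illustrated with a scalar Kalman filter. This is a deliberately simplified stand-in, not the ensemble Kalman filter or the SMAR model used in the study: it assumes the bias follows a random walk and that each difference between an in situ and a satellite value is a noisy observation of that bias. All names and numbers are hypothetical.

```python
from typing import Tuple


def update_bias(bias: float, variance: float,
                satellite: float, in_situ: float,
                obs_variance: float,
                process_variance: float = 0.0) -> Tuple[float, float]:
    """One predict/update cycle for an additive satellite bias parameter."""
    # Predict: the bias is modeled as a random walk, so only its variance grows.
    prior_variance = variance + process_variance
    # Innovation: observed bias (in situ minus satellite) minus the current estimate.
    innovation = (in_situ - satellite) - bias
    gain = prior_variance / (prior_variance + obs_variance)
    new_bias = bias + gain * innovation
    new_variance = (1.0 - gain) * prior_variance
    return new_bias, new_variance


def correct_retrieval(satellite: float, bias: float) -> float:
    """Add the estimated bias to a satellite retrieval before further use."""
    return satellite + bias


# Hypothetical values: the satellite reads 0.30 where the ground probe reads 0.25.
bias, variance = update_bias(0.0, 1.0, satellite=0.30, in_situ=0.25, obs_variance=1.0)
corrected = correct_retrieval(0.30, bias)
```

An ensemble filter replaces the single variance with the spread of an ensemble of bias values, but the gain-weighted correction toward the observed difference follows the same pattern.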
Simulating the effect of non-linear mode coupling in cosmological parameter estimation
NASA Astrophysics Data System (ADS)
Kiessling, A.; Taylor, A. N.; Heavens, A. F.
2011-09-01
Fisher Information Matrix methods are commonly used in cosmology to estimate the accuracy with which cosmological parameters can be measured by a given experiment and to optimize the design of experiments. However, the standard approach usually assumes both data and parameter estimates are Gaussian-distributed. Further, for survey forecasts and optimization it is usually assumed that the power-spectrum covariance matrix is diagonal in Fourier space. However, in the low-redshift Universe, non-linear mode coupling will tend to correlate small-scale power, moving information from lower to higher order moments of the field. This movement of information will change the predictions of cosmological parameter accuracy. In this paper we quantify this loss of information by comparing naïve Gaussian Fisher matrix forecasts with a maximum likelihood parameter estimation analysis of a suite of mock weak lensing catalogues derived from N-body simulations, based on the SUNGLASS pipeline, for a 2D and tomographic shear analysis of a Euclid-like survey. In both cases, we find that the 68 per cent confidence area of the Ωm-σ8 plane increases by a factor of 5. However, the marginal errors increase by just 20-40 per cent. We propose a new method to model the effects of non-linear shear-power mode coupling in the Fisher matrix by approximating the shear-power distribution as a multivariate Gaussian with a covariance matrix derived from the mock weak lensing survey. We find that this approximation can reproduce the 68 per cent confidence regions of the full maximum likelihood analysis in the Ωm-σ8 plane to high accuracy for both 2D and tomographic weak lensing surveys. Finally, we perform a multiparameter analysis of Ωm, σ8, h, ns, w0 and wa to compare the Gaussian and non-linear mode-coupled Fisher matrix contours. The 6D volume of the 1σ error contours for the non-linear Fisher analysis is a factor of 3 larger than for the Gaussian case, and the shape of the 68 per cent confidence volume is modified. We propose that future Fisher matrix estimates of cosmological parameter accuracies should include mode-coupling effects.
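The two quantities compared in this abstract, marginal errors and the area of the 68 per cent confidence region, both follow from the inverse of the Fisher matrix under the Gaussian assumption. The sketch below shows the standard relations for a two-parameter case. The Fisher matrix values are invented for illustration, and the factor 2.30 is the approximate chi-square increment enclosing 68.3 per cent of the probability for two parameters.

```python
import math
from typing import List, Sequence, Tuple


def invert_2x2(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Inverse of a 2x2 matrix (the parameter covariance, given a Fisher matrix)."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def marginal_errors(fisher: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """1-sigma marginal errors: square roots of the diagonal of the inverse Fisher matrix."""
    cov = invert_2x2(fisher)
    return math.sqrt(cov[0][0]), math.sqrt(cov[1][1])


def confidence_area(fisher: Sequence[Sequence[float]], delta_chi2: float = 2.30) -> float:
    """Area of the confidence ellipse: pi * delta_chi2 * sqrt(det(covariance))."""
    cov = invert_2x2(fisher)
    det_cov = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return math.pi * delta_chi2 * math.sqrt(det_cov)


# Hypothetical Fisher matrix for two uncorrelated parameters.
fisher = ((4.0, 0.0), (0.0, 1.0))
errors = marginal_errors(fisher)
area = confidence_area(fisher)
```

Because the area depends on the determinant of the covariance while the marginal errors depend only on its diagonal, a change in the parameter correlation can enlarge the ellipse much more than it enlarges either marginal error.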
Generating Multivariate Ordinal Data via Entropy Principles.
Lee, Yen; Kaplan, David
2018-03-01
When conducting robustness research where the focus of attention is on the impact of non-normality, the marginal skewness and kurtosis are often used to set the degree of non-normality. Monte Carlo methods are commonly applied to conduct this type of research by simulating data from distributions with skewness and kurtosis constrained to pre-specified values. Although several procedures have been proposed to simulate data from distributions with these constraints, no corresponding procedures have been applied for discrete distributions. In this paper, we present two procedures based on the principles of maximum entropy and minimum cross-entropy to estimate the multivariate observed ordinal distributions with constraints on skewness and kurtosis. For these procedures, the correlation matrix of the observed variables is not specified but depends on the relationships between the latent response variables. With the estimated distributions, researchers can study robustness not only focusing on the levels of non-normality but also on the variations in the distribution shapes. A simulation study demonstrates that these procedures yield excellent agreement between specified parameters and those of estimated distributions. A robustness study concerning the effect of distribution shape in the context of confirmatory factor analysis shows that shape can affect the robust [Formula: see text] and robust fit indices, especially when the sample size is small, the data are severely non-normal, and the fitted model is complex.
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al., and local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood. At stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood. At stage 2, we estimate the dependency structure by fixing the margins at the estimated ones. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte-Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness of fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer generated data.
Joint Adaptive Mean-Variance Regularization and Variance Stabilization of High Dimensional Data.
Dazard, Jean-Eudes; Rao, J Sunil
2012-07-01
The paper addresses a common problem in the analysis of high-dimensional high-throughput "omics" data, which is parameter estimation across multiple variables in a set of data where the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise tests statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that : (i) it employs a novel "similarity statistic"-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters, (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derived regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, or regular common value-shrinkage estimators, or when the information contained in the sample mean is simply ignored. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called 'MVR' ('Mean-Variance Regularization'), downloadable from the CRAN website.
Joint Adaptive Mean-Variance Regularization and Variance Stabilization of High Dimensional Data
Dazard, Jean-Eudes; Rao, J. Sunil
2012-01-01
The paper addresses a common problem in the analysis of high-dimensional high-throughput “omics” data, which is parameter estimation across multiple variables in a set of data where the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise tests statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that : (i) it employs a novel “similarity statistic”-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters, (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derived regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, or regular common value-shrinkage estimators, or when the information contained in the sample mean is simply ignored. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called ‘MVR’ (‘Mean-Variance Regularization’), downloadable from the CRAN website. PMID:22711950
Ellington, Sascha R; Devine, Owen; Bertolli, Jeanne; Martinez Quiñones, Alma; Shapiro-Mendoza, Carrie K; Perez-Padilla, Janice; Rivera-Garcia, Brenda; Simeone, Regina M; Jamieson, Denise J; Valencia-Prado, Miguel; Gilboa, Suzanne M; Honein, Margaret A; Johansson, Michael A
2016-10-01
Zika virus (ZIKV) infection during pregnancy is a cause of congenital microcephaly and severe fetal brain defects, and it has been associated with other adverse pregnancy and birth outcomes. To estimate the number of pregnant women infected with ZIKV in Puerto Rico and the number of associated congenital microcephaly cases. We conducted a modeling study from April to July 2016. Using parameters derived from published reports, outcomes were modeled probabilistically using Monte Carlo simulation. We used uncertainty distributions to reflect the limited information available for parameter values. Given the high level of uncertainty in model parameters, interquartile ranges (IQRs) are presented as primary results. Outcomes were modeled for pregnant women in Puerto Rico, which currently has more confirmed ZIKV cases than any other US location. Zika virus infection in pregnant women. Number of pregnant women infected with ZIKV and number of congenital microcephaly cases. We estimated an IQR of 5900 to 10 300 pregnant women (median, 7800) might be infected during the initial ZIKV outbreak in Puerto Rico. Of these, an IQR of 100 to 270 infants (median, 180) may be born with microcephaly due to congenital ZIKV infection from mid-2016 to mid-2017. In the absence of a ZIKV outbreak, an IQR of 9 to 16 cases (median, 12) of congenital microcephaly are expected in Puerto Rico per year. The estimate of 5900 to 10 300 pregnant women that might be infected with ZIKV provides an estimate for the number of infants that could potentially have ZIKV-associated adverse outcomes. Including baseline cases of microcephaly, we estimated that an IQR of 110 to 290 total cases of congenital microcephaly, mostly attributable to ZIKV infection, could occur from mid-2016 to mid-2017 in the absence of effective interventions. The primary limitation in this analysis is uncertainty in model parameters. 
Multivariate sensitivity analyses indicated that the cumulative incidence of ZIKV infection and risk of microcephaly given maternal infection in the first trimester were the primary drivers of both magnitude and uncertainty in the estimated number of microcephaly cases. Increased information on these parameters would lead to more precise estimates. Nonetheless, the results underscore the need for urgent actions being undertaken in Puerto Rico to prevent congenital ZIKV infection and prepare for affected infants.
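The probabilistic modeling described above (uncertainty distributions on the inputs, Monte Carlo propagation, interquartile ranges as the headline result) can be illustrated with a minimal sketch. Every numeric input below, including the cohort size and the ranges for infection incidence and microcephaly risk, is a hypothetical placeholder rather than a parameter from the study, and the uniform distributions are likewise an assumption.

```python
import random
import statistics

def simulate_cases(n_draws=10_000, seed=0):
    """Propagate parameter uncertainty by Monte Carlo and summarise with an IQR.

    All numeric inputs are hypothetical placeholders, not values from the study.
    """
    rng = random.Random(seed)
    pregnancies = 30_000                      # hypothetical cohort size
    draws = []
    for _ in range(n_draws):
        incidence = rng.uniform(0.10, 0.40)   # hypothetical infection incidence
        risk = rng.uniform(0.01, 0.05)        # hypothetical risk given infection
        draws.append(pregnancies * incidence * risk)
    # quantiles(n=4) returns the three quartile cut points
    q1, median, q3 = statistics.quantiles(draws, n=4)
    return q1, median, q3

q1, median, q3 = simulate_cases()
print(f"IQR {q1:.0f} to {q3:.0f} (median {median:.0f})")
```

Reporting the quartiles rather than a mean and standard deviation mirrors the abstract's choice of IQRs when the input parameters are highly uncertain.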
A LiDAR data-based camera self-calibration method
NASA Astrophysics Data System (ADS)
Xu, Lijun; Feng, Jing; Li, Xiaolu; Chen, Jianjun
2018-07-01
To find the intrinsic parameters of a camera, a LiDAR data-based camera self-calibration method is presented here. Parameters have been estimated using particle swarm optimization (PSO), enhancing the optimal solution of a multivariate cost function. The main procedure of camera intrinsic parameter estimation has three parts, which include extraction and fine matching of interest points in the images, establishment of cost function, based on Kruppa equations and optimization of PSO using LiDAR data as the initialization input. To improve the precision of matching pairs, a new method of maximal information coefficient (MIC) and maximum asymmetry score (MAS) was used to remove false matching pairs based on the RANSAC algorithm. Highly precise matching pairs were used to calculate the fundamental matrix so that the new cost function (deduced from Kruppa equations in terms of the fundamental matrix) was more accurate. The cost function involving four intrinsic parameters was minimized by PSO for the optimal solution. To overcome the issue of optimization pushed to a local optimum, LiDAR data was used to determine the scope of initialization, based on the solution to the P4P problem for camera focal length. To verify the accuracy and robustness of the proposed method, simulations and experiments were implemented and compared with two typical methods. Simulation results indicated that the intrinsic parameters estimated by the proposed method had absolute errors less than 1.0 pixel and relative errors smaller than 0.01%. Based on ground truth obtained from a meter ruler, the distance inversion accuracy in the experiments was smaller than 1.0 cm. Experimental and simulated results demonstrated that the proposed method was highly accurate and robust.
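The optimization step above can be sketched with a generic particle swarm optimizer. This is a textbook PSO under assumed settings (inertia 0.7, attraction weights 1.5), not the authors' implementation, and the toy quadratic cost merely stands in for the Kruppa-equation cost function, which the abstract does not spell out. The `bounds` argument plays the role of the initialization scope that the paper derives from LiDAR data.

```python
import random

def pso_minimize(cost, bounds, n_particles=30, n_iter=200, seed=0):
    """Generic particle swarm optimisation over box-constrained parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    w, c1, c2 = 0.7, 1.5, 1.5   # inertia and attraction weights (assumed defaults)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # per-particle best positions
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # move the particle and clamp it to the search box
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Toy four-parameter cost with a known minimum; a stand-in for the real cost function.
target = [1.0, 2.0, 3.0, 4.0]
toy_cost = lambda x: sum((xi - ti) ** 2 for xi, ti in zip(x, target))
solution, value = pso_minimize(toy_cost, bounds=[(-10.0, 10.0)] * 4)
```

Tightening `bounds` around a good initial guess reduces the chance of the swarm settling in a local optimum, which is the motivation the abstract gives for the LiDAR-based initialization.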
Estimation and Optimization of the Parameters Preserving the Lustre of the Fabrics
NASA Astrophysics Data System (ADS)
Prodanova, Krasimira
2009-11-01
The paper discusses optimizing the duration of the damp-heating process of a steam ironing press machine while preserving the lustre of the fabrics. To obtain high-quality damp-heating processing, it is necessary to monitor parameters such as temperature, moisture, and pressure during the process. The purpose of the present paper is to construct a mathematical model that adequately describes the technological process using multivariate data analysis. It was established that the full factorial design of type 2³ is not adequate. The research therefore proceeded with a central rotatable design of experiment. The obtained model adequately describes the technological process of damp-heating treatment in the defined factor space. The present investigation is helpful for technological improvement and modernization in sewing companies.
An adaptive Cartesian control scheme for manipulators
NASA Technical Reports Server (NTRS)
Seraji, H.
1987-01-01
An adaptive control scheme for direct control of manipulator end-effectors to achieve trajectory tracking in Cartesian space is developed. The control structure is obtained from linear multivariable theory and is composed of simple feedforward and feedback controllers and an auxiliary input. The direct adaptation laws are derived from model reference adaptive control theory and are not based on parameter estimation of the robot model. The utilization of feedforward control and the inclusion of auxiliary input are novel features of the present scheme and result in improved dynamic performance over existing adaptive control schemes. The adaptive controller does not require the complex mathematical model of the robot dynamics or any knowledge of the robot parameters or the payload, and is computationally fast for online implementation with high sampling rates.
An Improved Method to Control the Critical Parameters of a Multivariable Control System
NASA Astrophysics Data System (ADS)
Subha Hency Jims, P.; Dharmalingam, S.; Wessley, G. Jims John
2017-10-01
The role of control systems is to cope with process deficiencies and the undesirable effects of external disturbances. Most multivariable processes are highly iterative and complex in nature. Aircraft systems, modern power plants, refineries and robotic systems are a few such complex systems that involve numerous critical parameters that need to be monitored and controlled. Control of these important parameters is not only tedious and cumbersome but also crucial from environmental, safety and quality perspectives. In this paper, one such multivariable system, namely a utility boiler, has been considered. A modern power plant is a complex arrangement of pipework and machinery with numerous interacting control loops and support systems. In this paper, the calculation of controller parameters based on classical tuning concepts is presented. The controller parameters thus obtained and employed controlled the critical parameters of a boiler during fuel switching disturbances. The proposed method can be applied to control critical parameters such as the elevator, aileron, rudder, elevator trim, rudder and aileron trim, and flap control systems of aircraft.
Streibel, T; Nordsieck, H; Neuer-Etscheidt, K; Schnelle-Kreis, J; Zimmermann, R
2007-04-01
On-line detectable indicator parameters in the flue gas of municipal solid waste incinerators (MSWI) such as chlorinated benzenes (PCBz) are well known surrogate compounds for gas-phase PCDD/PCDF concentration. In the work presented here, the derivation of indicators is broadened to the detection of fly and boiler ash fractions with increased PCDD/PCDF content. Subsequently these fractions could be subject to further treatment such as recirculation in the combustion chamber to destroy their PCDD/PCDF and other organic pollutants' content. The aim of this work was to detect suitable on-line detectable indicator parameters in the gas phase that are well correlated to PCDD/PCDF concentration in the solid residues. For this, solid residues and gas-phase samples were taken at three MSWI plants in Bavaria. Analysis of the ash content from different plants yielded a broad variation range of PCDD/PCDF concentrations, especially after disturbed combustion conditions. Even during normal operation conditions significantly increased PCDD/PCDF concentrations may occur after unanticipated disturbances. Statistical evaluation of gas phase and ash measurements was carried out by means of principal component analysis and uni- and multivariate correlation analysis. Surprisingly, well known indicators for gas-phase PCDD/PCDF concentration such as polychlorinated benzenes and phenols proved to be insufficiently correlated to PCDD/PCDF content of the solid residues. Moreover, no single parameter alone was found appropriate to describe the PCDD/PCDF content of fly and boiler ashes. On the other hand, multivariate fitting of three or four parameters yielded convenient correlation coefficients of at least r=0.8 for every investigated case. Here, including plant operation parameters such as temperatures and air flow alongside concentrations of inorganic compounds in the gas phase (HCl, CO, SO2, NOx) gave the best results. However, the set of parameters best suited for estimation of PCDD/PCDF concentration in solid residues has to be derived anew for each individual plant and type of ash.
Estimating Finite Rate of Population Increase for Sharks Based on Vital Parameters
Liu, Kwang-Ming; Chin, Chien-Pang; Chen, Chun-Hui; Chang, Jui-Han
2015-01-01
The vital parameter data for 62 stocks, covering 38 species, collected from the literature, including parameters of age, growth, and reproduction, were log-transformed and analyzed using multivariate analyses. Three groups were identified and empirical equations were developed for each to describe the relationships between the predicted finite rates of population increase (λ’) and the vital parameters, maximum age (Tmax), age at maturity (Tm), annual fecundity (f/Rc), size at birth (Lb), size at maturity (Lm), and asymptotic length (L∞). Group (1) included species with slow growth rates (0.034 yr⁻¹ < k < 0.103 yr⁻¹) and extended longevity (26 yr < Tmax < 81 yr), e.g., shortfin mako Isurus oxyrinchus, dusky shark Carcharhinus obscurus, etc.; Group (2) included species with fast growth rates (0.103 yr⁻¹ < k < 0.358 yr⁻¹) and short longevity (9 yr < Tmax < 26 yr), e.g., starspotted smoothhound Mustelus manazo, gray smoothhound M. californicus, etc.; Group (3) included late maturing species (Lm/L∞ ≧ 0.75) with moderate longevity (Tmax < 29 yr), e.g., pelagic thresher Alopias pelagicus, sevengill shark Notorynchus cepedianus. The empirical equation for all data pooled was also developed. The λ’ values estimated by these empirical equations showed good agreement with those calculated using conventional demographic analysis. The predictability was further validated by an independent data set of three species. The empirical equations developed in this study not only reduce the uncertainties in estimation but also account for the difference in life history among groups. This method therefore provides an efficient and effective approach to the implementation of precautionary shark management measures. PMID:26576058
Okuyucu, Kursat; Ozaydın, Sukru; Alagoz, Engin; Ozgur, Gokhan; Oysul, Fahrettin Guven; Ozmen, Ozlem; Tuncel, Murat; Ozturk, Mustafa; Arslan, Nuri
2016-01-01
Abstract Background Non-Hodgkin’s lymphomas arising from the tissues other than primary lymphatic organs are named primary extranodal lymphoma. Most of the studies evaluated metabolic tumor parameters in different organs and histopathologic variants of this disease generally for treatment response. We aimed to evaluate the prognostic value of metabolic tumor parameters derived from initial FDG-PET/CT in patients with a medley of primary extranodal lymphoma in this study. Patients and methods There were 67 patients with primary extranodal lymphoma for whom FDG-PET/CT was requested for primary staging. Quantitative PET/CT parameters: maximum standardized uptake value (SUVmax), average standardized uptake value (SUVmean), metabolic tumor volume (MTV) and total lesion glycolysis (TLG) were used to estimate disease-free survival and overall survival. Results SUVmean, MTV and TLG were found statistically significant after multivariate analysis. SUVmean remained significant after ROC curve analysis. Sensitivity and specificity were calculated as 88% and 64%, respectively, when the cut-off value of SUVmean was chosen as 5.15. After the investigation of primary presentation sites and histo-pathological variants according to recurrence, there is no difference amongst the variants. Primary site of extranodal lymphomas however, is statistically important (p = 0.014). Testis and central nervous system lymphomas have higher recurrence rate (62.5%, 73%, respectively). Conclusions High SUVmean, MTV and TLG values obtained from primary staging FDG-PET/CT are potential risk factors for both disease-free survival and overall survival in primary extranodal lymphoma. SUVmean is the most significant one amongst them for estimating recurrence/metastasis. PMID:27904443
ERIC Educational Resources Information Center
Vallejo, Guillermo; Fidalgo, Angel; Fernandez, Paula
2001-01-01
Estimated empirical Type I error rate and power rate for three procedures for analyzing multivariate repeated measures designs: (1) the doubly multivariate model; (2) the Welch-James multivariate solution (H. Keselman, M. Carriere, and L. Lix, 1993); and (3) the multivariate version of the modified Brown-Forsythe procedure (M. Brown and A.…
NASA Astrophysics Data System (ADS)
Yan, Ying; Zhang, Shen; Tang, Jinjun; Wang, Xiaofei
2017-07-01
Discovering dynamic characteristics in traffic flow is a significant step in designing effective traffic management and control strategies for relieving congestion in urban cities. A new method based on complex network theory is proposed to study multivariate traffic flow time series. The data were collected from loop detectors on a freeway over one year. In order to construct a complex network from the original traffic flow, a weighted Frobenius norm is adopted to estimate the similarity between multivariate time series, and Principal Component Analysis is implemented to determine the weights. We discuss how to select the optimal critical threshold for networks at different hours in terms of the cumulative probability distribution of degree. Furthermore, two statistical properties of networks, normalized network structure entropy and cumulative probability of degree, are utilized to explore hourly variation in traffic flow. The results demonstrate that these two statistical quantities express a pattern similar to that of traffic flow parameters, with morning and evening peak hours. Accordingly, we detect three traffic states (trough, peak and transitional hours) according to the correlation between the two aforementioned properties. The classification of states can represent hourly fluctuation in traffic flow, as shown by analyzing annual average hourly values of traffic volume, occupancy and speed in the corresponding hours.
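The similarity measure mentioned above can be sketched as a weighted Frobenius norm of the difference between two multivariate series. The exact weighting scheme of the paper is not given in the abstract, so the per-variable form below is an assumption, and the weights and readings are hypothetical values chosen for illustration (in the paper the weights come from Principal Component Analysis).

```python
import math

def weighted_frobenius_distance(a, b, weights):
    """Weighted Frobenius norm of (a - b).

    a, b: lists of rows (time steps); each row holds one value per variable.
    weights: one non-negative weight per variable (assumed form).
    """
    total = 0.0
    for row_a, row_b in zip(a, b):
        for w, x, y in zip(weights, row_a, row_b):
            total += w * (x - y) ** 2
    return math.sqrt(total)

# Hypothetical readings: columns are volume, occupancy, speed (arbitrary units).
series_a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
series_b = [[1.0, 2.0, 3.0], [4.0, 5.0, 8.0]]
weights = [0.5, 0.3, 0.2]   # hypothetical; not PCA-derived here
distance = weighted_frobenius_distance(series_a, series_b, weights)
```

Two series would then be linked in the network when their distance falls below the chosen critical threshold.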
Bili, Eleni; Dampala, Kaliopi; Iakovou, Ioannis; Tsolakidis, Dimitrios; Giannakou, Anastasia; Tarlatzis, Basil C
2014-08-01
The aim of this study was to determine the performance of prostate specific antigen (PSA) and ultrasound parameters, such as ovarian volume and outline, in the diagnosis of polycystic ovary syndrome (PCOS). This prospective, observational, case-controlled study included 43 women with PCOS, and 40 controls. Between day 3 and 5 of the menstrual cycle, fasting serum samples were collected and transvaginal ultrasound was performed. The diagnostic performance of each parameter [total PSA (tPSA), total-to-free PSA ratio (tPSA:fPSA), ovarian volume, ovarian outline] was estimated by means of receiver operating characteristic (ROC) analysis, along with area under the curve (AUC), threshold, sensitivity, specificity as well as positive (+) and negative (-) likelihood ratios (LRs). Multivariate logistic regression models, using ovarian volume and ovarian outline, were constructed. The tPSA and tPSA:fPSA ratio resulted in AUC of 0.74 and 0.70, respectively, with moderate specificity/sensitivity and insufficient LR+/- values. In the multivariate logistic regression model, the combination of ovarian volume and outline had a sensitivity of 97.7% and a specificity of 97.5% in the diagnosis of PCOS, with +LR and -LR values of 39.1 and 0.02, respectively. In women with PCOS, tPSA and tPSA:fPSA ratio have similar diagnostic performance. The use of a multivariate logistic regression model, incorporating ovarian volume and outline, offers very good diagnostic accuracy in distinguishing women with PCOS from controls. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
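The likelihood ratios quoted above follow from the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A short check, using only the figures stated in the abstract:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Sensitivity 97.7% and specificity 97.5%, as reported for the combined model.
lr_pos, lr_neg = likelihood_ratios(0.977, 0.975)
print(round(lr_pos, 1), round(lr_neg, 2))  # prints: 39.1 0.02
```

The result agrees with the +LR of 39.1 and -LR of 0.02 reported for the ovarian volume and outline model.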
Rehder, Sönke; Wu, Jian X; Laackmann, Julian; Moritz, Hans-Ulrich; Rantanen, Jukka; Rades, Thomas; Leopold, Claudia S
2013-01-23
The objective of this study was to monitor the amorphous-to-crystalline solid-state phase transformation kinetics of the model drug ibuprofen with spectroscopic methods during acoustic levitation. Chemical and physical information was obtained by real-time near infrared (NIRS) and Raman spectroscopy measurements. The recrystallisation kinetic parameters (overall recrystallisation rate constant β and the time needed to reach 50% of the equilibrated level t(50)), were determined using a multivariate curve resolution approach. The acoustic levitation device coupled with non-invasive spectroscopy enabled monitoring of the recrystallisation process of the difficult-to-handle (adhesive) amorphous sample. The application of multivariate curve resolution enabled isolation of the underlying pure spectra, which corresponded well with the reference spectra of amorphous and crystalline ibuprofen. The recrystallisation kinetic parameters were estimated from the recrystallisation profiles. While the empirical recrystallisation rate constant determined by NIR and Raman spectroscopy were comparable, the lag time for recrystallisation was significantly lower with Raman spectroscopy as compared to NIRS. This observation was explained by the high energy density of the Raman laser beam, which might have led to local heating effects of the sample and thus reduced the recrystallisation onset time. It was concluded that acoustic levitation with NIR and Raman spectroscopy combined with multivariate curve resolution allowed direct determination of the recrystallisation kinetics of amorphous drugs and thus is a promising technique for monitoring solid-state phase transformations of adhesive small-sized samples during the early phase of drug development. Copyright © 2012 Elsevier B.V. All rights reserved.
Using cystoscopy to segment bladder tumors with a multivariate approach in different color spaces.
Freitas, Nuno R; Vieira, Pedro M; Lima, Estevao; Lima, Carlos S
2017-07-01
Nowadays the diagnosis of bladder lesions relies upon cystoscopy examination and depends on the interpreter's experience. State-of-the-art bladder tumor identification methods are based on 3D reconstruction, using CT images (virtual cystoscopy) or images in which the structures are enhanced with the use of pigmentation, but none uses white-light cystoscopy images. An initial attempt to automatically identify tumoral tissue was already developed by the authors, and this paper develops that idea further. Processing of traditional cystoscopy images has a huge potential to improve early tumor detection and allow more effective treatment. This paper describes a multivariate approach to the segmentation of bladder cystoscopy images, which will be used for automatic detection and to support physician diagnosis. Each region can be assumed to follow a normal distribution with specific parameters, leading to the assumption that the distribution of intensities is a Gaussian Mixture Model (GMM). Regions of high-grade and low-grade tumors usually appear with higher intensity than normal regions. This paper proposes a Maximum a Posteriori (MAP) approach based on pixel intensities read simultaneously in different color channels from the RGB, HSV and CIELab color spaces. The Expectation-Maximization (EM) algorithm is used to estimate the best multivariate GMM parameters. Experimental results show that the proposed method segments bladder tumors into two classes more efficiently in RGB, even in cases where the tumor shape is not well defined. Results also show that the elimination of component L from the CIELab color space does not allow definition of the tumor shape.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.
Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin
2018-03-08
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N )) model based on the nonlinear least square (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N ) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N ) and the NLS-based TNGM (1, N ) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N ) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM (1, N ) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
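The two-stage idea above can be sketched in a simplified univariate setting: fit by ordinary least squares, smooth the squared residuals with a local linear (degree-1 local polynomial) fit to estimate the variance function, then refit with weights equal to the inverse estimated variances. This is an illustrative reduction under assumptions (one covariate, Gaussian kernel, fixed bandwidth, a variance floor), not the authors' multivariate estimator, and the simulated data are hypothetical.

```python
import math
import random

def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def local_linear(x0, x, z, bandwidth):
    """Local linear estimate of E[z | x = x0] with a Gaussian kernel."""
    k = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    a, b = wls_line(x, z, k)
    return a + b * x0

def two_stage_fit(x, y, bandwidth=0.1):
    """OLS, then nonparametric variance estimate, then weighted refit."""
    a0, b0 = wls_line(x, y, [1.0] * len(x))              # stage 1: OLS
    res2 = [(yi - a0 - b0 * xi) ** 2 for xi, yi in zip(x, y)]
    floor = 0.05 * sum(res2) / len(res2)                  # keep variances positive
    var_hat = [max(local_linear(xi, x, res2, bandwidth), floor) for xi in x]
    return wls_line(x, y, [1.0 / v for v in var_hat])     # stage 2: weighted fit

# Hypothetical data: true line y = 1 + 2x, noise standard deviation grows with x.
rng = random.Random(0)
x = [rng.random() for _ in range(500)]
y = [1.0 + 2.0 * xi + (0.2 + xi) * rng.gauss(0.0, 1.0) for xi in x]
intercept, slope = two_stage_fit(x, y)
```

No parametric form for the variance is supplied anywhere in the sketch, which is the point the abstract makes about not needing to know the heteroscedastic function.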
MULTIVARIATE RECEPTOR MODELS AND MODEL UNCERTAINTY. (R825173)
Estimation of the number of major pollution sources, the source composition profiles, and the source contributions are the main interests in multivariate receptor modeling. Due to lack of identifiability of the receptor model, however, the estimation cannot be...
Bansal, Ravi; Hao, Xuejun; Liu, Jun; Peterson, Bradley S.
2014-01-01
Many investigators have tried to apply machine learning techniques to magnetic resonance images (MRIs) of the brain in order to diagnose neuropsychiatric disorders. Usually the number of brain imaging measures (such as measures of cortical thickness and measures of local surface morphology) derived from the MRIs (i.e., their dimensionality) has been large (e.g. >10) relative to the number of participants who provide the MRI data (<100). Sparse data in a high dimensional space increases the variability of the classification rules that machine learning algorithms generate, thereby limiting the validity, reproducibility, and generalizability of those classifiers. The accuracy and stability of the classifiers can improve significantly if the multivariate distributions of the imaging measures can be estimated accurately. To accurately estimate the multivariate distributions using sparse data, we propose to estimate first the univariate distributions of imaging data and then combine them using a Copula to generate more accurate estimates of their multivariate distributions. We then sample the estimated Copula distributions to generate dense sets of imaging measures and use those measures to train classifiers. We hypothesize that the dense sets of brain imaging measures will generate classifiers that are stable to variations in brain imaging measures, thereby improving the reproducibility, validity, and generalizability of diagnostic classification algorithms in imaging datasets from clinical populations. In our experiments, we used both computer-generated and real-world brain imaging datasets to assess the accuracy of multivariate Copula distributions in estimating the corresponding multivariate distributions of real-world imaging data. 
Our experiments showed that diagnostic classifiers generated using imaging measures sampled from the Copula were significantly more accurate and more reproducible than were the classifiers generated using either the real-world imaging measures or their multivariate Gaussian distributions. Thus, our findings demonstrate that estimated multivariate Copula distributions can generate dense sets of brain imaging measures that can in turn be used to train classifiers, and those classifiers are significantly more accurate and more reproducible than are those generated using real-world imaging measures alone. PMID:25093634
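The densification step described in this abstract can be sketched as follows. This is a minimal illustration only, assuming a bivariate Gaussian copula with empirical marginals; the abstract does not state which copula family or marginal estimator the authors used, and every function name here is hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def normal_scores(values):
    """Map each value to a standard-normal score through its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF, interpolating linearly between order statistics."""
    pos = u * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def sample_gaussian_copula(x, y, n_samples, seed=0):
    """Draw n_samples synthetic (x, y) pairs from a fitted Gaussian copula."""
    rng = random.Random(seed)
    rho = pearson(normal_scores(x), normal_scores(y))
    resid_sd = math.sqrt(max(0.0, 1.0 - rho * rho))  # guard against rounding
    xs, ys = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + resid_sd * rng.gauss(0.0, 1.0)  # correlated normal pair
        u1, u2 = _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return samples
```

The sampled pairs keep each measure's empirical range while reproducing the rank dependence between measures, which is the property the abstract relies on when it trains classifiers on the dense synthetic set.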
Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances
NASA Astrophysics Data System (ADS)
Stähler, Simon C.; Sigloch, Karin
2016-11-01
Seismic source inversion, a central task in seismology, is concerned with the estimation of earthquake source parameters and their uncertainties. Estimating uncertainties is particularly challenging because source inversion is a non-linear problem. In a companion paper, Stähler and Sigloch (2014) developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements, a problem we address here. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D = 1 - CC of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. 
By identifying and quantifying this likelihood function, we make D and thus waveform cross-correlation measurements usable for fully probabilistic sampling strategies, in source inversion and related applications such as seismic tomography.
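As a rough illustration of the misfit measure named in this abstract, the sketch below computes D = 1 - CC for two equal-length waveforms and evaluates a log-normal log-density for D. It assumes a zero-lag, mean-removed correlation coefficient and a univariate log-normal with hypothetical parameters `mu` and `sigma`; the abstract does not give the authors' exact definition of CC or their multivariate likelihood.

```python
import math


def decorrelation(observed, modelled):
    """Return D = 1 - CC, with CC the zero-lag correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_m = sum(modelled) / n
    num = sum((a - mean_o) * (b - mean_m) for a, b in zip(observed, modelled))
    den = math.sqrt(
        sum((a - mean_o) ** 2 for a in observed)
        * sum((b - mean_m) ** 2 for b in modelled)
    )
    return 1.0 - num / den


def lognormal_logpdf(d, mu, sigma):
    """Log-density of a log-normal distribution evaluated at d > 0."""
    return (
        -math.log(d * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(d) - mu) ** 2 / (2.0 * sigma ** 2)
    )
```

D ranges from 0 for perfectly correlated waveforms to 2 for perfectly anti-correlated ones, so the log-density is only defined for imperfect fits with D > 0.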
NASA Technical Reports Server (NTRS)
Tripp, John S.; Tcheng, Ping
1999-01-01
Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.
Uncertainty Analysis of Instrument Calibration and Application
NASA Technical Reports Server (NTRS)
Tripp, John S.; Tcheng, Ping
1999-01-01
Experimental aerodynamic researchers require estimated precision and bias uncertainties of measured physical quantities, typically at 95 percent confidence levels. Uncertainties of final computed aerodynamic parameters are obtained by propagation of individual measurement uncertainties through the defining functional expressions. In this paper, rigorous mathematical techniques are extended to determine precision and bias uncertainties of any instrument-sensor system. Through this analysis, instrument uncertainties determined through calibration are now expressed as functions of the corresponding measurement for linear and nonlinear univariate and multivariate processes. Treatment of correlated measurement precision error is developed. During laboratory calibration, calibration standard uncertainties are assumed to be an order of magnitude less than those of the instrument being calibrated. Often calibration standards do not satisfy this assumption. This paper applies rigorous statistical methods for inclusion of calibration standard uncertainty and covariance due to the order of their application. The effects of mathematical modeling error on calibration bias uncertainty are quantified. The effects of experimental design on uncertainty are analyzed. The importance of replication is emphasized, and techniques for estimation of both bias and precision uncertainties using replication are developed. Statistical tests for stationarity of calibration parameters over time are obtained.
Nazem-Zadeh, Mohammad-Reza; Elisevich, Kost V; Schwalb, Jason M; Bagher-Ebadian, Hassan; Mahmoudi, Fariborz; Soltanian-Zadeh, Hamid
2014-12-15
Multiple modalities are used in determining laterality in mesial temporal lobe epilepsy (mTLE). It is unclear how much different imaging modalities should be weighted in decision-making. The purpose of this study is to develop response-driven multimodal multinomial models for lateralization of epileptogenicity in mTLE patients based upon imaging features in order to maximize the accuracy of noninvasive studies. The volumes, means and standard deviations of FLAIR intensity and means of normalized ictal-interictal SPECT intensity of the left and right hippocampi were extracted from preoperative images of a retrospective cohort of 45 mTLE patients with Engel class I surgical outcomes, as well as images of a cohort of 20 control, nonepileptic subjects. Using multinomial logistic function regression, the parameters of various univariate and multivariate models were estimated. Based on the Bayesian model averaging (BMA) theorem, response models were developed as compositions of independent univariate models. A BMA model composed of posterior probabilities of univariate response models of hippocampal volumes, means and standard deviations of FLAIR intensity, and means of SPECT intensity with the estimated weighting coefficients of 0.28, 0.32, 0.09, and 0.31, respectively, as well as a multivariate response model incorporating all mentioned attributes, demonstrated complete reliability by achieving a probability of detection of one with no false alarms to establish proper laterality in all mTLE patients. The proposed multinomial multivariate response-driven model provides a reliable lateralization of mesial temporal epileptogenicity including those patients who require phase II assessment. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar
2015-06-01
In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed by considering a univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], analysis has been further extended to a multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density functions for all flood variables, the concept of the copula has been introduced. Although the advancement from univariate to multivariate analyses drew considerable attention from the FFA research community, the basic limitation was that the analyses were performed using only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; however, the nonparametric distribution may not always be a good fit or capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, the potential of obtaining the best fit using nonparametric distributions might be improved because such distributions reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of conjugating the multivariate nonparametric approach with multivariate parametric and copula-based approaches, thereby resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, this approach can also be applied to regional FFA because regional estimations ideally include at-site estimations. 
The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution with a comprehensive set of parametric and nonparametric distributions for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from Allegheny River at Salamanca, New York, USA. The results show that for both univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over continental USA, which shows that for seven rivers all the flood variables followed the nonparametric Gaussian kernel, whereas for the other rivers parametric distributions provide the best fit for one or two flood variables. Thus the summary of results shows that the nonparametric method cannot substitute for the parametric and copula-based approaches, but should be considered during any at-site FFA to provide the broadest choices for best estimation of the flood return periods.
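Step (iv) of the framework above can be illustrated with the standard copula-based "OR" and "AND" joint return periods. This is a sketch under stated assumptions: the Gumbel-Hougaard copula is chosen only as an example (the abstract does not name the copula families tested), annual maxima are assumed so the mean interarrival time defaults to one year, and the function names are hypothetical.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta >= 1, theta = 1 is independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, c_uv, mean_interarrival=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u and v.

    T_or:  return period of the event that either variable exceeds its threshold.
    T_and: return period of the event that both variables exceed their thresholds.
    """
    t_or = mean_interarrival / (1.0 - c_uv)
    t_and = mean_interarrival / (1.0 - u - v + c_uv)
    return t_or, t_and


# Example: two flood variables, each at its 10-year level (u = v = 0.9),
# under independence (theta = 1).
c = gumbel_copula(0.9, 0.9, 1.0)
t_or, t_and = joint_return_periods(0.9, 0.9, c)
```

Stronger dependence (larger theta) pulls the two joint return periods toward the univariate value, which is why ignoring dependence between peak flow, volume and duration can misstate design risk.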
Effect of sexual steroids on boar kinematic sperm subpopulations.
Ayala, E M E; Aragón, M A
2017-11-01
Here, we show the effects of the sexual steroids progesterone, testosterone, and estradiol on motility parameters of boar sperm. Sixteen commercial seminal doses, four each of four adult boars, were analyzed using computer assisted sperm analysis (CASA). Mean values of motility parameters were analyzed by bivariate and multivariate statistics. Principal component analysis (PCA), followed by hierarchical clustering, was applied on data of motility parameters, provided automatically as intervals by the CASA system. Effects of sexual steroids were described in the kinematic subpopulations identified from multivariate statistics. Mean values of motility parameters were not significantly changed after addition of sexual steroids. Multivariate graphics showed that sperm subpopulations were not sensitive to the addition of either testosterone or estradiol, but sperm subpopulations responsive to progesterone were found. Distributions of motility parameters were wide in controls but sharpened at distinct concentrations of progesterone. We conclude that kinematic sperm subpopulations responsive to progesterone are present in boar semen, and these subpopulations are masked in evaluations of mean values of motility parameters. © 2017 International Society for Advancement of Cytometry.
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
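For orientation, the univariate DerSimonian and Laird method of moments, to which the abstract says the proposed method reduces in a single dimension, can be sketched as below. This shows only the classical univariate estimator, not the authors' multivariate extension; the function name is hypothetical, and at least two studies are assumed.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird estimate of between-study variance.

    effects:   per-study effect estimates y_i
    variances: per-study within-study variances s_i^2
    Returns (tau2, pooled_effect) for the random-effects model.
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]           # fixed-effect weights
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic around the fixed-effect mean
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    denom = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / denom)            # truncate at zero
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return tau2, pooled
```

The truncation at zero keeps the variance estimate admissible when the observed heterogeneity Q falls below its expectation of k - 1 under homogeneity.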
Rovadoscki, Gregori A; Petrini, Juliana; Ramirez-Diaz, Johanna; Pertile, Simone F N; Pertille, Fábio; Salvian, Mayara; Iung, Laiza H S; Rodriguez, Mary Ana P; Zampar, Aline; Gaya, Leila G; Carvalho, Rachel S B; Coelho, Antonio A D; Savino, Vicente J M; Coutinho, Luiz L; Mourão, Gerson B
2016-09-01
Repeated measures from the same individual have been analyzed by using repeatability and finite dimension models under univariate or multivariate analyses. However, in the last decade, the use of random regression models for genetic studies with longitudinal data has become more common. Thus, the aim of this research was to estimate genetic parameters for body weight of four experimental chicken lines by using univariate random regression models. Body weight data from hatching to 84 days of age (n = 34,730) from four experimental free-range chicken lines (7P, Caipirão da ESALQ, Caipirinha da ESALQ and Carijó Barbado) were used. The analysis model included the fixed effects of contemporary group (gender and rearing system), fixed regression coefficients for age at measurement, and random regression coefficients for permanent environmental effects and additive genetic effects. Heterogeneous variances for residual effects were considered, and one residual variance was assigned for each of six subclasses of age at measurement. Random regression curves were modeled by using Legendre polynomials of the second and third orders, with the best model chosen based on the Akaike Information Criterion, Bayesian Information Criterion, and restricted maximum likelihood. Multivariate analyses under the same animal mixed model were also performed for the validation of the random regression models. The Legendre polynomials of second order were better for describing the growth curves of the lines studied. Moderate to high heritabilities (h² = 0.15 to 0.98) were estimated for body weight between one and 84 days of age, suggesting that body weight at any age can be used as a selection criterion. Genetic correlations among body weight records obtained through multivariate analyses ranged from 0.18 to 0.96, 0.12 to 0.89, 0.06 to 0.96, and 0.28 to 0.96 in 7P, Caipirão da ESALQ, Caipirinha da ESALQ, and Carijó Barbado chicken lines, respectively. 
Results indicate that genetic gain for body weight can be achieved by selection. Also, selection for body weight at 42 days of age can be maintained as a selection criterion. © 2016 Poultry Science Association Inc.
A general diagnostic model applied to language testing data.
von Davier, Matthias
2008-11-01
Probabilistic models with one or more latent variables are designed to report on a corresponding number of skills or cognitive attributes. Multidimensional skill profiles offer additional information beyond what a single test score can provide, if the reported skills can be identified and distinguished reliably. Many recent approaches to skill profile models are limited to dichotomous data and have made use of computationally intensive estimation methods such as Markov chain Monte Carlo, since standard maximum likelihood (ML) estimation techniques were deemed infeasible. This paper presents a general diagnostic model (GDM) that can be estimated with standard ML techniques and applies to polytomous response variables as well as to skills with two or more proficiency levels. The paper uses one member of a larger class of diagnostic models, a compensatory diagnostic model for dichotomous and partial credit data. Many well-known models, such as univariate and multivariate versions of the Rasch model and the two-parameter logistic item response theory model, the generalized partial credit model, as well as a variety of skill profile models, are special cases of this GDM. In addition to an introduction to this model, the paper presents a parameter recovery study using simulated data and an application to real data from the field test for TOEFL Internet-based testing.
Yilmaz, Banu; Aras, Egemen; Nacar, Sinan; Kankal, Murat
2018-05-23
The functional life of a dam is often determined by the rate of sediment delivery to its reservoir. Therefore, an accurate estimate of the sediment load in rivers with dams is essential for designing and predicting a dam's useful lifespan. The most credible method is direct measurement of sediment input, but this can be very costly and it cannot always be implemented at all gauging stations. In this study, we tested various regression models to estimate suspended sediment load (SSL) at two gauging stations on the Çoruh River in Turkey, including artificial bee colony (ABC), teaching-learning-based optimization algorithm (TLBO), and multivariate adaptive regression splines (MARS). These models were also compared with one another and with classical regression analyses (CRA). Streamflow values and previously collected data of SSL were used as model inputs with predicted SSL data as output. Two different training and testing dataset configurations were used to reinforce the model accuracy. For the MARS method, the root mean square error value was found to range between 35% and 39% for the test data at the two gauging stations, which was lower than errors for other models. Error values were even lower (7% to 15%) using another dataset. Our results indicate that simultaneous measurements of streamflow with SSL provide the most effective parameter for obtaining accurate predictive models and that MARS is the most accurate model for predicting SSL. Copyright © 2017 Elsevier B.V. All rights reserved.
Dimension reduction of frequency-based direct Granger causality measures on short time series.
Siggiridou, Elsa; Kimiskidis, Vasilios K; Kugiumtzis, Dimitris
2017-09-01
The mainstream in the estimation of effective brain connectivity relies on Granger causality measures in the frequency domain. If the measure is meant to capture direct causal effects accounting for the presence of other observed variables, as in multi-channel electroencephalograms (EEG), typically the fit of a vector autoregressive (VAR) model on the multivariate time series is required. For short time series of many variables, the estimation of VAR may not be stable, requiring dimension reduction that results in restricted or sparse VAR models. The restricted VAR obtained by the modified backward-in-time selection method (mBTS) is adapted to the generalized partial directed coherence (GPDC), termed restricted GPDC (RGPDC). Dimension reduction on other frequency-based measures, such as the direct directed transfer function (dDTF), is straightforward. First, a simulation study using linear stochastic multivariate systems is conducted and RGPDC is favorably compared to GPDC on short time series in terms of sensitivity and specificity. Then the two measures are tested for their ability to detect changes in brain connectivity during an epileptiform discharge (ED) from multi-channel scalp EEG. It is shown that RGPDC identifies the connectivity structure of the simulated systems, as well as changes in the brain connectivity, better than GPDC does, and is less dependent on the free parameter of VAR order. The proposed dimension reduction in frequency measures based on VAR constitutes an appropriate strategy to reliably estimate brain networks within short time windows. Copyright © 2017 Elsevier B.V. All rights reserved.
Degeling, Koen; IJzerman, Maarten J; Koopman, Miriam; Koffijberg, Hendrik
2017-12-15
Parametric distributions based on individual patient data can be used to represent both stochastic and parameter uncertainty. Although general guidance is available on how parameter uncertainty should be accounted for in probabilistic sensitivity analysis, there is no comprehensive guidance on reflecting parameter uncertainty in the (correlated) parameters of distributions used to represent stochastic uncertainty in patient-level models. This study aims to provide this guidance by proposing appropriate methods and illustrating the impact of this uncertainty on modeling outcomes. Two approaches, 1) using non-parametric bootstrapping and 2) using multivariate Normal distributions, were applied in a simulation and case study. The approaches were compared based on point-estimates and distributions of time-to-event and health economic outcomes. To assess sample size impact on the uncertainty in these outcomes, sample size was varied in the simulation study and subgroup analyses were performed for the case-study. Accounting for parameter uncertainty in distributions that reflect stochastic uncertainty substantially increased the uncertainty surrounding health economic outcomes, illustrated by larger confidence ellipses surrounding the cost-effectiveness point-estimates and different cost-effectiveness acceptability curves. Although both approaches performed similarly for larger sample sizes (i.e. n = 500), the second approach was more sensitive to extreme values for small sample sizes (i.e. n = 25), yielding infeasible modeling outcomes. Modelers should be aware that parameter uncertainty in distributions used to describe stochastic uncertainty needs to be reflected in probabilistic sensitivity analysis, as it could substantially impact the total amount of uncertainty surrounding health economic outcomes. If feasible, the bootstrap approach is recommended to account for this uncertainty.
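The first approach named in this abstract, non-parametric bootstrapping, might look like the sketch below. It assumes, purely for illustration, that time-to-event data are described by a log-normal distribution with parameters `mu` and `sigma`; the abstract does not say which distributions the authors fitted, and the function names are hypothetical.

```python
import math
import random


def fit_lognormal(times):
    """Maximum-likelihood (mu, sigma) of a log-normal fitted to positive times."""
    logs = [math.log(t) for t in times]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def bootstrap_parameters(times, n_boot, seed=0):
    """Refit the distribution on n_boot resamples drawn with replacement.

    Each returned (mu, sigma) pair is one joint draw of the parameters, so the
    correlation between them is preserved without assuming any parametric form.
    """
    rng = random.Random(seed)
    n = len(times)
    draws = []
    for _ in range(n_boot):
        resample = [times[rng.randrange(n)] for _ in range(n)]
        draws.append(fit_lognormal(resample))
    return draws
```

In a probabilistic sensitivity analysis, each model run would then use one bootstrapped parameter pair to simulate patient-level event times, so that both parameter and stochastic uncertainty are carried through.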
ERIC Educational Resources Information Center
Haberman, Shelby J.; von Davier, Matthias; Lee, Yi-Hsuan
2008-01-01
Multidimensional item response models can be based on multivariate normal ability distributions or on multivariate polytomous ability distributions. For the case of simple structure in which each item corresponds to a unique dimension of the ability vector, some applications of the two-parameter logistic model to empirical data are employed to…
Stirrup, Oliver T; Babiker, Abdel G; Carpenter, James R; Copas, Andrew J
2016-04-30
Longitudinal data are widely analysed using linear mixed models, with 'random slopes' models particularly common. However, when modelling, for example, longitudinal pre-treatment CD4 cell counts in HIV-positive patients, the incorporation of non-stationary stochastic processes such as Brownian motion has been shown to lead to a more biologically plausible model and a substantial improvement in model fit. In this article, we propose two further extensions. Firstly, we propose the addition of a fractional Brownian motion component, and secondly, we generalise the model to follow a multivariate-t distribution. These extensions are biologically plausible, and each demonstrated substantially improved fit on application to example data from the Concerted Action on SeroConversion to AIDS and Death in Europe study. We also propose novel procedures for residual diagnostic plots that allow such models to be assessed. Cohorts of patients were simulated from the previously reported and newly developed models in order to evaluate differences in predictions made for the timing of treatment initiation under different clinical management strategies. A further simulation study was performed to demonstrate the substantial biases in parameter estimates of the mean slope of CD4 decline with time that can occur when random slopes models are applied in the presence of censoring because of treatment initiation, with the degree of bias found to depend strongly on the treatment initiation rule applied. Our findings indicate that researchers should consider more complex and flexible models for the analysis of longitudinal biomarker data, particularly when there are substantial missing data, and that the parameter estimates from random slopes models must be interpreted with caution. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, named higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide some indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-01-01
Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, named higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide some indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states. PMID:26996254
Nobashi, Tomomi; Koyasu, Sho; Nakamoto, Yuji; Kubo, Takeshi; Ishimori, Takayoshi; Kim, Young H; Yoshizawa, Akihiko; Togashi, Kaori
2016-01-01
To investigate the prognostic value of fluorine-18 fludeoxyglucose (FDG) positron emission tomography (PET) parameters for small-cell lung cancer (SCLC), according to the primary tumour location, adjusted by conventional prognostic factors. From 2008 to 2013, we enrolled consecutive patients with histologically proven SCLC, who had undergone FDG-PET/CT prior to initial therapy. The primary tumour location was categorized into central or peripheral types. PET parameters and clinical variables were evaluated using univariate and multivariate analysis. A total of 69 patients were enrolled in this study; 28 of these patients were categorized as having the central type and 41 patients as having the peripheral type. In univariate analysis, stage, serum neuron-specific enolase, whole-body metabolic tumour volume (WB-MTV) and whole-body total lesion glycolysis (WB-TLG) were found to be significant in both types of patients. In multivariate analysis, the independent prognostic factor was found to be stage in the central type, but WB-MTV and WB-TLG in the peripheral type. Kaplan-Meier analysis demonstrated that patients with peripheral type with limited disease and low WB-MTV or WB-TLG showed significantly better overall survival than all of the other groups (p < 0.0083). The FDG-PET volumetric parameters were demonstrated to be significant and independent prognostic factors in patients with peripheral type of SCLC, while stage was the only independent prognostic factor in patients with central type of SCLC. FDG-PET is a non-invasive method that could potentially be used to estimate the prognosis of patients, especially those with peripheral-type SCLC.
Borst, Jordi; Berkhemer, Olvert A; Roos, Yvo B W E M; van Bavel, Ed; van Zwam, Wim H; van Oostenbrugge, Robert J; van Walderveen, Marianne A A; Lingsma, Hester F; van der Lugt, Aad; Dippel, Diederik W J; Yoo, Albert J; Marquering, Henk A; Majoie, Charles B L M
2015-12-01
The utility of computed tomographic perfusion (CTP)-based patient selection for intra-arterial treatment of acute ischemic stroke has not been proven in randomized trials and requires further study in a cohort that was not selected based on CTP. Our objective was to study the relationship between CTP-derived parameters and outcome and treatment effect in patients with acute ischemic stroke because of a proximal intracranial arterial occlusion. We included 175 patients who underwent CTP in the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in The Netherlands (MR CLEAN). Association of CTP-derived parameters (ischemic-core volume, penumbra volume, and percentage ischemic core) with outcome was estimated with multivariable ordinal logistic regression as an adjusted odds ratio for a shift in the direction of a better outcome on the modified Rankin Scale. Interaction between CTP-derived parameters and treatment effect was determined using multivariable ordinal logistic regression. Interaction with treatment effect was also tested for mismatch (core <70 mL; penumbra core >1.2; penumbra core >10 mL). The adjusted odds ratio for improved functional outcome for ischemic core, percentage ischemic core, and penumbra were 0.79 per 10 mL (95% confidence interval: 0.71-0.89; P<0.001), 0.82 per 10% (95% confidence interval: 0.66-0.90; P=0.002), and 0.97 per 10 mL (96% confidence interval: 0.92-1.01; P=0.15), respectively. No significant interaction between any of the CTP-derived parameters and treatment effect was observed. We observed no significant interaction between mismatch and treatment effect. CTP seems useful for predicting functional outcome, but cannot reliably identify patients who will not benefit from intra-arterial therapy. © 2015 American Heart Association, Inc.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
NASA Astrophysics Data System (ADS)
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
Efficient Global Aerodynamic Modeling from Flight Data
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.
2012-01-01
A method for identifying global aerodynamic models from flight data in an efficient manner is explained and demonstrated. A novel experiment design technique was used to obtain dynamic flight data over a range of flight conditions with a single flight maneuver. Multivariate polynomials and polynomial splines were used with orthogonalization techniques and statistical modeling metrics to synthesize global nonlinear aerodynamic models directly and completely from flight data alone. Simulation data and flight data from a subscale twin-engine jet transport aircraft were used to demonstrate the techniques. Results showed that global multivariate nonlinear aerodynamic dependencies could be accurately identified using flight data from a single maneuver. Flight-derived global aerodynamic model structures, model parameter estimates, and associated uncertainties were provided for all six nondimensional force and moment coefficients for the test aircraft. These models were combined with a propulsion model identified from engine ground test data to produce a high-fidelity nonlinear flight simulation very efficiently. Prediction testing using a multi-axis maneuver showed that the identified global model accurately predicted aircraft responses.
User Selection Criteria of Airspace Designs in Flexible Airspace Management
NASA Technical Reports Server (NTRS)
Lee, Hwasoo E.; Lee, Paul U.; Jung, Jaewoo; Lai, Chok Fung
2011-01-01
Multi-variate joint PDF for non-Gaussianities: exact formulation and generic approximations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verde, Licia; Jimenez, Raul; Alvarez-Gaume, Luis
2013-06-01
We provide an exact expression for the multi-variate joint probability distribution function of non-Gaussian fields primordially arising from local transformations of a Gaussian field. This kind of non-Gaussianity is generated in many models of inflation. We apply our expression to the non-Gaussianity estimation from Cosmic Microwave Background maps and the halo mass function where we obtain analytical expressions. We also provide analytic approximations and their range of validity. For the Cosmic Microwave Background we give a fast way to compute the PDF which is valid up to more than 7σ for f_NL values (both true and sampled) not ruled out by current observations, which consists of expressing the PDF as a combination of bispectrum and trispectrum of the temperature maps. The resulting expression is valid for any kind of non-Gaussianity and is not limited to the local type. The above results may serve as the basis for a fully Bayesian analysis of the non-Gaussianity parameter.
Bayes Factor Covariance Testing in Item Response Models.
Fox, Jean-Paul; Mulder, Joris; Sinharay, Sandip
2017-12-01
Two marginal one-parameter item response theory models are introduced, by integrating out the latent variable or random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted-inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Based on that, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.
Ray, J.; Lee, J.; Yadav, V.; ...
2015-04-29
Atmospheric inversions are frequently used to estimate fluxes of atmospheric greenhouse gases (e.g., biospheric CO2 flux fields) at Earth's surface. These inversions typically assume that flux departures from a prior model are spatially smoothly varying, which are then modeled using a multi-variate Gaussian. When the field being estimated is spatially rough, multi-variate Gaussian models are difficult to construct and a wavelet-based field model may be more suitable. Unfortunately, such models are very high dimensional and are most conveniently used when the estimation method can simultaneously perform data-driven model simplification (removal of model parameters that cannot be reliably estimated) and fitting. Such sparse reconstruction methods are typically not used in atmospheric inversions. In this work, we devise a sparse reconstruction method, and illustrate it in an idealized atmospheric inversion problem for the estimation of fossil fuel CO2 (ffCO2) emissions in the lower 48 states of the USA. Our new method is based on stagewise orthogonal matching pursuit (StOMP), a method used to reconstruct compressively sensed images. Our adaptations bestow three properties to the sparse reconstruction procedure which are useful in atmospheric inversions. We have modified StOMP to incorporate prior information on the emission field being estimated and to enforce non-negativity on the estimated field. Finally, though based on wavelets, our method allows for the estimation of fields in non-rectangular geometries, e.g., emission fields inside geographical and political boundaries. Our idealized inversions use a recently developed multi-resolution (i.e., wavelet-based) random field model developed for ffCO2 emissions and synthetic observations of ffCO2 concentrations from a limited set of measurement sites. We find that our method for limiting the estimated field within an irregularly shaped region is about a factor of 10 faster than conventional approaches. It also reduces the overall computational cost by a factor of 2. Further, the sparse reconstruction scheme imposes non-negativity without introducing strong nonlinearities, such as those introduced by employing log-transformed fields, and thus reaps the benefits of simplicity and computational speed that are characteristic of linear inverse problems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ray, J.; Lee, J.; Yadav, V.
Estimating the decomposition of predictive information in multivariate systems
NASA Astrophysics Data System (ADS)
Faes, Luca; Kugiumtzis, Dimitris; Nollo, Giandomenico; Jurysta, Fabrice; Marinazzo, Daniele
2015-03-01
In the study of complex systems from observed multivariate time series, insight into the evolution of one system may be under investigation, which can be explained by the information storage of the system and the information transfer from other interacting systems. We present a framework for the model-free estimation of information storage and information transfer computed as the terms composing the predictive information about the target of a multivariate dynamical process. The approach tackles the curse of dimensionality employing a nonuniform embedding scheme that selects progressively, among the past components of the multivariate process, only those that contribute most, in terms of conditional mutual information, to the present target process. Moreover, it computes all information-theoretic quantities using a nearest-neighbor technique designed to compensate the bias due to the different dimensionality of individual entropy terms. The resulting estimators of prediction entropy, storage entropy, transfer entropy, and partial transfer entropy are tested on simulations of coupled linear stochastic and nonlinear deterministic dynamic processes, demonstrating the superiority of the proposed approach over the traditional estimators based on uniform embedding. The framework is then applied to multivariate physiologic time series, resulting in physiologically well-interpretable information decompositions of cardiovascular and cardiorespiratory interactions during head-up tilt and of joint brain-heart dynamics during sleep.
Improvements in GRACE Gravity Field Determination through Stochastic Observation Modeling
NASA Astrophysics Data System (ADS)
McCullough, C.; Bettadpur, S. V.
2016-12-01
Current unconstrained Release 05 GRACE gravity field solutions from the Center for Space Research (CSR RL05) assume random observation errors following an independent multivariate Gaussian distribution. This modeling of observations, a simplifying assumption, fails to account for long period, correlated errors arising from inadequacies in the background force models. Fully modeling the errors inherent in the observation equations, through the use of a full observation covariance (modeling colored noise), enables optimal combination of GPS and inter-satellite range-rate data and obviates the need for estimating kinematic empirical parameters during the solution process. Most importantly, fully modeling the observation errors drastically improves formal error estimates of the spherical harmonic coefficients, potentially enabling improved uncertainty quantification of scientific results derived from GRACE and optimizing combinations of GRACE with independent data sets and a priori constraints.
Big data integration for regional hydrostratigraphic mapping
NASA Astrophysics Data System (ADS)
Friedel, M. J.
2013-12-01
Numerical models provide a way to evaluate groundwater systems, but determining the hydrostratigraphic units (HSUs) used in devising these models remains subjective, nonunique, and uncertain. A novel geophysical-hydrogeologic data integration scheme is proposed to constrain the estimation of continuous HSUs. First, machine-learning and multivariate statistical techniques are used to simultaneously integrate borehole hydrogeologic (lithology, hydraulic conductivity, aqueous field parameters, dissolved constituents) and geophysical (gamma, spontaneous potential, and resistivity) measurements. Second, airborne electromagnetic measurements are numerically inverted to obtain subsurface resistivity structure at randomly selected locations. Third, the machine-learning algorithm is trained using the borehole hydrostratigraphic units and inverted airborne resistivity profiles. The trained machine-learning algorithm is then used to estimate HSUs at independent resistivity profile locations. We demonstrate efficacy of the proposed approach to map the hydrostratigraphy of a heterogeneous surficial aquifer in northwestern Nebraska.
Petersen, Nanna; Stocks, Stuart; Gernaey, Krist V
2008-05-01
The main purpose of this article is to demonstrate that principal component analysis (PCA) and partial least squares regression (PLSR) can be used to extract information from particle size distribution data and predict rheological properties. Samples from commercially relevant Aspergillus oryzae fermentations conducted in 550 L pilot scale tanks were characterized with respect to particle size distribution, biomass concentration, and rheological properties. The rheological properties were described using the Herschel-Bulkley model. Estimation of all three parameters in the Herschel-Bulkley model (yield stress (τ_y), consistency index (K), and flow behavior index (n)) resulted in a large standard deviation of the parameter estimates. The flow behavior index was not found to be correlated with any of the other measured variables and previous studies have suggested a constant value of the flow behavior index in filamentous fermentations. It was therefore chosen to fix this parameter to the average value thereby decreasing the standard deviation of the estimates of the remaining rheological parameters significantly. Using a PLSR model, a reasonable prediction of apparent viscosity (μ_app), yield stress (τ_y), and consistency index (K), could be made from the size distributions, biomass concentration, and process information. This provides a predictive method with a high predictive power for the rheology of fermentation broth, and with the advantages over previous models that τ_y and K can be predicted as well as μ_app. Validation on an independent test set yielded a root mean square error of 1.21 Pa for τ_y, 0.209 Pa·s^n for K, and 0.0288 Pa·s for μ_app, corresponding to R² = 0.95, R² = 0.94, and R² = 0.95 respectively. Copyright 2007 Wiley Periodicals, Inc.
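For readers unfamiliar with the Herschel-Bulkley model named in the abstract above, the following is a minimal sketch of its standard textbook form, in which shear stress equals yield stress plus K times shear rate raised to the power n. The function names and the numeric values are illustrative assumptions, not taken from the article.

```python
def herschel_bulkley_stress(shear_rate, tau_y, K, n):
    """Shear stress (Pa) from the Herschel-Bulkley relation.

    tau_y: yield stress (Pa), K: consistency index (Pa s^n),
    n: flow behavior index (dimensionless), shear_rate in 1/s.
    """
    return tau_y + K * shear_rate ** n


def apparent_viscosity(shear_rate, tau_y, K, n):
    """Apparent viscosity (Pa s), defined as shear stress / shear rate."""
    return herschel_bulkley_stress(shear_rate, tau_y, K, n) / shear_rate


# Hypothetical parameter values, chosen only to exercise the functions.
stress = herschel_bulkley_stress(10.0, tau_y=1.0, K=0.5, n=1.0)
viscosity = apparent_viscosity(10.0, tau_y=1.0, K=0.5, n=1.0)
```

Fixing n to a constant, as the abstract describes, leaves only the yield stress and K to be estimated from the stress data.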
Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L
2017-05-07
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
NASA Astrophysics Data System (ADS)
Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.
2017-05-01
Guo, Ying; Manatunga, Amita K
2009-03-01
Assessing agreement is often of interest in clinical studies to evaluate the similarity of measurements produced by different raters or methods on the same subjects. We present a modified weighted kappa coefficient to measure agreement between bivariate discrete survival times. The proposed kappa coefficient accommodates censoring by redistributing the mass of censored observations within the grid where the unobserved events may potentially happen. A generalized modified weighted kappa is proposed for multivariate discrete survival times. We estimate the modified kappa coefficients nonparametrically through a multivariate survival function estimator. The asymptotic properties of the kappa estimators are established and the performance of the estimators is examined through simulation studies of bivariate and trivariate survival times. We illustrate the application of the modified kappa coefficient in the presence of censored observations with data from a prostate cancer study.
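As background for the abstract above, here is a minimal sketch of the ordinary weighted kappa for fully observed ratings, using linear agreement weights. This is the standard coefficient, not the censoring-adjusted version the authors propose. The function name and the example tables are illustrative assumptions.

```python
def weighted_kappa(table):
    """Weighted kappa with linear agreement weights.

    table: k x k nested list of counts, rows = rater A, columns = rater B.
    No censoring is handled here; every pair of ratings is assumed observed.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    row_marg = [sum(row) / total for row in table]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    p_obs = 0.0  # weighted observed agreement
    p_exp = 0.0  # weighted agreement expected under independence
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)
            p_obs += weight * table[i][j] / total
            p_exp += weight * row_marg[i] * col_marg[j]
    return (p_obs - p_exp) / (1.0 - p_exp)


perfect = weighted_kappa([[5, 0, 0], [0, 5, 0], [0, 0, 5]])
chance = weighted_kappa([[1, 1], [1, 1]])
```

A diagonal table gives a kappa of one and a table with independent ratings gives zero, which are the two reference points any modified coefficient is usually read against.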
Covariate analysis of bivariate survival data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennett, L.E.
1992-01-01
The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey was analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models will be compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
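The permutation approach mentioned in the abstract above can be sketched as follows, assuming an unscaled statistic built from component-wise medians. The choice of the Euclidean norm, the function names, and the toy data are assumptions made for illustration and are not the authors' exact test statistics.

```python
import random
import statistics


def median_shift_stat(x, y):
    """Euclidean norm of the difference between component-wise medians."""
    dims = range(len(x[0]))
    diff = [
        statistics.median(row[j] for row in x)
        - statistics.median(row[j] for row in y)
        for j in dims
    ]
    return sum(d * d for d in diff) ** 0.5


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for a location shift between samples x and y."""
    rng = random.Random(seed)
    observed = median_shift_stat(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of group membership
        if median_shift_stat(pooled[:n_x], pooled[n_x:]) >= observed:
            hits += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Toy bivariate samples with a large location shift (hypothetical data).
control = [(i, i) for i in range(10)]
treated = [(i + 100, i + 100) for i in range(10)]
p_value = permutation_pvalue(control, treated)
```

Swapping the medians for Hodges-Lehmann estimators would change only `median_shift_stat`; the relabeling loop stays the same.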
NASA Astrophysics Data System (ADS)
Das Bhowmik, R.; Arumugam, S.
2015-12-01
Multivariate downscaling techniques have exhibited superiority over univariate regression schemes in preserving cross-correlations between multiple variables (precipitation and temperature) from GCMs. This study focuses on two aspects: (a) developing an analytical solution for estimating biases in cross-correlations from univariate downscaling approaches and (b) quantifying the uncertainty in land-surface states and fluxes due to biases in cross-correlations in downscaled climate forcings. Both aspects are evaluated using climate forcings available from both historical climate simulations and CMIP5 hindcasts over the entire US. The analytical solution relates the univariate regression parameters, the coefficient of determination of the regression, and the covariance ratio between GCM and downscaled values. The analytical solutions are compared with the downscaled univariate forcings by choosing the desired p-value (Type-1 error) in preserving the observed cross-correlation. To quantify the impacts of biases in cross-correlation on estimates of streamflow and groundwater, we corrupt the downscaled climate forcings with different cross-correlation structures.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China
Pei, Ling-Ling; Li, Qin
2018-01-01
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985
Estimation of saturated pixel values in digital color imaging
Zhang, Xuemei; Brainard, David H.
2007-01-01
Pixel saturation, where the incident light at a pixel causes one of the color channels of the camera sensor to respond at its maximum value, can produce undesirable artifacts in digital color images. We present a Bayesian algorithm that estimates what the saturated channel's value would have been in the absence of saturation. The algorithm uses the non-saturated responses from the other color channels, together with a multivariate Normal prior that captures the correlation in response across color channels. The appropriate parameters for the prior may be estimated directly from the image data, since most image pixels are not saturated. Given the prior, the responses of the non-saturated channels, and the fact that the true response of the saturated channel is known to be greater than the saturation level, the algorithm returns the optimal expected mean square estimate for the true response. Extensions of the algorithm to the case where more than one channel is saturated are also discussed. Both simulations and examples with real images are presented to show that the algorithm is effective. PMID:15603065
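The core calculation described in the abstract above can be sketched for one saturated channel and two unsaturated ones. This is a simplified illustration that assumes noise-free sensor responses and a known prior mean and covariance. The function names and the numeric prior are hypothetical, and the published algorithm may differ in its handling of noise.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_sf(z):
    """Upper-tail probability P(Z > z) of a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def conditional_moments(x_obs, mu, cov):
    """Mean and variance of channel 0 given channels 1 and 2 (Gaussian prior)."""
    a, b, c, d = cov[1][1], cov[1][2], cov[2][1], cov[2][2]
    det = a * d - b * c
    inv_oo = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 inverse
    cov_so = [cov[0][1], cov[0][2]]
    # Regression weights: cov_so times the inverse of the observed block.
    w = [
        cov_so[0] * inv_oo[0][0] + cov_so[1] * inv_oo[1][0],
        cov_so[0] * inv_oo[0][1] + cov_so[1] * inv_oo[1][1],
    ]
    resid = [x_obs[0] - mu[1], x_obs[1] - mu[2]]
    mean = mu[0] + w[0] * resid[0] + w[1] * resid[1]
    var = cov[0][0] - (w[0] * cov_so[0] + w[1] * cov_so[1])
    return mean, var


def estimate_saturated(x_obs, mu, cov, sat_level):
    """Posterior mean of channel 0 given it is at least sat_level."""
    mean, var = conditional_moments(x_obs, mu, cov)
    sd = math.sqrt(var)
    alpha = (sat_level - mean) / sd
    # Mean of a normal distribution truncated below at sat_level.
    return mean + sd * normal_pdf(alpha) / normal_sf(alpha)


# Hypothetical prior: strongly correlated channels with equal variance.
prior_mean = [100.0, 100.0, 100.0]
prior_cov = [[400.0, 360.0, 360.0], [360.0, 400.0, 360.0], [360.0, 360.0, 400.0]]
estimate = estimate_saturated([140.0, 140.0], prior_mean, prior_cov, sat_level=150.0)
```

Because the estimate is the mean of a distribution truncated below at the saturation level, it always lies above that level, which matches the constraint stated in the abstract.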
NASA Astrophysics Data System (ADS)
Emamgolizadeh, S.; Bateni, S. M.; Shahsavani, D.; Ashrafi, T.; Ghorbani, H.
2015-10-01
The soil cation exchange capacity (CEC) is one of the main soil chemical properties, which is required in various fields such as environmental and agricultural engineering as well as soil science. In situ measurement of CEC is time consuming and costly. Hence, numerous studies have used traditional regression-based techniques to estimate CEC from more easily measurable soil parameters (e.g., soil texture, organic matter (OM), and pH). However, these models may not be able to adequately capture the complex and highly nonlinear relationship between CEC and its influential soil variables. In this study, Genetic Expression Programming (GEP) and Multivariate Adaptive Regression Splines (MARS) were employed to estimate CEC from more readily measurable soil physical and chemical variables (e.g., OM, clay, and pH) by developing functional relations. The GEP- and MARS-based functional relations were tested at two field sites in Iran. Results showed that GEP and MARS can provide reliable estimates of CEC. Also, it was found that the MARS model (with root-mean-square-error (RMSE) of 0.318 Cmol+ kg⁻¹ and correlation coefficient (R²) of 0.864) generated slightly better results than the GEP model (with RMSE of 0.270 Cmol+ kg⁻¹ and R² of 0.807). The performance of GEP and MARS models was compared with two existing approaches, namely artificial neural network (ANN) and multiple linear regression (MLR). The comparison indicated that MARS and GEP outperformed the MLR model, but they did not perform as well as ANN. Finally, a sensitivity analysis was conducted to determine the most and the least influential variables affecting CEC. It was found that OM and pH have the most and least significant effect on CEC, respectively.
A Sandwich-Type Standard Error Estimator of SEM Models with Multivariate Time Series
ERIC Educational Resources Information Center
Zhang, Guangjian; Chow, Sy-Miin; Ong, Anthony D.
2011-01-01
Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a…
Exploring connectivity with large-scale Granger causality on resting-state functional MRI.
DSouza, Adora M; Abidin, Anas Z; Leistritz, Lutz; Wismüller, Axel
2017-08-01
Large-scale Granger causality (lsGC) is a recently developed, resting-state functional MRI (fMRI) connectivity analysis approach that estimates multivariate voxel-resolution connectivity. Unlike most commonly used multivariate approaches, which establish coarse-resolution connectivity by aggregating voxel time-series to avoid an underdetermined problem, lsGC estimates voxel-resolution, fine-grained connectivity by incorporating an embedded dimension reduction. We investigate application of lsGC on realistic fMRI simulations, modeling smoothing of neuronal activity by the hemodynamic response function and repetition time (TR), and empirical resting-state fMRI data. Subsequently, functional subnetworks are extracted from lsGC connectivity measures for both datasets and validated quantitatively. We also provide guidelines to select lsGC free parameters. Results indicate that lsGC reliably recovers underlying network structure with an area under the receiver operating characteristic curve (AUC) of 0.93 at TR=1.5s for a 10-min session of fMRI simulations. Furthermore, subnetworks of closely interacting modules are recovered from the aforementioned lsGC networks. Results on empirical resting-state fMRI data demonstrate recovery of visual and motor cortex in close agreement with spatial maps obtained from (i) visuo-motor fMRI stimulation task-sequence (Accuracy=0.76) and (ii) independent component analysis (ICA) of resting-state fMRI (Accuracy=0.86). Compared with the conventional Granger causality approach (AUC=0.75), lsGC produces better network recovery on fMRI simulations. Furthermore, the conventional approach cannot recover functional subnetworks from empirical fMRI data, since quantifying voxel-resolution connectivity is not possible as a consequence of encountering an underdetermined problem. Functional network recovery from fMRI data suggests that lsGC gives useful insight into connectivity patterns from resting-state fMRI at a multivariate voxel-resolution.
Contamination Event Detection with Multivariate Time-Series Data in Agricultural Water Monitoring
Mao, Yingchi; Qi, Hai; Ping, Ping; Li, Xiaofang
2017-01-01
Time series data of multiple water quality parameters are obtained from the water sensor networks deployed in the agricultural water supply network. The accurate and efficient detection and warning of contamination events to prevent pollution from spreading is one of the most important issues when pollution occurs. In order to comprehensively reduce the event detection deviation, a spatial–temporal-based event detection approach with multivariate time-series data for water quality monitoring (M-STED) was proposed. The M-STED approach includes three parts. In the first part, M-STED adopts a Rule K algorithm to select backbone nodes as the nodes in the connected dominating set (CDS), which forward the sensed data of multiple water parameters. The second part determines the state of each backbone node with back propagation neural network models and sequential Bayesian analysis at the current timestamp. The third part establishes a spatial model with Bayesian networks to estimate the state of the backbone nodes at the next timestamp and traces the “outlier” node to its neighborhood to detect a contamination event. The experimental results indicate that the average detection rate is more than 80% with M-STED and the false detection rate is lower than 9%. The M-STED approach can improve the rate of detection by about 40% and reduce the false alarm rate by about 45%, compared with S-STED, the event detection algorithm that uses a single water parameter. Moreover, the proposed M-STED exhibits better performance in terms of detection delay and scalability. PMID:29207535
Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F
2015-01-01
Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.
Testing the causality of Hawkes processes with time reversal
NASA Astrophysics Data System (ADS)
Cordi, Marcus; Challet, Damien; Muni Toke, Ioane
2018-03-01
We show that univariate and symmetric multivariate Hawkes processes are only weakly causal: the true log-likelihoods of real and reversed event time vectors are almost equal, thus parameter estimation via maximum likelihood only weakly depends on the direction of the arrow of time. In ideal (synthetic) conditions, tests of goodness of parametric fit unambiguously reject backward event times, which implies that inferring kernels from time-symmetric quantities, such as the autocovariance of the event rate, only rarely produces statistically significant fits. Finally, we find that fitting financial data with many-parameter kernels may yield significant fits for both arrows of time for the same event time vector, sometimes favouring the backward time direction. This goes to show that a significant fit of Hawkes processes to real data with flexible kernels does not imply a definite arrow of time unless one tests it.
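The comparison of forward and reversed log-likelihoods can be sketched for the univariate case as follows. The exponential kernel, the parameter values, and the event times are assumptions made for illustration; the abstract does not fix a kernel, and this is not the authors' code.

```python
import math

def hawkes_loglik(times, horizon, mu, alpha, beta):
    """Log-likelihood of a univariate Hawkes process on [0, horizon].

    Assumes an exponential kernel alpha * exp(-beta * dt); this kernel
    choice is illustrative only.
    """
    loglik = 0.0
    excitation = 0.0  # summed kernel contributions from earlier events
    previous = None
    for t in times:
        if previous is not None:
            # Add the previous event's jump, then decay up to the current time.
            excitation = (excitation + alpha) * math.exp(-beta * (t - previous))
        loglik += math.log(mu + excitation)
        previous = t
    # Subtract the integral of the intensity over [0, horizon].
    compensator = mu * horizon + sum(
        alpha / beta * (1.0 - math.exp(-beta * (horizon - t))) for t in times
    )
    return loglik - compensator

def reverse_times(times, horizon):
    """Map each event time t to horizon - t and re-sort (time reversal)."""
    return sorted(horizon - t for t in times)

events = [0.5, 0.7, 0.8, 3.0, 3.1, 6.5]  # made-up event times
forward = hawkes_loglik(events, 10.0, mu=0.3, alpha=0.8, beta=1.5)
backward = hawkes_loglik(reverse_times(events, 10.0), 10.0, mu=0.3, alpha=0.8, beta=1.5)
```

Comparing `forward` and `backward` at fixed parameters is only the first step; a fuller test would refit the parameters on each direction and then apply a goodness-of-fit test, as the abstract describes.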
Direct adaptive control of manipulators in Cartesian space
NASA Technical Reports Server (NTRS)
Seraji, H.
1987-01-01
A new adaptive-control scheme for direct control of manipulator end effector to achieve trajectory tracking in Cartesian space is developed in this article. The control structure is obtained from linear multivariable theory and is composed of simple feedforward and feedback controllers and an auxiliary input. The direct adaptation laws are derived from model reference adaptive control theory and are not based on parameter estimation of the robot model. The utilization of adaptive feedforward control and the inclusion of auxiliary input are novel features of the present scheme and result in improved dynamic performance over existing adaptive control schemes. The adaptive controller does not require the complex mathematical model of the robot dynamics or any knowledge of the robot parameters or the payload, and is computationally fast for on-line implementation with high sampling rates. The control scheme is applied to a two-link manipulator for illustration.
Schuster, Alexander Karl-Georg; Fischer, Joachim Ernst; Vossmerbaeumer, Christine; Vossmerbaeumer, Urs
2016-10-01
Optical coherence tomography (OCT) allows quantitative image analysis of retinal tissue in vivo. Peripapillary retinal nerve fiber layer (pRNFL) thickness is widely used for evaluation of retinal nerve fiber rarefaction in several optic neuropathies. This study evaluates associations of pRNFL thickness in healthy adult subjects in order to identify influencing factors. A cross-sectional study was performed in a working-age population. Only eyes without detectable ocular pathologies were included in the analysis. Among analyzed systemic cardiovascular parameters were age, gender, body-mass index, mean arterial blood pressure, HbA1c, high- and low-density-lipoproteins, and triglycerides. A comprehensive ophthalmological examination including refraction, tonometry, keratometry, and central corneal thickness measurement was performed. In addition, pRNFL thickness was imaged by spectral-domain OCT. Univariable and multivariable associations of pRNFL thickness in all four quadrants and on average with systemic and ocular parameters were calculated using a generalized estimating equation model. Three hundred and six subjects were included. pRNFL thickness measurements showed a significant association with spherical equivalent: pRNFL thickness decreased with increasing myopia in all quadrants (multivariable regression coefficients Beta: superior: 1.16, 95 % CI [0.62;1.71], p < 0.001; temporal: 0.87, [0.33;1.41], p = 0.001; inferior: 1.80, [1.18;2.42], p < 0.001; nasal: 2.60, [2.01;3.20], p < 0.001) and on average (1.51, [1.20;1.82], p < 0.001). A thicker central cornea was related to lower pRNFL thickness in the superior quadrant (-0.05, [-0.10; -0.01], p = 0.01), in the inferior quadrant (-0.05, [-0.10;0.00], p = 0.03) and on average (-0.04, [-0.07; -0.01], p = 0.02). No other parameters were associated. Our findings highlight the importance of refraction when evaluating pRNFL thickness and its independence from other systemic parameters.
Impact of brown adipose tissue on body fatness and glucose metabolism in healthy humans.
Matsushita, M; Yoneshiro, T; Aita, S; Kameya, T; Sugie, H; Saito, M
2014-06-01
Brown adipose tissue (BAT) is involved in the regulation of whole-body energy expenditure and adiposity. Some clinical studies have reported an association between BAT and blood glucose in humans. To examine the impact of BAT on glucose metabolism, independent of that of body fatness, age and sex in healthy adult humans. Two hundred and sixty healthy volunteers (184 males and 76 females, 20-72 years old) underwent fluorodeoxyglucose-positron emission tomography and computed tomography after 2 h of cold exposure to assess maximal BAT activity. Blood parameters including glucose, HbA1c and low-density lipoprotein (LDL)/high-density lipoprotein-cholesterol were measured by conventional methods, and body fatness was estimated from body mass index (BMI), body fat mass and abdominal fat area. The impact of BAT on body fatness and blood parameters was determined by logistic regression with the use of univariate and multivariate models. Cold-activated BAT was detected in 125 (48%) out of 260 subjects. When compared with subjects without detectable BAT, those with detectable BAT were younger and showed lower adiposity-related parameters such as the BMI, body fat mass and abdominal fat area. Although blood parameters were within the normal range in the two subject groups, HbA1c, total cholesterol and LDL-cholesterol were significantly lower in the BAT-positive group. Blood glucose also tended to be lower in the BAT-positive group. Logistic regression demonstrated that BAT, in addition to age and sex, was independently associated with BMI, body fat mass, and abdominal visceral and subcutaneous fat areas. For blood parameters, multivariate analysis after adjustment for age, sex and body fatness revealed that BAT was a significantly independent determinant of glucose and HbA1c. BAT, independent of age, sex and body fatness, has a significant impact on glucose metabolism in adult healthy humans.
Towards reliable ET estimates in the semi-arid Júcar region in Spain.
NASA Astrophysics Data System (ADS)
Brenner, Johannes; Zink, Matthias; Schrön, Martin; Thober, Stephan; Rakovec, Oldrich; Cuntz, Matthias; Merz, Ralf; Samaniego, Luis
2017-04-01
Current research indicated the potential for improving evapotranspiration (ET) estimates in state-of-the-art hydrologic models such as the mesoscale Hydrological Model (mHM, www.ufz.de/mhm). Most models exhibit deficiencies in estimating the ET flux in semi-arid regions. Possible reasons for poor performance may be related to the low resolution of the forcings, the estimation of the PET, which is in most cases based on temperature only, the joint estimation of the transpiration and evaporation through the Feddes equation, and poor process parameterizations, among others. In this study, we aim at sequential hypothesis-based experiments to uncover the main reasons for these deficiencies at the Júcar basin in Spain. We plan the following experiments: 1) Use the high resolution meteorological forcing (P and T) provided by local authorities to estimate its effects on ET and streamflow. 2) Use local ET measurements at seven eddy covariance stations to estimate evaporation related parameters. 3) Test the influence of the PET formulations (Hargreaves-Samani, Priestley-Taylor, Penman-Monteith). 4) Estimate evaporation and transpiration separately based on equations proposed by Bohn and Vivoni (2016). 5) Incorporate local soil moisture measurements to re-estimate ET and soil moisture related parameters. We set up mHM for seven eddy-covariance sites at the local scale (100 × 100 m2). This resolution was chosen because it is representative of the footprint of the latent heat estimation at the eddy-covariance station. In the second experiment, for example, a parameter set is to be found as a compromise solution between ET measured at local stations and the streamflow observations at eight sub-basins of the Júcar river. Preliminary results indicate that higher model performance regarding streamflow can be achieved using local high-resolution meteorology. ET performance is, however, still deficient. On the contrary, using ET site calibrations alone increases performance in ET but yields poor performance in streamflow. Results suggest the need for multi-variable, simultaneous calibration schemes to reliably estimate ET and streamflow in the Júcar basin. Penman-Monteith appears to be the best performing PET formulation. Experiments 4 and 5 should reveal the benefits of separating evaporation from bare soil and transpiration in semi-arid regions using mHM. Further research in this direction is foreseen by incorporating neutron counts from Cosmic Ray Neutron Sensing technology in the calibration/validation procedure of mHM.
Choi, D J; Park, H
2001-11-01
For control and automation of biological treatment processes, lack of reliable on-line sensors to measure water quality parameters is one of the most important problems to overcome. Many parameters cannot be measured directly with on-line sensors. The accuracy of existing hardware sensors is also not sufficient and maintenance problems such as electrode fouling often cause trouble. This paper deals with the development of software sensor techniques that estimate the target water quality parameter from other parameters using the correlation between water quality parameters. We focus our attention on the preprocessing of noisy data and the selection of the best model feasible to the situation. Problems of existing approaches are also discussed. We propose a hybrid neural network as a software sensor inferring wastewater quality parameter. Multivariate regression, artificial neural networks (ANN), and a hybrid technique that combines principal component analysis as a preprocessing stage are applied to data from industrial wastewater processes. The hybrid ANN technique shows an enhancement of prediction capability and reduces the overfitting problem of neural networks. The result shows that the hybrid ANN technique can be used to extract information from noisy data and to describe the nonlinearity of complex wastewater treatment processes.
NASA Astrophysics Data System (ADS)
Jennings, E.; Madigan, M.
2017-04-01
Given the complexity of modern cosmological parameter inference where we are faced with non-Gaussian data and noise, correlated systematics and multi-probe correlated datasets, the Approximate Bayesian Computation (ABC) method is a promising alternative to traditional Markov Chain Monte Carlo approaches in the case where the Likelihood is intractable or unknown. The ABC method is called "Likelihood free" as it avoids explicit evaluation of the Likelihood by using a forward model simulation of the data which can include systematics. We introduce astroABC, an open source ABC Sequential Monte Carlo (SMC) sampler for parameter estimation. A key challenge in astrophysics is the efficient use of large multi-probe datasets to constrain high dimensional, possibly correlated parameter spaces. With this in mind astroABC allows for massive parallelization using MPI, a framework that handles spawning of processes across multiple nodes. A key new feature of astroABC is the ability to create MPI groups with different communicators, one for the sampler and several others for the forward model simulation, which speeds up sampling time considerably. For smaller jobs the Python multiprocessing option is also available. Other key features of this new sampler include: a Sequential Monte Carlo sampler; a method for iteratively adapting tolerance levels; local covariance estimate using scikit-learn's KDTree; modules for specifying optimal covariance matrix for a component-wise or multivariate normal perturbation kernel and a weighted covariance metric; restart files output frequently so an interrupted sampling run can be resumed at any iteration; output and restart files are backed up at every iteration; user defined distance metric and simulation methods; a module for specifying heterogeneous parameter priors including non-standard prior PDFs; a module for specifying a constant, linear, log or exponential tolerance level; well-documented examples and sample scripts. This code is hosted online at https://github.com/EliseJ/astroABC.
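To make the "Likelihood free" idea concrete, here is a toy rejection sampler, the simplest ABC variant. It is not the SMC sampler that astroABC implements and it does not use the astroABC API; the prior, forward model, distance metric, and all names are assumptions chosen for illustration.

```python
import random

def abc_rejection(observed_mean, n_obs, n_proposals, tolerance, seed=0):
    """Likelihood-free posterior samples for the mean of a unit-variance Normal.

    Toy setup (all assumptions): uniform prior on [-5, 5], a forward model that
    draws n_obs Normal(theta, 1) values, distance = |difference of sample means|.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(-5.0, 5.0)  # draw a parameter from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # forward model
        distance = abs(sum(simulated) / n_obs - observed_mean)
        if distance < tolerance:  # keep parameters whose simulation is close
            accepted.append(theta)
    return accepted

samples = abc_rejection(observed_mean=2.0, n_obs=50, n_proposals=2000, tolerance=0.2)
posterior_mean = sum(samples) / len(samples)
```

No likelihood is evaluated anywhere: only simulations are compared with the data. An SMC scheme improves on this by shrinking the tolerance over iterations and proposing new parameters from perturbed, weighted particles rather than from the prior.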
ERIC Educational Resources Information Center
Grochowalski, Joseph H.
2015-01-01
Component Universe Score Profile analysis (CUSP) is introduced in this paper as a psychometric alternative to multivariate profile analysis. The theoretical foundations of CUSP analysis are reviewed, which include multivariate generalizability theory and constrained principal components analysis. Because CUSP is a combination of generalizability…
NASA Astrophysics Data System (ADS)
Khanlari, G. R.; Heidari, M.; Noori, M.; Momeni, A.
2016-07-01
To assess the relationship between engineering characteristics and petrographic features, conglomerate samples from the Qom formation in the Famenin region, northeast of Hamedan province, were studied. Samples were tested in the laboratory to determine the uniaxial compressive strength, point load strength index, modulus of elasticity, porosity, and dry and saturated densities. To determine petrographic features (textural and mineralogical parameters), thin sections of the samples were prepared and studied. The results show that the effect of textural characteristics on the engineering properties of conglomerates appears to be more important than that of mineralogical composition. It was also concluded that packing proximity, packing density, grain shape, mean grain size, and cement and matrix frequency are textural features that have a significant effect on the physical and mechanical properties of the studied conglomerates. In this study, predictive statistical relationships were developed to estimate the physical and mechanical properties of the rocks based on the results of petrographic features. Furthermore, multivariate linear regression was used in four different steps comprising various combinations of petrographical characteristics for each engineering parameter. Finally, the best equations with specific arrangement were suggested to estimate engineering properties of the Qom formation conglomerates.
Sekulic, Tatjana Djakovic; Keleman, Svetlana; Tot, Kristina; Tot, Jadranka; Trisovic, Nemanja; Uscumlic, Gordana
2016-01-01
Newly synthesized compounds, particularly those with biological activity, are potential drug candidates. This article describes experimental studies performed to estimate lipophilicity parameters of new 3-(4-substituted benzyl)-5-phenylhydantoins. Lipophilicity, as one of the most important molecular characteristics for activity, was determined using reversed-phase liquid chromatography (RP-18 stationary phase and methanol-water mobile phase). Molecular structures were used to generate in silico data, which were used to estimate pharmacokinetic properties of the investigated compounds. The results show that, in general, the investigated compounds attain good bioavailability properties. A more detailed analysis indicates that the presence of a nitro, methoxy or tert-butyl group in the molecule is unfavorable for the oral bioavailability of hydantoins. Multivariate exploratory analysis was used in order to visualize grouping patterns among molecular descriptors as well as among the investigated compounds. A molecular docking study performed for the two hydantoins with the highest bioavailability scores shows high binding affinity to tyrosine kinase receptor IGF-1R. The results achieved can be useful as a template for future development and further derivation or modification to obtain more potent and selective antitumor agents.
Kernel canonical-correlation Granger causality for multiple time series
NASA Astrophysics Data System (ADS)
Wu, Guorong; Duan, Xujun; Liao, Wei; Gao, Qing; Chen, Huafu
2011-04-01
Canonical-correlation analysis as a multivariate statistical technique has been applied to multivariate Granger causality analysis to infer information flow in complex systems. It shows unique appeal and great superiority over the traditional vector autoregressive method, due to the simplified procedure that detects causal interaction between multiple time series, and the avoidance of potential model estimation problems. However, it is limited to the linear case. Here, we extend the framework of canonical correlation to include the estimation of multivariate nonlinear Granger causality for drawing inference about directed interaction. Its feasibility and effectiveness are verified on simulated data.
Mikaeli, S; Thorsén, G; Karlberg, B
2001-01-12
A novel approach to multivariate evaluation of separation electrolytes for micellar electrokinetic chromatography is presented. An initial screening of the experimental parameters is performed using a Plackett-Burman design. Significant parameters are further evaluated using full factorial designs. The total resolution of the separation is calculated and used as response. The proposed scheme has been applied to the optimisation of the separation of phenols and the chiral separation of (+)-1-(9-anthryl)-2-propyl chloroformate-derivatized amino acids. A total of eight experimental parameters were evaluated and optimal conditions found in less than 48 experiments.
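As an illustration of the screening step, the sketch below builds the classic 12-run Plackett-Burman design from its cyclic generator row and assigns eight factors to its first eight columns. The abstract does not say which run size was used, so the 12-run choice and the factor labels are assumptions.

```python
def plackett_burman_12():
    """Return the 12-run, 11-column Plackett-Burman design as rows of +1/-1."""
    generator = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
    # Eleven cyclic shifts of the generator row, then one row of all -1.
    rows = [generator[-shift:] + generator[:-shift] for shift in range(11)]
    rows.append([-1] * 11)
    return rows

def screening_design(factor_names):
    """Assign each factor to one column; unused columns are simply dropped."""
    design = plackett_burman_12()
    return [dict(zip(factor_names, row)) for row in design]

# Eight hypothetical factor labels standing in for the electrolyte parameters.
factors = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8"]
runs = screening_design(factors)
```

Each run sets every factor to its low (-1) or high (+1) level, and the columns are mutually orthogonal, so main effects can be estimated independently. Factors found significant in such a screen would then go into a full factorial design, as the abstract describes.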
Multivariate spatial models of excess crash frequency at area level: case of Costa Rica.
Aguero-Valverde, Jonathan
2013-10-01
Recently, areal models of crash frequency have been used in the analysis of various area-wide factors affecting road crashes. On the other hand, disease mapping methods are commonly used in epidemiology to assess the relative risk of the population at different spatial units. A natural next step is to combine these two approaches to estimate the excess crash frequency at area level as a measure of absolute crash risk. Furthermore, multivariate spatial models of crash severity are explored in order to account for both frequency and severity of crashes and control for the spatial correlation frequently found in crash data. This paper aims to extend the concept of safety performance functions to be used in areal models of crash frequency. A multivariate spatial model is used for that purpose and compared to its univariate counterpart. A full Bayes hierarchical approach is used to estimate the models of crash frequency at canton level for Costa Rica. An intrinsic multivariate conditional autoregressive model is used for modeling spatial random effects. The results show that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure Deviance Information Criterion. Additionally, the effects of the spatial smoothing due to the multivariate spatial random effects are evident in the estimation of excess equivalent property damage only crashes.
Empirical Bayes approach to the estimation of "unsafety": the multivariate regression method.
Hauer, E
1992-10-01
There are two kinds of clues to the unsafety of an entity: its traits (such as traffic, geometry, age, or gender) and its historical accident record. The Empirical Bayes approach to unsafety estimation makes use of both kinds of clues. It requires information about the mean and the variance of the unsafety in a "reference population" of similar entities. The method now in use for this purpose suffers from several shortcomings. First, a very large reference population is required. Second, the choice of reference population is to some extent arbitrary. Third, entities in the reference population usually cannot match the traits of the entity the unsafety of which is estimated. To alleviate these shortcomings the multivariate regression method for estimating the mean and variance of unsafety in reference populations is offered. Its logical foundations are described and its soundness is demonstrated. The use of the multivariate method makes the Empirical Bayes approach to unsafety estimation applicable to a wider range of circumstances and yields better estimates of unsafety. The application of the method to the tasks of identifying deviant entities and of estimating the effect of interventions on unsafety are discussed and illustrated by numerical examples.
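The way the two kinds of clues are combined can be sketched with the standard Poisson-gamma shrinkage formula. This textbook form is assumed here rather than quoted from the abstract, and the function name and numbers are illustrative. Here `ref_mean` and `ref_variance` are the mean and variance of unsafety in the reference population, the two quantities the abstract says the method requires.

```python
def empirical_bayes_estimate(observed_count, ref_mean, ref_variance):
    """Combine an entity's accident count with its reference-population moments.

    Uses the standard Poisson-gamma shrinkage weight; this is a textbook form
    and is assumed here, not taken from the abstract.
    """
    weight = ref_mean / (ref_mean + ref_variance)  # weight on the population mean
    return weight * ref_mean + (1.0 - weight) * observed_count

# Illustrative numbers only: 7 recorded accidents, reference mean 3, variance 2.
estimate = empirical_bayes_estimate(observed_count=7, ref_mean=3.0, ref_variance=2.0)
```

With these numbers the weight is 0.6, so the estimate is pulled from the raw count of 7 towards the reference mean of 3. The smaller the variance of unsafety within the reference population, the more the estimate leans on that population; in the multivariate regression method, the mean and variance would come from a fitted regression on the entity's traits instead of from a hand-picked reference group.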
Is levator hiatus distension associated with peripheral ligamentous laxity during pregnancy?
Gachon, Bertrand; Fritel, Xavier; Fradet, Laetitia; Decatoire, Arnaud; Lacouture, Patrick; Panjo, Henri; Pierre, Fabrice; Desseauve, David
2017-08-01
The impact of pregnancy on pelvic floor disorders remains poorly understood. During pregnancy, an increase in ligamentous laxity and pelvic organ mobility is often reported. Our main objective was to investigate a possible association between peripheral ligamentous laxity and levator hiatus (LH) distension during pregnancy. This was a prospective longitudinal study of 26 pregnant women followed up from the first to the third trimester. We collected the following information: occurrence of pelvic organ prolapse (POP) symptoms (score higher than 0 for the POP section of the Pelvic Floor Distress Inventory 20 questions score), 4D perineal ultrasound scan results with LH distension assessment and measurement of metacarpophalangeal joint mobility (MCP laxity). The association between MCP laxity and LH distension was estimated by mixed multilevel linear regression. The associations between MCP laxity and categorical parameters were estimated in a multivariate analysis using a generalized estimating equation model. MCP laxity and LH distension were correlated with a correlation coefficient of 0.26 (p = 0.02), and 6.8% of the LH distension variance was explained by MCP laxity. In the multivariate analysis, MCP laxity was associated with POP symptoms with an odds ratio at 1.05 (95% CI 1.01-1.11) for an increase of 1° in MCP laxity. LH distension and peripheral ligamentous laxity are significantly associated during pregnancy. However, the relationship is weak, and the results need to be confirmed in larger populations and with more specific techniques such as elastography to directly assess the elastic properties of the pelvic floor muscles.
Achana, Felix A; Cooper, Nicola J; Bujkiewicz, Sylwia; Hubbard, Stephanie J; Kendrick, Denise; Jones, David R; Sutton, Alex J
2014-07-21
Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure. The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately.
Differentiation of benign and malignant ampullary obstruction by multi-row detector CT.
Angthong, Wirana; Jiarakoop, Kran; Tangtiang, Kaan
2018-05-21
To determine useful CT parameters to differentiate ampullary carcinomas from benign ampullary obstruction. This study included 93 patients who underwent abdominal CT: 31 patients with ampullary carcinomas and 62 patients with benign ampullary obstruction. Two radiologists independently evaluated CT parameters and then reached consensus decisions. Statistically significant CT parameters were identified through univariate and multivariate analyses. In univariate analysis, the presence of an ampullary mass, asymmetric, abrupt narrowing of the distal common bile duct (CBD), dilated intrahepatic bile duct (IHD), dilated pancreatic duct (PD), peripancreatic lymphadenopathy, duodenal wall thickening, and delayed enhancement were observed more frequently in ampullary carcinomas (P < 0.05). In multivariate logistic regression analysis using the significant CT parameters and clinical data from univariate analysis, the clinical symptom of jaundice (P = 0.005) was an independent predictor of ampullary carcinomas. In multivariate analysis using only the significant CT parameters, abrupt narrowing of the distal CBD was an independent predictor of ampullary carcinomas (P = 0.019). Among various CT criteria, abrupt narrowing of the distal CBD and dilated IHD had the highest sensitivity (77.4%) and highest accuracy (90.3%). Abrupt narrowing of the distal CBD and dilated IHD are useful for differentiating ampullary carcinomas from benign entities in patients without a visible mass.
Shah, Anoop D.; Bartlett, Jonathan W.; Carpenter, James; Nicholas, Owen; Hemingway, Harry
2014-01-01
Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The “true” imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001–2010) with complete data on all covariates. Variables were artificially made “missing at random,” and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data. PMID:24589914
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses
Jackson, Dan; White, Ian R; Riley, Richard D
2012-01-01
Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I² statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R² statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I², which we call I²_R. We also provide a multivariate H² statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I² statistic, I²_H. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
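For orientation, the univariate quantities that this abstract generalises can be sketched in a few lines. The following is a minimal illustration of the standard univariate Cochran's Q, H² and I² only, not the multivariate statistics proposed in the paper; the function name and the example effect sizes and variances are made up for illustration.

```python
def heterogeneity_stats(effects, variances):
    """Univariate Cochran's Q, H^2 = Q/df and I^2 = (Q - df)/Q (floored at 0)."""
    # Inverse-variance weights, as in a fixed-effect meta-analysis.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    h2 = q / df
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, h2, i2


# Hypothetical study estimates and within-study variances (illustration only).
q, h2, i2 = heterogeneity_stats([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"Q = {q:.2f}, H^2 = {h2:.2f}, I^2 = {i2:.2f}")
```

With three equally precise studies, the pooled estimate is the plain mean, so Q is easy to check by hand.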
NASA Technical Reports Server (NTRS)
Belcastro, Christine M.
1998-01-01
Robust control system analysis and design is based on an uncertainty description, called a linear fractional transformation (LFT), which separates the uncertain (or varying) part of the system from the nominal system. These models are also useful in the design of gain-scheduled control systems based on Linear Parameter Varying (LPV) methods. Low-order LFT models are difficult to form for problems involving nonlinear parameter variations. This paper presents a numerical computational method for constructing an LFT model for a given LPV model. The method is developed for multivariate polynomial problems, and uses simple matrix computations to obtain an exact low-order LFT representation of the given LPV system without the use of model reduction. Although the method is developed for multivariate polynomial problems, multivariate rational problems can also be solved using this method by reformulating the rational problem into a polynomial form.
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Dudásová, Dorota; Rune Flåten, Geir; Sjöblom, Johan; Øye, Gisle
2009-09-15
The transmission profiles of one- to three-component particle suspension mixtures were analyzed by multivariate methods such as principal component analysis (PCA) and partial least-squares regression (PLS). The particles mimic the solids present in oil-field-produced water. Kaolin and silica represent solids of reservoir origin, whereas FeS is the product of bacterial metabolic activities, and Fe3O4 is a corrosion product (e.g., from pipelines). All particles were coated with crude oil surface active components to imitate particles in real systems. The effects of different variables (concentration, temperature, and coating) on the suspension stability were studied with a Turbiscan LAb Expert. The transmission profiles over 75 min represent the overall water quality, while the transmission during the first 15.5 min gives information on suspension behavior during a time period representative of the hold time in the separator. The behavior of the mixed particle suspensions was compared to that of the single particle suspensions and models describing the systems were built. The findings are summarized as follows: silica seems to dominate the mixture properties in the binary suspensions toward enhanced separation. For 75 min, temperature and concentration are the most significant, while for 15.5 min, concentration is the only significant variable. Models for prediction of transmission spectra from run parameters as well as particle type from transmission profiles (inverse calibration) give a reasonable description of the relationships. In ternary particle mixtures, silica is not dominant and for 75 min, the significant variables for the mixture (temperature and coating) are more similar to those for single kaolin and FeS/Fe3O4. On the other hand, for 15.5 min, the coating is the most significant and this is similar to the result for silica (at 15.5 min). The model for prediction of transmission spectra from run parameters gives good estimates of the transmission profiles. 
Although the model for prediction of particle type from transmission parameters is able to predict some particles, further improvement is required before all particles are consistently correctly classified. Cross-validation was done for both models and estimation errors are reported.
Brito Lopes, Fernando; da Silva, Marcelo Corrêa; Magnabosco, Cláudio Ulhôa; Goncalves Narciso, Marcelo; Sainz, Roberto Daniel
2016-01-01
This research evaluated a multivariate approach as an alternative tool for the purpose of selection regarding expected progeny differences (EPDs). Data were fitted using a multi-trait model and consisted of growth traits (birth weight and weights at 120, 210, 365 and 450 days of age) and carcass traits (longissimus muscle area (LMA), back-fat thickness (BF), and rump fat thickness (RF)), registered over 21 years in extensive breeding systems of Polled Nellore cattle in Brazil. Multivariate analyses were performed using standardized (zero mean and unit variance) EPDs. The k-means method revealed that the best fit of data occurred using three clusters (k = 3) (P < 0.001). Estimates of genetic correlation among growth and carcass traits and the estimates of heritability were moderate to high, suggesting that a correlated response approach is suitable for practical decision making. Estimates of correlation between selection indices and the multivariate index (LD1) were moderate to high, ranging from 0.48 to 0.97. This reveals that both types of indices give similar results and that the multivariate approach is reliable for the purpose of selection. The alternative tool seems very handy when economic weights are not available or in cases where more rapid identification of the best animals is desired. Interestingly, multivariate analysis allowed forecasting information based on the relationships among breeding values (EPDs). Also, it enabled fine discrimination, rapid data summarization after genetic evaluation, and permitted accounting for maternal ability and the genetic direct potential of the animals. In addition, we recommend the use of longissimus muscle area and subcutaneous fat thickness as selection criteria, to allow estimation of breeding values before the first mating season in order to accelerate the response to individual selection. PMID:26789008
Probabilistic Modeling of the Renal Stone Formation Module
NASA Technical Reports Server (NTRS)
Best, Lauren M.; Myers, Jerry G.; Goodenow, Debra A.; McRae, Michael P.; Jackson, Travis C.
2013-01-01
The Integrated Medical Model (IMM) is a probabilistic tool used in mission planning decision making and medical systems risk assessments. The IMM project maintains a database of over 80 medical conditions that could occur during a spaceflight, documenting an incidence rate and end case scenarios for each. In some cases, where observational data are insufficient to adequately define the inflight medical risk, the IMM utilizes external probabilistic modules to model and estimate the event likelihoods. One such medical event of interest is an unpassed renal stone. Due to a high-salt diet and high concentrations of calcium in the blood (due to bone depletion caused by unloading in the microgravity environment), astronauts are at a considerably elevated risk for developing renal calculi (nephrolithiasis) while in space. Lack of observed incidences of nephrolithiasis has led HRP to initiate the development of the Renal Stone Formation Module (RSFM) to create a probabilistic simulator capable of estimating the likelihood of symptomatic renal stone presentation in astronauts on exploration missions. The model consists of two major parts. The first is the probabilistic component, which utilizes probability distributions to assess the range of urine electrolyte parameters and a multivariate regression to transform estimated crystal density and size distributions to the likelihood of the presentation of nephrolithiasis symptoms. The second is a deterministic physical and chemical model of renal stone growth in the kidney developed by Kassemi et al. The probabilistic component of the renal stone model couples the input probability distributions describing the urine chemistry, astronaut physiology, and system parameters with the physical and chemical outputs and inputs to the deterministic stone growth model. These two parts of the model are necessary to capture the uncertainty in the likelihood estimate. 
The model will be driven by Monte Carlo simulations, continuously randomly sampling the probability distributions of the electrolyte concentrations and system parameters that are inputs into the deterministic model. The total urine chemistry concentrations are used to determine the urine chemistry activity using the Joint Expert Speciation System (JESS), a biochemistry model. Information used from JESS is then fed into the deterministic growth model. Outputs from JESS and the deterministic model are passed back to the probabilistic model where a multivariate regression is used to assess the likelihood of a stone forming and the likelihood of a stone requiring clinical intervention. The parameters used to quantify these risks include: relative supersaturation (RS) of calcium oxalate, citrate/calcium ratio, crystal number density, total urine volume, pH, magnesium excretion, maximum stone width, and ureteral location. Methods and Validation: The RSFM is designed to perform a Monte Carlo simulation to generate probability distributions of clinically significant renal stones, as well as provide an associated uncertainty in the estimate. Initially, early versions will be used to test integration of the components and assess component validation and verification (V&V), with later versions used to address questions regarding design reference mission scenarios. Once integrated with the deterministic component, the credibility assessment of the integrated model will follow NASA STD 7009 requirements.
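The general pattern described in this abstract (sample the input distributions, run a deterministic model on each draw, map the output to a likelihood, then summarise the spread) can be sketched as follows. This is a toy illustration only: the growth function, the logistic link, the distributions, and every numeric constant are hypothetical placeholders, not the RSFM, JESS, or the Kassemi et al. model.

```python
import math
import random


def deterministic_growth(supersaturation, urine_volume_l):
    # Hypothetical stand-in for a deterministic stone growth model:
    # growth only occurs above a relative supersaturation of 1.
    return max(0.0, supersaturation - 1.0) / urine_volume_l


def symptom_probability(stone_size):
    # Hypothetical logistic link standing in for the regression step.
    return 1.0 / (1.0 + math.exp(-(stone_size - 2.0)))


def simulate(n_trials, seed=0):
    """Monte Carlo loop: returns (mean, 2.5th percentile, 97.5th percentile)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    probs = []
    for _ in range(n_trials):
        rs = rng.lognormvariate(1.0, 0.5)   # assumed input distribution
        volume = rng.uniform(1.0, 3.0)      # assumed input distribution
        probs.append(symptom_probability(deterministic_growth(rs, volume)))
    probs.sort()
    mean = sum(probs) / n_trials
    lower = probs[int(0.025 * (n_trials - 1))]
    upper = probs[int(0.975 * (n_trials - 1))]
    return mean, lower, upper


mean, lower, upper = simulate(10_000)
print(f"mean = {mean:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

Reporting an interval alongside the mean mirrors the abstract's point that the simulation should provide an associated uncertainty in the estimate.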
Estimating an Effect Size in One-Way Multivariate Analysis of Variance (MANOVA)
ERIC Educational Resources Information Center
Steyn, H. S., Jr.; Ellis, S. M.
2009-01-01
When two or more univariate population means are compared, the proportion of variation in the dependent variable accounted for by population group membership is eta-squared. This effect size can be generalized by using multivariate measures of association, based on the multivariate analysis of variance (MANOVA) statistics, to establish whether…
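As a concrete reference point for the univariate case mentioned above, sample eta-squared is the between-group sum of squares divided by the total sum of squares. The sketch below computes only this univariate sample estimate; it does not implement the multivariate generalisation the abstract goes on to discuss, and the function name and data are illustrative.

```python
def eta_squared(groups):
    """Sample eta-squared: SS_between / SS_total for a one-way layout."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Two made-up groups of three observations each.
example = eta_squared([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(f"eta-squared = {example:.4f}")
```

Here the group means are 2 and 5 around a grand mean of 3.5, so SS_between is 13.5 and SS_total is 17.5.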
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2005-01-01
A parametric cost model for ground-based telescopes is developed using multivariable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction-limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature are examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived.
NASA Astrophysics Data System (ADS)
Gilmanov, T. G.; Wylie, B. K.; Gu, Y.; Howard, D. M.; Zhang, L.
2013-12-01
The physiologically based model of canopy CO2 exchange by Thornley and Johnson (2000), modified to incorporate vapor pressure deficit (VPD) limitation of photosynthesis, is a robust tool for partitioning tower network net CO2 exchange data into gross photosynthesis (GPP) and ecosystem respiration (RE) (Gilmanov et al. 2013a, b). In addition to 30-min and daily photosynthesis and respiration values, the procedure generates daily estimates and uncertainties of essential ecosystem-scale parameters such as apparent quantum yield ALPHA, photosynthetic capacity AMAX, convexity of light response THETA, gross ecological light-use efficiency LUE, daytime ecosystem respiration rate RDAY, and nighttime ecosystem respiration rate RNIGHT. These ecosystem-scale parameters are in high demand in the modeling community and open opportunities for comparison with the rich data of leaf-level estimates of corresponding parameters available from physiological studies of previous decades. Based on the data for 70+ site-years of flux tower measurements at the non-forest sites of the Ameriflux network and the non-affiliated sites, we present results of the comparative analysis and multi-site synthesis of the magnitudes, uncertainties, patterns of seasonal and yearly dynamics, and spatiotemporal distribution of these parameters for grasslands and croplands of the conterminous United States (CONUS). Combining this site-level parameter data set with the rich spatiotemporal data sets of a remotely sensed vegetation index, weather and climate conditions, and site biophysical and geophysical features (phenology, photosynthetically active radiation, and soil water holding capacity) using methods of multivariate analysis (e.g., Cubist regression tree) offers new opportunities for predictive modeling and scaling-up of ecosystem-scale parameters of carbon cycling in grassland and agricultural ecosystems of CONUS (Zhang et al. 2011; Gu et al. 2012). 
REFERENCES Gilmanov TG, Baker JM, Bernacchi CJ, Billesbach DP, Burba GG, et al. (2013a). Productivity and CO2 exchange of the leguminous crops: Estimates from flux tower measurements. Agronomy J (submitted). Gilmanov TG, Wylie BK, Tieszen LL, Meyers TP, Baron VS, et al. (2013b). CO2 uptake and ecophysiological parameters of the grain crops of midcontinent North America: Estimates from flux tower measurements. Agric Ecosyst Environm 164: 162-175. Gu Y, Howard DM, Wylie BK, Zhang L (2012). Mapping carbon flux uncertainty and selecting optimal locations for future flux towers in the Great Plains. Landscape Ecology 27: 319-326. Thornley JHM, Johnson IR (2000). Plant and crop modelling: A mathematical approach to plant and crop physiology. The Blackburn Press, Caldwell, New Jersey. Zhang L, Wylie BK, Ji L, Gilmanov TG, Tieszen LL, Howard DM (2011). Upscaling carbon fluxes over the Great Plains grasslands: Sinks and sources. J Geophys Res G: Biogeosciences 116: G00J3
Estimation of failure criteria in multivariate sensory shelf life testing using survival analysis.
Giménez, Ana; Gagliardi, Andrés; Ares, Gastón
2017-09-01
For most food products, shelf life is determined by changes in their sensory characteristics. A predetermined increase or decrease in the intensity of a sensory characteristic has frequently been used to signal that a product has reached the end of its shelf life. Considering all attributes change simultaneously, the concept of multivariate shelf life allows a single measurement of deterioration that takes into account all these sensory changes at a certain storage time. The aim of the present work was to apply survival analysis to estimate failure criteria in multivariate sensory shelf life testing using two case studies, hamburger buns and orange juice, by modelling the relationship between consumers' rejection of the product and the deterioration index estimated using PCA. In both studies, a panel of 13 trained assessors evaluated the samples using descriptive analysis whereas a panel of 100 consumers answered a "yes" or "no" question regarding intention to buy or consume the product. PC1 explained the great majority of the variance, indicating all sensory characteristics evolved similarly with storage time. Thus, PC1 could be regarded as index of sensory deterioration and a single failure criterion could be estimated through survival analysis for 25 and 50% consumers' rejection. The proposed approach based on multivariate shelf life testing may increase the accuracy of shelf life estimations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Vrancken, Bram; Lemey, Philippe; Rambaut, Andrew; Bedford, Trevor; Longdon, Ben; Günthard, Huldrych F.; Suchard, Marc A.
2014-01-01
Phylogenetic signal quantifies the degree to which resemblance in continuously-valued traits reflects phylogenetic relatedness. Measures of phylogenetic signal are widely used in ecological and evolutionary research, and are recently gaining traction in viral evolutionary studies. Standard estimators of phylogenetic signal frequently condition on data summary statistics of the repeated trait observations and fixed phylogenetic trees, resulting in information loss and potential bias. To incorporate the observation process and phylogenetic uncertainty in a model-based approach, we develop a novel Bayesian inference method to simultaneously estimate the evolutionary history and phylogenetic signal from molecular sequence data and repeated multivariate traits. Our approach builds upon a phylogenetic diffusion framework that models continuous trait evolution as a Brownian motion process and incorporates Pagel’s λ transformation parameter to estimate dependence among traits. We provide a computationally efficient inference implementation in the BEAST software package. We evaluate the synthetic performance of the Bayesian estimator of phylogenetic signal against standard estimators, and demonstrate the use of our coherent framework to address several virus-host evolutionary questions, including virulence heritability for HIV, antigenic evolution in influenza and HIV, and Drosophila sensitivity to sigma virus infection. Finally, we discuss model extensions that will make useful contributions to our flexible framework for simultaneously studying sequence and trait evolution. PMID:25780554
[Cardiovascular risk parameters, metabolic syndrome and alcohol consumption by workers].
Vicente-Herrero, María Teófila; López González, Ángel Arturo; Ramírez-Iñiguez de la Torre, María Victoria; Capdevila-García, Luisa; Terradillos-García, María Jesús; Aguilar-Jiménez, Encarna
2015-04-01
Prevalence of alcohol consumption is high in the general population and generates specific problems at the workplace. To establish benchmarks between levels of alcohol consumption and cardiovascular risk variables and metabolic syndrome. A cross-sectional study of 7,644 workers of Spanish companies (2,828 females and 4,816 males). Alcohol consumption and its relation to cardiovascular risk was assessed using Framingham calibrated for the Spanish population (REGICOR) and SCORE, and metabolic syndrome was assessed using modified ATPIII and IDF criteria and Castelli and atherogenic index and triglycerides/HDL ratio. A multivariate analysis was performed using logistic regression and odds ratios were estimated. Statistically significant differences were seen in the mean values of the different parameters studied in prevalence of metabolic syndrome, for both sexes and with modified ATPIII, IDF and REGICOR and SCORE. The sex, age, alcohol, and smoking variables were associated with cardiovascular risk parameters and metabolic syndrome. Physical exercise and stress were associated with only some of them. Alcohol consumption affects all cardiovascular risk parameters and metabolic syndrome, with worse results in heavy drinkers. Copyright © 2014 SEEN. Published by Elsevier España, S.L.U. All rights reserved.
Post-processing of multi-model ensemble river discharge forecasts using censored EMOS
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Lisniak, Dmytro; Klein, Bastian
2014-05-01
When forecasting water levels and river discharge, ensemble weather forecasts are used as meteorological input to hydrologic process models. As hydrologic models are imperfect and the input ensembles tend to be biased and underdispersed, the output ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, statistical post-processing is required in order to achieve calibrated and sharp predictions. Standard post-processing methods such as Ensemble Model Output Statistics (EMOS) that have their origins in meteorological forecasting are now increasingly being used in hydrologic applications. Here we consider two sub-catchments of River Rhine, for which the forecasting system of the Federal Institute of Hydrology (BfG) uses runoff data that are censored below predefined thresholds. To address this methodological challenge, we develop a censored EMOS method that is tailored to such data. The censored EMOS forecast distribution can be understood as a mixture of a point mass at the censoring threshold and a continuous part based on a truncated normal distribution. Parameter estimates of the censored EMOS model are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over the training dataset. Model fitting on Box-Cox transformed data allows us to take account of the positive skewness of river discharge distributions. In order to achieve realistic forecast scenarios over an entire range of lead-times, there is a need for multivariate extensions. To this end, we smooth the marginal parameter estimates over lead-times. In order to obtain realistic scenarios of discharge evolution over time, the marginal distributions have to be linked with each other. To this end, the multivariate dependence structure can either be adopted from the raw ensemble like in Ensemble Copula Coupling (ECC), or be estimated from observations in a training period. 
The censored EMOS model has been applied to multi-model ensemble forecasts issued on a daily basis over a period of three years. For the two catchments considered, this resulted in well calibrated and sharp forecast distributions over all lead-times from 1 to 114 h. Training observations tended to be better indicators for the dependence structure than the raw ensemble.
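The forecast distribution described above, a point mass at the censoring threshold plus a continuous part above it, can be illustrated with a left-censored normal distribution. The sketch below evaluates its CRPS by simple numerical integration of the squared difference between the forecast CDF and the step function at the observation. It is an assumption-laden illustration: the numerical integration stands in for whatever closed-form expression the authors use, the function names are invented, and the Box-Cox transformation and parameter fitting are left out.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_cdf(x, mu, sigma, threshold):
    # Left-censored normal: no mass below the threshold, normal CDF above it.
    if x < threshold:
        return 0.0
    return normal_cdf((x - mu) / sigma)


def point_mass_at_threshold(mu, sigma, threshold):
    # Probability that the latent normal falls at or below the threshold.
    return normal_cdf((threshold - mu) / sigma)


def crps_censored(obs, mu, sigma, threshold, n_steps=20000):
    """CRPS = integral of (F(x) - 1{x >= obs})^2 dx, via the midpoint rule."""
    obs = max(obs, threshold)  # observations are censored at the threshold too
    lo = threshold             # the integrand is zero below the threshold
    hi = max(obs, mu + 10.0 * sigma, threshold + sigma)
    step = (hi - lo) / n_steps
    total = 0.0
    for k in range(n_steps):
        x = lo + (k + 0.5) * step
        indicator = 1.0 if x >= obs else 0.0
        total += (censored_cdf(x, mu, sigma, threshold) - indicator) ** 2
    return total * step


# Hypothetical forecast parameters and observation (illustration only).
print(point_mass_at_threshold(mu=1.0, sigma=2.0, threshold=0.0))
print(crps_censored(obs=1.5, mu=1.0, sigma=2.0, threshold=0.0))
```

In a fitting routine, the mean of `crps_censored` over a training set would be the quantity to minimise with respect to the model parameters, in line with the minimum-CRPS estimation the abstract describes.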
Cell nuclei and cytoplasm joint segmentation using the sliding band filter.
Quelhas, Pedro; Marcuzzo, Monica; Mendonça, Ana Maria; Campilho, Aurélio
2010-08-01
Microscopy cell image analysis is a fundamental tool for biological research. In particular, multivariate fluorescence microscopy is used to observe different aspects of cells in cultures. It is still common practice to perform analysis tasks by visual inspection of individual cells which is time consuming, exhausting and prone to induce subjective bias. This makes automatic cell image analysis essential for large scale, objective studies of cell cultures. Traditionally the task of automatic cell analysis is approached through the use of image segmentation methods for extraction of cells' locations and shapes. Image segmentation, although fundamental, is neither an easy task in computer vision nor is it robust to image quality changes. This makes image segmentation for cell detection semi-automated requiring frequent tuning of parameters. We introduce a new approach for cell detection and shape estimation in multivariate images based on the sliding band filter (SBF). This filter's design makes it adequate to detect overall convex shapes and as such it performs well for cell detection. Furthermore, the parameters involved are intuitive as they are directly related to the expected cell size. Using the SBF filter we detect cells' nucleus and cytoplasm location and shapes. Based on the assumption that each cell has the same approximate shape center in both nuclei and cytoplasm fluorescence channels, we guide cytoplasm shape estimation by the nuclear detections improving performance and reducing errors. Then we validate cell detection by gathering evidence from nuclei and cytoplasm channels. Additionally, we include overlap correction and shape regularization steps which further improve the estimated cell shapes. 
The approach is evaluated using two datasets with different types of data: a 20 images benchmark set of simulated cell culture images, containing 1000 simulated cells; a 16 images Drosophila melanogaster Kc167 dataset containing 1255 cells, stained for DNA and actin. Both image datasets present a difficult problem due to the high variability of cell shapes and frequent cluster overlap between cells. On the Drosophila dataset our approach achieved a precision/recall of 95%/69% and 82%/90% for nuclei and cytoplasm detection respectively and an overall accuracy of 76%.
García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María
2015-01-01
The aim of this study was to obtain a predictive model capable of early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied successfully using a data mining methodology based on the multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. First, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed, and when the MARS technique was applied to the experimental dataset, coefficients of determination of 0.93 for continuity factor estimation and 0.95 for average width were obtained. The agreement between experimental data and the model confirmed the good performance of the latter.
Buried landmine detection using multivariate normal clustering
NASA Astrophysics Data System (ADS)
Duston, Brian M.
2001-10-01
A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criterion (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
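The BIC-based choice of the number of clusters described above can be sketched as follows. This is a minimal illustration, not the MVNCC implementation: it assumes full-covariance mixture components, uses the common "lower is better" form of BIC, and takes the fitted log-likelihoods as given. The function names and the example log-likelihood values are hypothetical.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian Information Criterion, 'lower is better' convention."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def mvn_mixture_param_count(n_clusters: int, dim: int) -> int:
    """Free parameters of a mixture of full-covariance multivariate normals
    (assumption: no constraints on the covariance matrices)."""
    means = n_clusters * dim
    covariances = n_clusters * dim * (dim + 1) // 2
    weights = n_clusters - 1  # mixing weights sum to one
    return means + covariances + weights


def select_n_clusters(log_likelihoods: dict, dim: int, n_samples: int) -> int:
    """Return the cluster count whose fitted model has the smallest BIC.

    log_likelihoods maps a candidate cluster count to the maximized
    log-likelihood of the corresponding fitted mixture.
    """
    scores = {
        k: bic(ll, mvn_mixture_param_count(k, dim), n_samples)
        for k, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get)


# Hypothetical log-likelihoods for 1, 2 and 3 clusters on 200 two-dimensional
# feature vectors; these numbers are made up for illustration only.
best_k = select_n_clusters({1: -500.0, 2: -400.0, 3: -399.0}, dim=2, n_samples=200)
```

The penalty term grows with the number of free parameters, so a third cluster that barely improves the fit is rejected in favor of the simpler model.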
Development of a robust framework for controlling high performance turbofan engines
NASA Astrophysics Data System (ADS)
Miklosovic, Robert
This research involves the development of a robust framework for controlling complex and uncertain multivariable systems. Where mathematical modeling is often tedious or inaccurate, the new method uses an extended state observer (ESO) to estimate and cancel dynamic information in real time and dynamically decouple the system. As a result, controller design and tuning become transparent as the number of required model parameters is reduced. Much research has been devoted towards the application of modern multivariable control techniques on aircraft engines. However, few, if any, have been implemented on an operational aircraft, partially due to the difficulty in tuning the controller for satisfactory performance. The new technique is applied to a modern two-spool, high-pressure ratio, low-bypass turbofan with mixed-flow afterburning. A realistic Modular Aero-Propulsion System Simulation (MAPSS) package, developed by NASA, is used to demonstrate the new design process and compare its performance with that of a supplied nominal controller. This approach is expected to reduce gain scheduling over the full operating envelope of the engine and allow a controller to be tuned for engine-to-engine variations.
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li Yupeng, E-mail: yupeng@ualberta.ca; Deutsch, Clayton V.
2012-06-15
In geostatistics, most stochastic algorithms for simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations including the unsampled location permits calculation of the conditional probability directly based on its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification to an initial estimated multivariate probability using lower order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. This algorithm can be extended to higher order marginal probability constraints as used in multiple point statistics. The theoretical framework is developed and illustrated with estimation and simulation examples.
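The iterative fitting step can be sketched for a small case. The example below is an illustrative toy, not the authors' implementation: it fits a joint table over three categorical variables to three imposed bivariate marginals, starting from a uniform initial estimate. It computes marginals by direct summation over a dictionary rather than with the sparse matrix the abstract mentions, and all names are hypothetical.

```python
from itertools import product


def pair_marginal(p, shape, axes):
    """Sum a joint table (dict keyed by index tuples) down to two axes."""
    a, b = axes
    marginal = {}
    for idx in product(*(range(n) for n in shape)):
        key = (idx[a], idx[b])
        marginal[key] = marginal.get(key, 0.0) + p[idx]
    return marginal


def ipf(shape, targets, n_iter=500):
    """Iterative proportional fitting with bivariate marginal constraints.

    targets maps an axis pair such as (0, 1) to the bivariate marginal
    (a dict keyed by category pairs) that the joint table must reproduce.
    """
    cells = list(product(*(range(n) for n in shape)))
    p = {idx: 1.0 / len(cells) for idx in cells}  # uniform initial estimate
    for _ in range(n_iter):
        for axes, target in targets.items():
            current = pair_marginal(p, shape, axes)
            a, b = axes
            for idx in cells:
                key = (idx[a], idx[b])
                if current[key] > 0.0:
                    # Rescale every cell so this marginal matches its target.
                    p[idx] *= target[key] / current[key]
    return p
```

Given the fitted joint table, the conditional probability at an unsampled location follows from its definition: the joint probability of the full configuration divided by the marginal probability of the observed neighbors.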
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Multivariable Parametric Cost Model for Ground Optical: Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2004-01-01
A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature were examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter were derived.
Williams, L. Keoki; Buu, Anne
2017-01-01
We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
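For orientation, Fisher's combination statistic itself is simple to compute. The sketch below shows the statistic and, as a reference point only, its p-value under the textbook assumption that the combined tests are independent. The method in the abstract instead estimates the null distribution while accounting for correlation between phenotypes, which this sketch does not attempt. The function names are hypothetical.

```python
import math


def fisher_combination(p_values):
    """Fisher's combination statistic: -2 times the sum of log p-values."""
    return -2.0 * sum(math.log(p) for p in p_values)


def independent_p_value(p_values):
    """Combined p-value assuming the k tests are independent.

    Under independence the statistic follows a chi-square distribution with
    2k degrees of freedom, whose survival function has a closed form for
    even degrees of freedom. Correlated phenotypes violate this assumption.
    """
    k = len(p_values)
    half = fisher_combination(p_values) / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
```

With a single p-value the combined p-value reduces to that p-value, which is a quick sanity check on the formula.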
Vilayphiou, Nicolas; Boutroy, Stephanie; Sornay-Rendu, Elisabeth; Van Rietbergen, Bert; Chapurlat, Roland
2016-02-01
The high resolution peripheral computed tomography (HR-pQCT) technique has seen recent developments with regard to the assessment of cortical porosity. In this study, we investigated the role of cortical porosity on bone strength in a large cohort of women. The distal radius and distal tibia were scanned by HR-pQCT. We assessed bone strength by estimating the failure load by microfinite element analysis (μFEA), with isotropic and homogeneous material properties. We built a multivariate model to predict it, using a few microarchitecture variables including cortical porosity. Among 857 Caucasian women analyzed with μFEA, we found that cortical and trabecular properties, along with the failure load, impaired slightly with advancing age in premenopausal women, the correlations with age being modest, with |r_age| ranging from 0.14 to 0.38. After the onset of the menopause, those relationships with age were stronger for most parameters at both sites, with |r_age| ranging from 0.10 to 0.64, notably for cortical porosity and failure load, which were markedly deteriorated with increasing age. Our multivariate model using microarchitecture parameters revealed that cortical porosity played a significant role in bone strength prediction, with semipartial r² = 0.22 only at the tibia in postmenopausal women. In conclusion, in our large cohort of women, we observed a small decline of bone strength at the tibia before the onset of menopause. We also found an age-related increase of cortical porosity at both scanned sites in premenopausal women. In postmenopausal women, the relatively high increase of cortical porosity accounted for the decline in bone strength only at the tibia. Copyright © 2015 Elsevier Inc. All rights reserved.
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data, which are typically more uncertain in high flows than in periods with low flows. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in the "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. 1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. Based on the results from a didactical example of predicting rainfall runoff, 2) we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms.
In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions - and they could help us to better capture the statistical properties of errors and make more reliable predictions.
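The idea of pairing empirical marginals with a Gaussian dependence structure can be sketched in the bivariate case. This is a minimal illustration under stated assumptions, not the authors' likelihood: the marginal is approximated by rank-based pseudo-observations of past errors, the dependence between two errors is a Gaussian copula with a given correlation rho, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def pseudo_observations(errors):
    """Map errors to (0, 1) through their ranks: rank / (n + 1).

    This is a nonparametric stand-in for the marginal distribution learned
    from past model errors (ties are not handled in this sketch).
    """
    n = len(errors)
    order = sorted(range(n), key=lambda i: errors[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def gaussian_copula_logpdf(u, v, rho):
    """Log-density of the bivariate Gaussian copula at (u, v), with |rho| < 1."""
    x = _STD_NORMAL.inv_cdf(u)
    y = _STD_NORMAL.inv_cdf(v)
    one_minus = 1.0 - rho * rho
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad
```

With rho equal to zero the copula density is one everywhere, so the joint likelihood reduces to the product of the marginals; nonzero rho is what lets the copula represent autocorrelated errors.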
Dinov, Ivo D.; Kamino, Scott; Bhakhrani, Bilal; Christou, Nicolas
2014-01-01
Data analysis requires subtle probability reasoning to answer questions like "What is the chance of event A occurring, given that event B was observed?" This generic question arises in discussions of many intriguing scientific questions such as "What is the probability that an adolescent weighs between 120 and 140 pounds given that they are of average height?" and "What is the probability of (monetary) inflation exceeding 4% and housing price index below 110?" To address such problems, learning some applied, theoretical or cross-disciplinary probability concepts is necessary. Teaching such courses can be improved by utilizing modern information technology resources. Students’ understanding of multivariate distributions, conditional probabilities, correlation and causation can be significantly strengthened by employing interactive web-based science educational resources. Independent of the type of a probability course (e.g. majors, minors or service probability course, rigorous measure-theoretic, applied or statistics course) student motivation, learning experiences and knowledge retention may be enhanced by blending modern technological tools within the classical conceptual pedagogical models. We have designed, implemented and disseminated a portable open-source web-application for teaching multivariate distributions, marginal, joint and conditional probabilities using the special case of bivariate Normal distribution. A real adolescent height and weight dataset is used to demonstrate the classroom utilization of the new web-application to address problems of parameter estimation, univariate and multivariate inference. PMID:25419016
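A question of the weight-given-height kind can be answered in closed form for a bivariate Normal distribution, because the conditional distribution of one coordinate given the other is again Normal. The sketch below is independent of the web application described above, and the parameter values in the usage lines are hypothetical placeholders, not estimates from the adolescent dataset.

```python
import math
from statistics import NormalDist


def conditional_normal(mu_x, sd_x, mu_y, sd_y, rho, x):
    """Distribution of Y given X = x when (X, Y) is bivariate Normal."""
    mean = mu_y + rho * (sd_y / sd_x) * (x - mu_x)
    sd = sd_y * math.sqrt(1.0 - rho * rho)
    return NormalDist(mean, sd)


def conditional_interval_probability(mu_x, sd_x, mu_y, sd_y, rho, x, lo, hi):
    """P(lo < Y < hi | X = x) for a bivariate Normal pair (X, Y)."""
    dist = conditional_normal(mu_x, sd_x, mu_y, sd_y, rho, x)
    return dist.cdf(hi) - dist.cdf(lo)


# Hypothetical parameters: height in inches (X), weight in pounds (Y).
prob = conditional_interval_probability(
    mu_x=65.0, sd_x=3.0, mu_y=130.0, sd_y=15.0, rho=0.5,
    x=65.0,              # "average height"
    lo=120.0, hi=140.0,  # weight range of interest
)
```

Conditioning shrinks the standard deviation by the factor sqrt(1 - rho^2), so a stronger height-weight correlation concentrates more probability in a fixed weight interval around the conditional mean.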
Dong, Chunjiao; Clarke, David B; Richards, Stephen H; Huang, Baoshan
2014-01-01
The influence of intersection features on safety has been examined extensively because intersections experience a relatively large proportion of motor vehicle conflicts and crashes. Although there are distinct differences between passenger cars and large trucks (size, operating characteristics, dimensions, and weight), modeling crash counts across vehicle types is rarely addressed. This paper develops and presents a multivariate regression model of crash frequencies by collision vehicle type using crash data for urban signalized intersections in Tennessee. In addition, the performance of univariate Poisson-lognormal (UVPLN), multivariate Poisson (MVP), and multivariate Poisson-lognormal (MVPLN) regression models in establishing the relationship between crashes, traffic factors, and geometric design of roadway intersections is investigated. Bayesian methods are used to estimate the unknown parameters of these models. The evaluation results suggest that the MVPLN model possesses most of the desirable statistical properties in developing the relationships. Compared to the UVPLN and MVP models, the MVPLN model better identifies significant factors and predicts crash frequencies. The findings suggest that traffic volume, truck percentage, lighting condition, and intersection angle significantly affect intersection safety. Important differences in car, car-truck, and truck crash frequencies with respect to various risk factors were found to exist between models. The paper provides some new or more comprehensive observations that have not been covered in previous studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Lo, Kenneth
2011-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375
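The data transformation mentioned above can be illustrated with the standard Box-Cox transform for positive data. This sketch shows only the textbook form and its inverse; the exact variant used inside the proposed mixture model is not given in the abstract, so treating it as the standard form is an assumption, and the function names are hypothetical.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Standard Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0.0:
        raise ValueError("the standard Box-Cox transform requires x > 0")
    if lam == 0.0:
        return math.log(x)  # limiting case as lam approaches zero
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y: float, lam: float) -> float:
    """Map a transformed value back to the original scale."""
    if lam == 0.0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)
```

A value of lam below one compresses the right tail of skewed data, which is how a symmetric t component on the transformed scale can describe an asymmetric cluster on the original scale.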
Lo, Kenneth; Gottardo, Raphael
2012-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
Spatial hydrological drought characteristics in Karkheh River basin, southwest Iran using copulas
NASA Astrophysics Data System (ADS)
Dodangeh, Esmaeel; Shahedi, Kaka; Shiau, Jenq-Tzong; MirAkbari, Maryam
2017-08-01
Investigation of drought characteristics such as severity, duration, and frequency is crucial for water resources planning and management in a river basin. While the methodology for multivariate drought frequency analysis using copulas is well established, the estimation of the associated parameters by various parameter estimation methods and the effects on the obtained results have not yet been investigated. This research aims at conducting a comparative analysis between the parametric maximum likelihood method and the nonparametric Kendall τ method for copula parameter estimation. The methods were employed to study joint severity-duration probability and recurrence intervals in Karkheh River basin (southwest Iran), which is facing severe water-deficit problems. Daily streamflow data at three hydrological gauging stations (Tang Sazbon, Huleilan and Polchehr) near the Karkheh dam were used to draw flow duration curves (FDC) of these three stations. The Q_{75} index extracted from the FDC was set as the threshold level to extract drought characteristics such as drought duration and severity on the basis of the run theory. Drought duration and severity were separately modeled using univariate probability distributions, and gamma-GEV, LN2-exponential, and LN2-gamma were selected as the best paired drought severity-duration inputs for copulas according to the Akaike Information Criterion (AIC), Kolmogorov-Smirnov and chi-square tests. Archimedean Clayton, Frank, and extreme value Gumbel copulas were employed to construct joint cumulative distribution functions (JCDF) of droughts for each station. Frank copula at Tang Sazbon and Gumbel at Huleilan and Polchehr stations were identified as the best copulas based on the performance evaluation criteria including AIC, BIC, log-likelihood and root mean square error (RMSE) values. Based on the RMSE values, nonparametric Kendall-τ is preferred to the parametric maximum likelihood estimation method.
The results showed greater drought return periods by the parametric ML method in comparison to the nonparametric Kendall τ estimation method. The results also showed that stations located in tributaries (Huleilan and Polchehr) have close return periods, while the station along the main river (Tang Sazbon) has the smaller return periods for the drought events with identical drought duration and severity.
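The nonparametric Kendall τ approach works by computing the rank correlation of the paired severity and duration values and then inverting the known relation between τ and the copula parameter. The sketch below covers the Clayton and Gumbel copulas, for which the inversion is closed-form; the Frank copula requires numerical inversion and is omitted. It assumes no tied observations, and the function names are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau for paired samples, assuming no ties."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def clayton_theta(tau):
    """Clayton copula parameter from Kendall's tau (requires 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta(tau):
    """Gumbel copula parameter from Kendall's tau (requires 0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)
```

Because τ depends only on ranks, this estimate is unaffected by the choice of marginal distributions for severity and duration, unlike a full maximum likelihood fit.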
Fully probabilistic earthquake source inversion on teleseismic scales
NASA Astrophysics Data System (ADS)
Stähler, Simon; Sigloch, Karin
2017-04-01
Seismic source inversion is a non-linear problem in seismology where not just the earthquake parameters but also estimates of their uncertainties are of great practical importance. We have developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. These unknowns are parameterised efficiently by harnessing as prior knowledge solutions from a large number of non-Bayesian inversions. The source time function is expressed as a weighted sum of a small number of empirical orthogonal functions, which were derived from a catalogue of >1000 source time functions (STFs) by a principal component analysis. We use a likelihood model based on the cross-correlation misfit between observed and predicted waveforms. The resulting ensemble of solutions provides full uncertainty and covariance information for the source parameters, and permits propagating these source uncertainties into travel time estimates used for seismic tomography. The computational effort is such that routine, global estimation of earthquake mechanisms and source time functions from teleseismic broadband waveforms is feasible. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, more commonly used as sample-by-sample measures of misfit based on distances between individual time samples. 
From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on signal decorrelation D of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. References: Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 1: Efficient parameterisation, Solid Earth, 5, 1055-1069, doi:10.5194/se-5-1055-2014, 2014. Stähler, S. C. and Sigloch, K.: Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances, Solid Earth, 7, 1521-1536, doi:10.5194/se-7-1521-2016, 2016.
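The decorrelation misfit D = 1 - CC and a log-normal likelihood for it can be sketched as follows. This is a simplified illustration: the cross-correlation coefficient is taken at zero lag on equal-length traces, whereas an operational implementation would typically also align the waveforms in time. The log-normal parameters mu and sigma are placeholders, since the abstract states that the moments depend on signal-to-noise ratio and station geometry. The function names are hypothetical.

```python
import math


def cross_correlation_coefficient(observed, modelled):
    """Normalized zero-lag cross-correlation of two equal-length traces."""
    num = sum(a * b for a, b in zip(observed, modelled))
    den = math.sqrt(sum(a * a for a in observed) * sum(b * b for b in modelled))
    return num / den


def decorrelation(observed, modelled):
    """Misfit D = 1 - CC: 0 for identical shapes, up to 2 for flipped polarity."""
    return 1.0 - cross_correlation_coefficient(observed, modelled)


def lognormal_logpdf(d, mu, sigma):
    """Log-density of a log-normal distribution at d > 0.

    mu and sigma are the mean and standard deviation of log(d); here they
    are placeholders for empirically calibrated values.
    """
    return (-math.log(d * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(d) - mu) ** 2 / (2.0 * sigma ** 2))
```

Since CC is normalized by the energy of both traces, D is insensitive to amplitude scaling and measures waveform shape only, which is consistent with its robustness to modelling errors noted in the abstract.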
Diagnosis of intrauterine growth restriction: comparison of ultrasound parameters.
Ott, William J
2002-04-01
The objective of this study was to evaluate the best ultrasonic method of diagnosing intrauterine growth restriction (IUGR). A retrospective study of patients with singleton pregnancies who had been scanned at the author's institution within 2 weeks of their delivery was undertaken. Estimated fetal weight, abdominal circumference, head circumference/abdominal circumference ratio, abdominal circumference/femur length ratio, and umbilical artery S/D ratio were compared for accuracy in predicting IUGR in the neonate using both univariate and multivariate statistical analysis. Five hundred one (501) patients were analyzed. One hundred fourteen (114) neonates were classified as IUGR (22.8%). Doppler evaluation of the umbilical artery showed the best sensitivity, while both abdominal circumference alone and estimated fetal weight showed similar specificity, positive and negative predictive value, and the lowest false-positive and false-negative results. Logistic regression analysis confirmed the univariate results and showed that, when used in combination, abdominal circumference and Doppler, or estimated fetal weight and Doppler, resulted in the best predictive values. Either estimated fetal weight or abdominal circumference alone is an accurate predictor of IUGR. Combined with Doppler studies of the umbilical artery, either method will provide accurate evaluation of suspected IUGR.
Robertson, David S; Prevost, A Toby; Bowden, Jack
2016-10-01
The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39: (7), 830-832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account. © The Author 2016. Published by Oxford University Press.
Upper Gastrointestinal Hemorrhage: Development of the Severity Score.
Chaikitamnuaychok, Rangson; Patumanond, Jayanton
2012-12-01
Emergency endoscopy for every patient with upper gastrointestinal hemorrhage is not possible in many medical centers. Simple guidelines to select patients for emergency endoscopy are lacking. The aim of the present report is to develop a simple scoring system to classify upper gastrointestinal hemorrhage (UGIH) severity based on patient clinical profiles at the emergency departments. Retrospective data of patients with UGIH in a university affiliated hospital were analyzed. Patients were criterion-classified into 3 severity levels: mild, moderate and severe. Clinical and laboratory information were compared among the 3 groups. Significant parameters were selected as indicators of severity. Coefficients of significant multivariable parameters were transformed into item scores, which added up as individual severity scores. The scores were used to classify patients into 3 urgency levels: non-urgent, urgent and emergent groups. Score-classification and criterion-classification were compared. Significant parameters in the model were age ≥ 60 years, pulse rate ≥ 100/min, systolic blood pressure < 100 mmHg, hemoglobin < 10 g/dL, blood urea nitrogen ≥ 35 mg/dL, presence of cirrhosis and hepatic failure. The score ranged from 0 to 27 and classified patients into 3 urgency groups: non-urgent (score < 4, n = 215, 21.2%), urgent (score 4 - 16, n = 677, 66.9%) and emergent (score > 16, n = 121, 11.9%). The score correctly classified 81.4% of the patients into their original (criterion-classified) severity groups. Under-estimation (7.5%) and over-estimation (11.1%) were clinically acceptable. Our UGIH severity scoring system classified patients into 3 urgency groups: non-urgent, urgent and emergent, with a clinically acceptable small number of under- and over-estimations. Its discriminative ability and precision should be validated before adopting into clinical practice.
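The cut-offs reported above translate directly into a lookup from total score to urgency group. The sketch below encodes only those thresholds; the individual item scores are not given in the abstract, so the function takes an already computed total score. The function name is hypothetical, and the sketch is illustrative rather than a clinical tool.

```python
def urgency_level(score: int) -> str:
    """Map a total UGIH severity score (0 to 27) to the reported urgency group."""
    if not 0 <= score <= 27:
        raise ValueError("score must lie in the reported range 0 to 27")
    if score < 4:
        return "non-urgent"
    if score <= 16:
        return "urgent"
    return "emergent"
```

As the abstract itself notes, the scoring system still requires validation before clinical use.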
On the degrees of freedom of reduced-rank estimators in multivariate regression
Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.
2015-01-01
We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155
Generalized semiparametric varying-coefficient models for longitudinal data
NASA Astrophysics Data System (ADS)
Qi, Li
In this dissertation, we investigate the generalized semiparametric varying-coefficient models for longitudinal data that can flexibly model three types of covariate effects: time-constant effects, time-varying effects, and covariate-varying effects, i.e., the covariate effects that depend on other possibly time-dependent exposure variables. First, we consider the model that assumes the time-varying effects are unspecified functions of time while the covariate-varying effects are parametric functions of an exposure variable specified up to a finite number of unknown parameters. The estimation procedures are developed using multivariate local linear smoothing and generalized weighted least squares estimation techniques. The asymptotic properties of the proposed estimators are established. The simulation studies show that the proposed methods have satisfactory finite sample performance. Data from the ACTG 244 clinical trial of HIV-infected patients are used to examine the effects of antiretroviral treatment switching before and after HIV develops the 215-mutation. Our analysis shows a benefit of treatment switching before the 215-mutation develops. The proposed methods are also applied to the STEP study with MITT cases, showing that they have broad applications in medical research.
Isberg, S R; Thomson, P C; Nicholas, F W; Barker, S G; Moran, C
2005-12-01
Crocodile morphometric (head, snout-vent and total length) measurements were recorded at three stages during the production chain: hatching, inventory [average age (+/-SE) is 265.1 +/- 0.4 days] and slaughter (average age is 1037.8 +/- 0.4 days). Crocodile skins are used for the manufacture of exclusive leather products, with the most commonly sold skins measuring 35-45 cm in belly width. One of the breeding objectives for inclusion into a multitrait genetic improvement programme for saltwater crocodiles is the time taken for a juvenile to reach this size or age at slaughter. A multivariate restricted maximum likelihood analysis provided (co)variance components for estimating the first published genetic parameter estimates for these traits. Heritability (+/-SE) estimates for the traits hatchling snout-vent length, inventory head length and age at slaughter were 0.60 (0.15), 0.59 (0.12) and 0.40 (0.10) respectively. There were strong negative genetic (-0.81 +/- 0.08) and phenotypic (-0.82 +/- 0.02) correlations between age at slaughter and inventory head length.
Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.
Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina
2014-07-15
This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, to be used as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, proves to be a perfectly suitable quantitative method for implementation in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.
Improved Correction of Misclassification Bias With Bootstrap Imputation.
van Walraven, Carl
2018-07-01
Diagnostic codes used in administrative database research can create bias due to misclassification. Quantitative bias analysis (QBA) can correct for this bias, requires only code sensitivity and specificity, but may return invalid results. Bootstrap imputation (BI) can also address misclassification bias but traditionally requires multivariate models to accurately estimate disease probability. This study compared misclassification bias correction using QBA and BI. Serum creatinine measures were used to determine severe renal failure status in 100,000 hospitalized patients. Prevalence of severe renal failure in 86 patient strata and its association with 43 covariates was determined and compared with results in which renal failure status was determined using diagnostic codes (sensitivity 71.3%, specificity 96.2%). Differences in results (misclassification bias) were then corrected with QBA or BI (using progressively more complex methods to estimate disease probability). In total, 7.4% of patients had severe renal failure. Imputing disease status with diagnostic codes exaggerated prevalence estimates [median relative change (range), 16.6% (0.8%-74.5%)] and its association with covariates [median (range) exponentiated absolute parameter estimate difference, 1.16 (1.01-2.04)]. QBA produced invalid results 9.3% of the time and increased bias in estimates of both disease prevalence and covariate associations. BI decreased misclassification bias with increasingly accurate disease probability estimates. QBA can produce invalid results and increase misclassification bias. BI avoids invalid results and can importantly decrease misclassification bias when accurate disease probability estimates are used.
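To make the "invalid results" point concrete, here is a minimal sketch of the standard sensitivity/specificity correction of an observed prevalence (often called the Rogan-Gladen estimator). That this is the exact form of QBA used in the study is an assumption; the function names are hypothetical. The sensitivity, specificity and 7.4% prevalence are the values quoted in the abstract, while the 2% stratum is an invented illustration.

```python
def qba_corrected_prevalence(observed, sensitivity, specificity):
    """Correct a code-based prevalence for misclassification.

    observed = true * Se + (1 - true) * (1 - Sp), solved for `true`.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed + specificity - 1.0) / denom


def is_valid_probability(p):
    """A corrected prevalence outside [0, 1] is an invalid result."""
    return 0.0 <= p <= 1.0


if __name__ == "__main__":
    se, sp = 0.713, 0.962          # values quoted in the abstract
    true_prev = 0.074
    observed = true_prev * se + (1 - true_prev) * (1 - sp)
    print(qba_corrected_prevalence(observed, se, sp))
    # A stratum whose observed prevalence is below 1 - Sp gives a negative estimate.
    low = qba_corrected_prevalence(0.02, se, sp)
    print(low, is_valid_probability(low))
```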
Di Nuovo, Alessandro G; Di Nuovo, Santo; Buono, Serafino
2012-02-01
The estimation of a person's intelligence quotient (IQ) by means of psychometric tests is indispensable in the application of psychological assessment to several fields. When complex tests such as the Wechsler scales, which are the most commonly used and universally recognized parameter for the diagnosis of degrees of retardation, are not applicable, it is necessary to use other psycho-diagnostic tools more suited for the subject's specific condition. But to ensure a homogeneous diagnosis it is necessary to reach a common metric; thus, the aim of our work is to build models able to estimate accurately and reliably the Wechsler IQ, starting from different psycho-diagnostic tools. Four different psychometric tests (Leiter international performance scale; coloured progressive matrices test; the mental development scale; psycho educational profile), along with the Wechsler scale, were administered to a group of 40 mentally retarded subjects, with various pathologies, and control persons. The obtained database is used to evaluate Wechsler IQ estimation models starting from the scores obtained in the other tests. Five modelling methods, two statistical and three from machine learning, that belong to the family of artificial neural networks (ANNs) are employed to build the estimator. Several error metrics for estimated IQ and for retardation level classification are defined to compare the performance of the various models with univariate and multivariate analyses. Eight empirical studies show that, after ten-fold cross-validation, the best average estimation error is 3.37 IQ points and the best mental retardation level classification error is 7.5%. Furthermore our experiments prove the superior performance of ANN methods over statistical regression ones, because in all cases considered ANN models show the lowest estimation error (from 0.12 to 0.9 IQ points) and the lowest classification error (from 2.5% to 10%).
Since the estimation performance is better than the confidence interval of Wechsler scales (five IQ points), we consider the models built to be very accurate and reliable, and they can be used to help clinical diagnosis. Therefore computer software based on the results of our work is currently used in a clinical center and empirical trials confirm its validity. Furthermore positive results in our multivariate studies suggest new approaches for clinicians. Copyright © 2011 Elsevier B.V. All rights reserved.
Deterministic annealing for density estimation by multivariate normal mixtures
NASA Astrophysics Data System (ADS)
Kloppenburg, Martin; Tavan, Paul
1997-03-01
An approach to maximum-likelihood density estimation by mixtures of multivariate normal distributions for large high-dimensional data sets is presented. Conventionally that problem is tackled by notoriously unstable expectation-maximization (EM) algorithms. We remove these instabilities by the introduction of soft constraints, enabling deterministic annealing. Our developments are motivated by the proof that algorithmically stable fuzzy clustering methods that are derived from statistical physics analogs are special cases of EM procedures.
Spatio-temporal interpolation of precipitation during monsoon periods in Pakistan
NASA Astrophysics Data System (ADS)
Hussain, Ijaz; Spöck, Gunter; Pilz, Jürgen; Yu, Hwa-Lung
2010-08-01
Spatio-temporal estimation of precipitation over a region is essential to the modeling of hydrologic processes for water resources management. The changes of magnitude and space-time heterogeneity of rainfall observations make space-time estimation of precipitation a challenging task. In this paper we propose a Box-Cox transformed hierarchical Bayesian multivariate spatio-temporal interpolation method for the skewed response variable. The proposed method is applied to estimate space-time monthly precipitation in the monsoon periods during 1974-2000, and 27-year monthly average precipitation data are obtained from 51 stations in Pakistan. The results of transformed hierarchical Bayesian multivariate spatio-temporal interpolation are compared to those of non-transformed hierarchical Bayesian interpolation by using cross-validation. The software developed by [11] is used for Bayesian non-stationary multivariate space-time interpolation. It is observed that the transformed hierarchical Bayesian method provides more accuracy than the non-transformed hierarchical Bayesian method.
NASA Astrophysics Data System (ADS)
Schwartz, Craig R.; Thelen, Brian J.; Kenton, Arthur C.
1995-06-01
A statistical parametric multispectral sensor performance model was developed by ERIM to support mine field detection studies, multispectral sensor design/performance trade-off studies, and target detection algorithm development. The model assumes target detection algorithms and their performance models which are based on data assumed to obey multivariate Gaussian probability distribution functions (PDFs). The applicability of these algorithms and performance models can be generalized to data having non-Gaussian PDFs through the use of transforms which convert non-Gaussian data to Gaussian (or near-Gaussian) data. An example of one such transform is the Box-Cox power law transform. In practice, such a transform can be applied to non-Gaussian data prior to the introduction of a detection algorithm that is formally based on the assumption of multivariate Gaussian data. This paper presents an extension of these techniques to the case where the joint multivariate probability density function of the non-Gaussian input data is known, and where the joint estimate of the multivariate Gaussian statistics, under the Box-Cox transform, is desired. The jointly estimated multivariate Gaussian statistics can then be used to predict the performance of a target detection algorithm which has an associated Gaussian performance model.
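For reference, the Box-Cox power transform mentioned here has a simple closed form, (x^λ - 1)/λ for λ ≠ 0 and log x for λ = 0, defined for strictly positive data. The sketch below only applies the transform for a given λ; how λ is estimated jointly with the Gaussian statistics in the paper is not shown, and the function name `box_cox` is hypothetical.

```python
import numpy as np


def box_cox(x, lam):
    """Box-Cox power transform of strictly positive data."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(x)          # limiting case as lambda -> 0
    return (x ** lam - 1.0) / lam


if __name__ == "__main__":
    data = np.array([1.0, 4.0, 9.0])
    print(box_cox(data, 0.5))
    print(box_cox(data, 0.0))
```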
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tucker, Susan L., E-mail: sltucker@mdanderson.org; Dong, Lei; Michalski, Jeff M.
2012-10-01
Purpose: To investigate whether the volumes of rectum exposed to intermediate doses, from 30 to 50 Gy, contribute to the risk of Grade ≥2 late rectal toxicity among patients with prostate cancer receiving radiotherapy. Methods and Materials: Data from 1009 patients treated on Radiation Therapy Oncology Group protocol 94-06 were analyzed using three approaches. First, the contribution of intermediate doses to a previously published fit of the Lyman-Kutcher-Burman (LKB) normal tissue complication probability (NTCP) model was determined. Next, the extent to which intermediate doses provide additional risk information, after taking the LKB model into account, was investigated. Third, the proportion of rectum receiving doses higher than a threshold, VDose, was computed for doses ranging from 5 to 85 Gy, and a multivariate Cox proportional hazards model was used to determine which of these parameters were significantly associated with time to Grade ≥2 late rectal toxicity. Results: Doses <60 Gy had no detectable impact on the fit of the LKB model, as expected on the basis of the small estimate of the volume parameter (n = 0.077). Furthermore, there was no detectable difference in late rectal toxicity among cohorts with similar risk estimates from the LKB model but with different volumes of rectum exposed to intermediate doses. The multivariate Cox proportional hazards model selected V75 as the only value of VDose significantly associated with late rectal toxicity. Conclusions: There is no evidence from these data that intermediate doses influence the risk of Grade ≥2 late rectal toxicity. Instead, the critical doses for this endpoint seem to be ≥75 Gy. It is hypothesized that cases of Grade ≥2 late rectal toxicity occurring among patients with V75 less than approximately 12% may be due to a 'background' level of risk, likely due mainly to biological factors.
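The reason a small volume parameter suppresses the influence of intermediate doses can be seen from the usual textbook form of the LKB model, sketched below: the dose-volume histogram is reduced to a generalized equivalent uniform dose with exponent 1/n, so for n = 0.077 the highest dose bins dominate. Only n is taken from the abstract; the values of `m` and `td50` and the example histograms are illustrative placeholders, and the function names are hypothetical.

```python
import math


def geud(doses, volumes, n):
    """Generalized equivalent uniform dose for a differential DVH."""
    total = sum(volumes)
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n


def lkb_ntcp(doses, volumes, n, m, td50):
    """LKB complication probability: standard normal CDF of (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses, volumes, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


if __name__ == "__main__":
    n = 0.077                     # volume parameter quoted in the abstract
    m, td50 = 0.15, 80.0          # placeholders, NOT from the abstract
    # Two hypothetical DVHs that differ only in the intermediate-dose bin.
    a = lkb_ntcp([40.0, 75.0], [0.8, 0.2], n, m, td50)
    b = lkb_ntcp([50.0, 75.0], [0.8, 0.2], n, m, td50)
    print(a, b)
```

With a small n the two printed probabilities are nearly identical, which is the behaviour described in the Results.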
2014-01-01
Background Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. Methods The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Results Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure. 
The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Conclusions Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately. PMID:25047164
A reduced adaptive observer for multivariable systems. [using reduced dynamic ordering]
NASA Technical Reports Server (NTRS)
Carroll, R. L.; Lindorff, D. P.
1973-01-01
An adaptive observer for multivariable systems is presented for which the dynamic order of the observer is reduced, subject to mild restrictions. The observer structure depends directly upon the multivariable structure of the system rather than a transformation to a single-output system. The number of adaptive gains is at most the sum of the order of the system and the number of input parameters being adapted. Moreover, for the relatively frequent specific cases for which the number of required adaptive gains is less than the sum of system order and input parameters, the number of these gains is easily determined by inspection of the system structure. This adaptive observer possesses all the properties ascribed to the single-input single-output adaptive observer. Like the other adaptive observers, some restriction is required of the allowable system command input to guarantee convergence of the adaptive algorithm, but the restriction is more lenient than that required by the full-order multivariable observer. This reduced observer is not restricted to cycle systems.
Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; Nookala, Lavanya; Day, Michele E; Kim, Katherine K; Kim, Hyeoneui; Boxwala, Aziz; El-Kareh, Robert; Kuo, Grace M; Resnic, Frederic S; Kesselman, Carl; Ohno-Machado, Lucila
2015-11-01
Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. 
Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
Andermahr, J; Greb, A; Hensler, T; Helling, H J; Bouillon, B; Sauerland, S; Rehm, K E; Neugebauer, E
2002-05-01
In a prospective trial, 266 multiply injured patients were included to evaluate clinical risk factors and immune parameters related to pneumonia. Clinical and humoral parameters were assessed and multivariate analysis performed. The multivariate analysis (odds ratio with 95% confidence interval (CI)) revealed male gender (3.65), traumatic brain injury (TBI) (2.52), thorax trauma (AIS(thorax) ≥ 3) (2.05), antibiotic prophylaxis (1.30), injury severity score (ISS) (1.03 per ISS point) and age (1.02 per year) as risk factors for pneumonia. The main pathogens were Acinetobacter baumannii (40%) and Staphylococcus aureus (25%). A tendency towards higher Procalcitonin (PCT) and Interleukin (IL)-6 levels two days after trauma was observed for pneumonia patients. The immune parameters (PCT, IL-6, IL-10, soluble tumor necrosis factor p-55 and p-75) could not confirm the diagnosis of pneumonia earlier than the clinical parameters.
Predicting Subnational Ebola Virus Disease Epidemic Dynamics from Sociodemographic Indicators
Valeri, Linda; Patterson-Lomba, Oscar; Gurmu, Yared; Ablorh, Akweley; Bobb, Jennifer; Townes, F. William; Harling, Guy
2016-01-01
Background The recent Ebola virus disease (EVD) outbreak in West Africa has spread wider than any previous human EVD epidemic. While individual-level risk factors that contribute to the spread of EVD have been studied, the population-level attributes of subnational regions associated with outbreak severity have not yet been considered. Methods To investigate the area-level predictors of EVD dynamics, we integrated time series data on cumulative reported cases of EVD from the World Health Organization and covariate data from the Demographic and Health Surveys. We first estimated the early growth rates of epidemics in each second-level administrative district (ADM2) in Guinea, Sierra Leone and Liberia using exponential, logistic and polynomial growth models. We then evaluated how these growth rates, as well as epidemic size within ADM2s, were ecologically associated with several demographic and socio-economic characteristics of the ADM2, using bivariate correlations and multivariable regression models. Results The polynomial growth model appeared to best fit the ADM2 epidemic curves, displaying the lowest residual standard error. Each outcome was associated with various regional characteristics in bivariate models, however in stepwise multivariable models only mean education levels were consistently associated with a worse local epidemic. Discussion By combining two common methods—estimation of epidemic parameters using mathematical models, and estimation of associations using ecological regression models—we identified some factors predicting rapid and severe EVD epidemics in West African subnational regions. While care should be taken interpreting such results as anything more than correlational, we suggest that our approach of using data sources that were publicly available in advance of the epidemic or in real-time provides an analytic framework that may assist countries in understanding the dynamics of future outbreaks as they occur. PMID:27732614
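As an illustration of the first step, an early exponential growth rate can be estimated from cumulative case counts by a least-squares line through the log counts. This is only a sketch of the exponential variant under that fitting assumption; the logistic and polynomial models and the authors' actual estimation procedure are not reproduced, the case series is synthetic, and the function name is hypothetical.

```python
import numpy as np


def exponential_growth_rate(days, cumulative_cases):
    """Slope of log(cumulative cases) against time, in units of 1/day."""
    days = np.asarray(days, dtype=float)
    cases = np.asarray(cumulative_cases, dtype=float)
    if np.any(cases <= 0):
        raise ValueError("cumulative cases must be positive to take logs")
    slope, _intercept = np.polyfit(days, np.log(cases), 1)
    return slope


if __name__ == "__main__":
    t = np.arange(10)
    synthetic = 5.0 * np.exp(0.2 * t)      # synthetic series, not study data
    r = exponential_growth_rate(t, synthetic)
    print(r, np.log(2) / r)                # growth rate and doubling time in days
```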
NASA Astrophysics Data System (ADS)
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-07-01
Soil temperature (T_s) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to scarcity of the T_s data, estimation of soil temperature is an important issue in different fields of sciences. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating the T_s. For this aim, the monthly mean data of the T_s (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (T_min, T_max, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (R_s, R_n, R_a); precipitation (P); relative humidity (RH); wind speed at 2 m height (u_2); and water vapor pressure (V_p) were used as input variables. Three error statistics including root-mean-square-error (RMSE), mean absolute error (MAE), and determination coefficient (R²) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for T_max, T_min, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R² = 0.995) and for RH, V_p, P, and u_2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R² = 0.996), respectively.
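The three error statistics named above can be computed as in the sketch below. The determination coefficient is taken here as 1 - SS_res/SS_tot; the abstract does not say whether that definition or the squared correlation was used, so this choice is an assumption, and the function name and sample values are hypothetical.

```python
import numpy as np


def error_statistics(observed, predicted):
    """Return (RMSE, MAE, R^2) for paired observed and predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    resid = obs - pred
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return rmse, mae, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up soil temperatures in degrees Celsius, for illustration only.
    print(error_statistics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0]))
```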
Chung, Ho Seok; Hwang, Eu Chang; Yu, Ho Song; Jung, Seung Il; Lee, Sun Ju; Lim, Dong Hoon; Cho, Won Jin; Choe, Hyun Sop; Lee, Seung-Ju; Park, Sung Woon
2018-03-01
To estimate the prevalence of fluoroquinolone-resistant rectal flora in patients undergoing transrectal ultrasound-guided prostate needle biopsy and to identify the high-risk groups. From January 2015 to March 2016, rectal swabs of 557 men who underwent transrectal ultrasound-guided prostate needle biopsy were obtained from five institutions. Clinical variables, including demographics, rectal swab culture results and infectious complications, were evaluated. Univariable and multivariable analyses were used to identify the risk factors for fluoroquinolone resistance of rectal flora and infectious complications. The incidence of fluoroquinolone resistance and of extended-spectrum beta-lactamase production was 48.1% and 11.8%, respectively. The most common fluoroquinolone-resistant bacterium was Escherichia coli (81% of total fluoroquinolone-resistant bacteria, 39% of total rectal flora), and 16 (2.9%) patients had infectious complications. Univariable and multivariable analysis of clinical parameters affecting fluoroquinolone resistance showed no factor associated with fluoroquinolone resistance of rectal flora. The clinical parameter related to infectious complications after prostate biopsy was a history of operation within 6 months (relative risk 6.60; 95% confidence interval 1.99-21.8, P = 0.002). These findings suggest that a risk-based approach by history taking cannot predict antibiotic resistance of rectal flora, and physicians should consider targeted antibiotic prophylaxis or extended antibiotic prophylaxis for Korean patients undergoing transrectal ultrasound-guided prostate biopsy because of high antibiotic resistance of rectal flora. © 2017 The Japanese Urological Association.
Sensitivity Analysis in Sequential Decision Models.
Chen, Qiushi; Ayer, Turgay; Chhatwal, Jagpreet
2017-02-01
Sequential decision problems are frequently encountered in medical decision making, which are commonly solved using Markov decision processes (MDPs). Modeling guidelines recommend conducting sensitivity analyses in decision-analytic models to assess the robustness of the model results against the uncertainty in model parameters. However, standard methods of conducting sensitivity analyses cannot be directly applied to sequential decision problems because this would require evaluating all possible decision sequences, typically in the order of trillions, which is not practically feasible. As a result, most MDP-based modeling studies do not examine confidence in their recommended policies. In this study, we provide an approach to estimate uncertainty and confidence in the results of sequential decision models. First, we provide a probabilistic univariate method to identify the most sensitive parameters in MDPs. Second, we present a probabilistic multivariate approach to estimate the overall confidence in the recommended optimal policy considering joint uncertainty in the model parameters. We provide a graphical representation, which we call a policy acceptability curve, to summarize the confidence in the optimal policy by incorporating stakeholders' willingness to accept the base case policy. For a cost-effectiveness analysis, we provide an approach to construct a cost-effectiveness acceptability frontier, which shows the most cost-effective policy as well as the confidence in that for a given willingness to pay threshold. We demonstrate our approach using a simple MDP case study. We developed a method to conduct sensitivity analysis in sequential decision models, which could increase the credibility of these models among stakeholders.
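One simplified way to picture the multivariate step is Monte Carlo sampling of the uncertain parameters, re-solving the MDP for each draw, and reporting how often the base-case policy stays optimal. The sketch below does only that; it does not reproduce the policy acceptability curve or the cost-effectiveness acceptability frontier, and `value_iteration`, `policy_confidence` and the `sample_mdp` callback are hypothetical names introduced for illustration.

```python
import numpy as np


def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Optimal policy for transition array P (A, S, S) and reward array R (A, S)."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)           # action values, shape (A, S)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=0)       # best action index per state
        V = V_new


def policy_confidence(sample_mdp, base_policy, n_samples=200, seed=0):
    """Fraction of parameter draws for which `base_policy` remains optimal."""
    rng = np.random.default_rng(seed)
    agree = 0
    for _ in range(n_samples):
        P, R = sample_mdp(rng)            # one joint draw of all model parameters
        if np.array_equal(value_iteration(P, R), base_policy):
            agree += 1
    return agree / n_samples


if __name__ == "__main__":
    P = np.stack([np.eye(2), np.eye(2)])  # both actions keep the current state

    def sample_mdp(rng):
        # Only the reward of action 1 in state 0 is uncertain in this toy example.
        R = np.array([[0.0, 0.0], [rng.normal(0.5, 1.0), -1.0]])
        return P, R

    print(policy_confidence(sample_mdp, np.array([1, 0])))
```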
Schumacher, Carsten; Eismann, Hendrik; Sieg, Lion; Friedrich, Lars; Scheinichen, Dirk; Vondran, Florian W R; Johanning, Kai
2018-01-01
Liver transplantation is a complex intervention, and early anticipation of personnel and logistic requirements is of great importance. Early identification of high-risk patients could prove useful. We therefore evaluated prognostic values of recipient parameters commonly available in the early preoperative stage regarding postoperative 30- and 90-day outcomes and intraoperative transfusion requirements in liver transplantation. All adult patients undergoing first liver transplantation at Hannover Medical School between January 2005 and December 2010 were included in this retrospective study. Demographic, clinical, and laboratory data as well as clinical courses were recorded. Prognostic values regarding 30- and 90-day outcomes were evaluated by uni- and multivariate statistical tests. Identified risk parameters were used to calculate risk scores. There were 426 patients (40.4% female) included with a mean age of 48.6 (11.9) years. Absolute 30-day mortality rate was 9.9%, and absolute 90-day mortality rate was 13.4%. Preoperative leukocyte count >5200/μL, platelet count <91 000/μL, and creatinine values ≥77 μmol/L were relevant risk factors for both observation periods (P < .05, respectively). A score based on these factors significantly differentiated between groups of varying postoperative outcomes and intraoperative transfusion requirements (P < .05, respectively). A score based on preoperative creatinine, leukocyte, and platelet values allowed early estimation of postoperative 30- and 90-day outcomes and intraoperative transfusion requirements in liver transplantation. Results might help to improve timely logistic and personnel strategies.
Seresht, L. Mousavi; Golparvar, Mohammad; Yaraghi, Ahmad
2014-01-01
Background: Appropriate determination of tidal volume (VT) is important for preventing ventilation-induced lung injury. We compared hemodynamic and respiratory parameters in two conditions of receiving VTs calculated by using body weight (BW), which was estimated by measured height (HBW) or demi-span based body weight (DBW). Materials and Methods: This controlled trial was conducted in St. Alzahra Hospital in 2009 on American Society of Anesthesiologists (ASA) I and II, 18-65-year-old patients. Standing height and weight were measured and then height was calculated using the demi-span method. BW and VT were calculated with the acute respiratory distress syndrome-net formula. Patients were randomized and then crossed to receive ventilation with both calculated VTs for 20 min. Hemodynamic and respiratory parameters were analyzed with SPSS version 20.0 using univariate and multivariate analyses. Results: Forty-nine patients were studied. Demi-span based body weight and thus VT (DTV) were lower than Height based body weight and VT (HTV) (P = 0.028), in male patients (P = 0.005). Difference was observed in peak airway pressure (PAP) and airway resistance (AR) changes with higher PAP and AR at 20 min after receiving HTV compared with DTV. Conclusions: Estimated VT based on measured height is higher than that based on demi-span and this difference exists only in females, and this higher VT results in higher airway pressures during mechanical ventilation. PMID:24627845
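The height-based calculation can be sketched with the widely used ARDSnet predicted body weight formula: 50 kg for males or 45.5 kg for females, plus 0.91 kg per cm of height above 152.4 cm. The 6 mL/kg default below is an assumption, since the abstract does not state which mL/kg target was used, and the demi-span to height conversion is not shown. The function names are hypothetical.

```python
def predicted_body_weight(height_cm: float, sex: str) -> float:
    """ARDSnet predicted body weight in kg from height in cm."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    base = 50.0 if sex == "male" else 45.5
    return base + 0.91 * (height_cm - 152.4)


def tidal_volume_ml(height_cm: float, sex: str, ml_per_kg: float = 6.0) -> float:
    """Tidal volume in mL; `ml_per_kg` is an assumed target, not from the abstract."""
    return ml_per_kg * predicted_body_weight(height_cm, sex)


if __name__ == "__main__":
    print(predicted_body_weight(175.0, "male"))
    print(tidal_volume_ml(175.0, "male"))
```

A shorter height estimate, such as one derived from demi-span, lowers the predicted body weight and therefore the tidal volume in direct proportion.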
Larrosa, José Manuel; Moreno-Montañés, Javier; Martinez-de-la-Casa, José María; Polo, Vicente; Velázquez-Villoria, Álvaro; Berrozpe, Clara; García-Granero, Marta
2015-10-01
The purpose of this study was to develop and validate a multivariate predictive model to detect glaucoma by using a combination of retinal nerve fiber layer (RNFL), retinal ganglion cell-inner plexiform (GCIPL), and optic disc parameters measured using spectral-domain optical coherence tomography (OCT). Five hundred eyes from 500 participants and 187 eyes of another 187 participants were included in the study and validation groups, respectively. Patients with glaucoma were classified in five groups based on visual field damage. Sensitivity and specificity of all glaucoma OCT parameters were analyzed. Receiver operating characteristic curves (ROC) and areas under the ROC (AUC) were compared. Three predictive multivariate models (quantitative, qualitative, and combined) that used a combination of the best OCT parameters were constructed. A diagnostic calculator was created using the combined multivariate model. The best AUC parameters were: inferior RNFL, average RNFL, vertical cup/disc ratio, minimal GCIPL, and inferior-temporal GCIPL. Comparisons among the parameters did not show that the GCIPL parameters were better than those of the RNFL in early and advanced glaucoma. The highest AUC was in the combined predictive model (0.937; 95% confidence interval, 0.911-0.957) and was significantly (P = 0.0001) higher than the other isolated parameters considered in early and advanced glaucoma. The validation group displayed similar results to those of the study group. Best GCIPL, RNFL, and optic disc parameters showed a similar ability to detect glaucoma. The combined predictive formula improved the glaucoma detection compared to the best isolated parameters evaluated. The diagnostic calculator obtained good classification from participants in both the study and validation groups.
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
TORABIPOUR, Amin; ZERAATI, Hojjat; ARAB, Mohammad; RASHIDIAN, Arash; AKBARI SARI, Ali; SARZAIEM, Mahmuod Reza
2016-01-01
Background: To determine the number of hospital beds required in cardiac surgery departments using a stochastic simulation approach. Methods: This study was performed from Mar 2011 to Jul 2012 in three phases: first, data were collected from 649 patients in the cardiac surgery departments of two large teaching hospitals (in Tehran, Iran); second, a statistical analysis was carried out and a multivariate linear regression model was formulated to determine the factors that affect patients' length of stay; third, a stochastic simulation system (from admission to discharge) was developed based on key parameters to estimate the required bed capacity. Results: The current cardiac surgery department with 33 beds can admit patients on only 90.7% of days (4535 d) and will require more than 33 beds on only 9.3% of days (efficient cut-off point). According to the simulation, the studied cardiac surgery department will require 41–52 beds to admit all patients over the next 12 years. Finally, a one-day reduction in length of stay leads to a decrease in the need for hospital beds of two beds annually. Conclusion: Variation in length of stay and the factors affecting it can affect the number of beds required. Statistical and stochastic simulation models are applicable and useful methods for estimating and managing hospital beds based on key hospital parameters. PMID:27957466
Combining markers with and without the limit of detection
Dong, Ting; Liu, Catherine Chunling; Petricoin, Emanuel F.; Tang, Liansheng Larry
2014-01-01
In this paper, we consider the combination of markers with and without the limit of detection (LOD). LOD is often encountered when measuring proteomic markers. Because of the limited detecting ability of an equipment or instrument, it is difficult to measure markers at a relatively low level. Suppose that after some monotonic transformation, the marker values approximately follow multivariate normal distributions. We propose to estimate distribution parameters while taking the LOD into account, and then combine markers using the results from the linear discriminant analysis. Our simulation results show that the ROC curve parameter estimates generated from the proposed method are much closer to the truth than simply using the linear discriminant analysis to combine markers without considering the LOD. In addition, we propose a procedure to select and combine a subset of markers when many candidate markers are available. The procedure based on the correlation among markers is different from a common understanding that a subset of the most accurate markers should be selected for the combination. The simulation studies show that the accuracy of a combined marker can be largely impacted by the correlation of marker measurements. Our methods are applied to a protein pathway dataset to combine proteomic biomarkers to distinguish cancer patients from non-cancer patients. PMID:24132938
Geiser, Christian; Bishop, Jacob; Lockhart, Ginger; Shiffman, Saul; Grenard, Jerry L.
2013-01-01
Latent state-trait (LST) and latent growth curve (LGC) models are frequently used in the analysis of longitudinal data. Although it is well-known that standard single-indicator LGC models can be analyzed within either the structural equation modeling (SEM) or multilevel (ML; hierarchical linear modeling) frameworks, few researchers realize that LST and multivariate LGC models, which use multiple indicators at each time point, can also be specified as ML models. In the present paper, we demonstrate that using the ML-SEM rather than the SL-SEM framework to estimate the parameters of these models can be practical when the study involves (1) a large number of time points, (2) individually-varying times of observation, (3) unequally spaced time intervals, and/or (4) incomplete data. Despite the practical advantages of the ML-SEM approach under these circumstances, there are also some limitations that researchers should consider. We present an application to an ecological momentary assessment study (N = 158 youths with an average of 23.49 observations of positive mood per person) using the software Mplus (Muthén and Muthén, 1998–2012) and discuss advantages and disadvantages of using the ML-SEM approach to estimate the parameters of LST and multiple-indicator LGC models. PMID:24416023
Bolduc, David L; Bünger, Rolf; Moroni, Maria; Blakely, William F
2016-12-01
Multiple hematological biomarkers (i.e. complete blood counts and serum chemistry parameters) were used in a multivariate linear-regression fit to create predictive algorithms for estimating the severity of hematopoietic acute radiation syndrome (H-ARS) using two different species (i.e. Göttingen Minipig and non-human primate (NHP) (Macaca mulatta)). Biomarker data were analyzed prior to irradiation and between 1-60 days (minipig) and 1-30 days (NHP) after irradiation exposures of 1.6-3.5 Gy (minipig) and 6.5 Gy (NHP) ⁶⁰Co gamma ray doses at 0.5-0.6 Gy min⁻¹ and 0.4 Gy min⁻¹, respectively. Fitted radiation risk and injury categorization (RRIC) values and RRIC prediction percent accuracies were compared between the two models. Both models estimated H-ARS severity with over 80% overall predictive power and with receiver operating characteristic curve area values of 0.884 and 0.825. These results based on two animal radiation models support the concept for the use of a hematopoietic-based algorithm for predicting the risk of H-ARS in humans. Published by Oxford University Press 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Kaneoke, Y; Urakawa, T; Kakigi, R
2009-05-19
We investigated whether direction information is represented in the population-level neural response evoked by the visual motion stimulus, as measured by magnetoencephalography. Coherent motions with varied speed, varied direction, and different coherence level were presented using random dot kinematography. Peak latency of responses to motion onset was inversely related to speed in all directions, as previously reported, but no significant effect of direction on latency changes was identified. Mutual information entropy (IE) calculated using four-direction response data increased significantly (>2.14) after motion onset in 41.3% of response data and maximum IE was distributed at approximately 20 ms after peak response latency. When response waveforms showing significant differences (by multivariate discriminant analysis) in distribution of the three waveform parameters (peak amplitude, peak latency, and 75% waveform width) with stimulus directions were analyzed, 87 waveform stimulus directions (80.6%) were correctly estimated using these parameters. Correct estimation rate was unaffected by stimulus speed, but was affected by coherence level, even though both speed and coherence affected response amplitude similarly. Our results indicate that speed and direction of stimulus motion are represented in the distinct properties of a response waveform, suggesting that the human brain processes speed and direction separately, at least in part.
Pastore, Francesco; Conson, Manuel; D'Avino, Vittoria; Palma, Giuseppe; Liuzzi, Raffaele; Solla, Raffaele; Farella, Antonio; Salvatore, Marco; Cella, Laura; Pacelli, Roberto
2016-01-01
Severe acute radiation-induced skin toxicity (RIST) after breast irradiation is a side effect impacting the quality of life in breast cancer (BC) patients. The aim of the present study was to develop normal tissue complication probability (NTCP) models of severe acute RIST in BC patients. We evaluated 140 consecutive BC patients undergoing conventional three-dimensional conformal radiotherapy (3D-CRT) after breast conserving surgery in a prospective study assessing acute RIST. The acute RIST was classified according to the RTOG scoring system. Dose-surface histograms (DSHs) of the body structure in the breast region were extracted as representative of skin irradiation. Patient, disease, and treatment-related characteristics were analyzed along with DSHs. NTCP modeling by Lyman-Kutcher-Burman (LKB) and by multivariate logistic regression using bootstrap resampling techniques was performed. Models were evaluated by Spearman's Rs coefficient and ROC area. By the end of radiotherapy, 139 (99%) patients developed any degree of acute RIST. G3 RIST was found in 11 of 140 (8%) patients. Mild-moderate (G1-G2) RIST was still present at 40 days after treatment in six (4%) patients. Using DSHs for LKB modeling of acute RIST severity (RTOG G3 vs. G0-2), parameter estimates were TD50=39 Gy, n=0.38 and m=0.14 [Rs = 0.25, area under the curve (AUC) = 0.77, p = 0.003]. On multivariate analysis, the most predictive model of acute RIST severity was a two-variable model including the skin receiving ≥30 Gy (S30) and psoriasis [Rs = 0.32, AUC = 0.84, p < 0.001]. Using body DSH as representative of skin dose, the LKB n parameter was consistent with a surface effect for the skin. A good prediction performance was obtained using a data-driven multivariate model including S30 and a pre-existing skin disease (psoriasis) as a clinical factor.
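The LKB parameter estimates reported above (TD50 = 39 Gy, n = 0.38, m = 0.14) can be plugged into the standard Lyman-Kutcher-Burman formulation as a minimal sketch. The function names and the representation of the dose-surface histogram as parallel lists of dose bins and surface fractions (a differential histogram) are assumptions for illustration; this is not the authors' code.

```python
import math


def generalized_eud(doses_gy, fractions, n):
    """Generalized equivalent uniform dose from a differential histogram.

    doses_gy: dose of each bin (Gy); fractions: relative surface in each bin
    (normalized here so they sum to 1); n: LKB volume/surface-effect parameter.
    """
    total = sum(fractions)
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses_gy, fractions)) ** n


def lkb_ntcp(doses_gy, fractions, td50=39.0, n=0.38, m=0.14):
    """LKB complication probability: standard normal CDF of (gEUD - TD50) / (m * TD50)."""
    eud = generalized_eud(doses_gy, fractions, n)
    t = (eud - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical three-bin dose-surface histogram, purely illustrative.
    print(lkb_ntcp([10.0, 30.0, 50.0], [0.5, 0.3, 0.2]))
```

By construction the model returns a probability of 0.5 when the generalized equivalent uniform dose equals TD50, and m controls how steeply the probability rises around that point.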
Time-varying Concurrent Risk of Extreme Droughts and Heatwaves in California
NASA Astrophysics Data System (ADS)
Sarhadi, A.; Diffenbaugh, N. S.; Ausin, M. C.
2016-12-01
Anthropogenic global warming has changed the nature and the risk of extreme climate phenomena such as droughts and heatwaves. The concurrence of these nature-changing climatic extremes may intensify undesirable consequences for human health and destructive effects on water resources. The present study assesses the risk of concurrent extreme droughts and heatwaves under dynamic nonstationary conditions arising from climate change in California. To do so, a generalized, fully Bayesian, time-varying multivariate risk framework is proposed that evolves through time under a dynamic, human-induced environment. In this methodology, an extreme, Bayesian, dynamic copula (Gumbel) is developed to model the time-varying dependence structure between the two different climate extremes. The time-varying extreme marginals are first modeled using a Generalized Extreme Value (GEV) distribution. Bayesian Markov Chain Monte Carlo (MCMC) inference is integrated to estimate the parameters of the nonstationary marginals and copula using a Gibbs sampling method. The modelled marginals and copula are then used to develop a fully Bayesian, time-varying joint return period concept for the estimation of concurrent risk. Here we argue that climate change has increased the chance of concurrent droughts and heatwaves over decades in California. It is also demonstrated that a time-varying multivariate perspective should be incorporated to assess the realistic concurrent risk of the extremes for water resources planning and management in a changing climate in this area. The proposed generalized methodology can be applied to other stochastic nature-changing compound climate extremes that are under the influence of climate change.
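The joint return period idea can be sketched for a single time slice with a Gumbel copula. This is a simplified, stationary illustration: the abstract's framework lets the marginal and copula parameters vary in time and estimates them by MCMC, which is not reproduced here. The "AND" definition of the joint return period (both variables exceed their thresholds) is an assumption, as are the function names; `u` and `v` stand for the marginal non-exceedance probabilities (e.g. from fitted GEV distributions) and must lie strictly between 0 and 1.

```python
import math


def gumbel_copula(u: float, v: float, theta: float) -> float:
    """Gumbel copula C(u, v); theta >= 1, with theta = 1 giving independence."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_period(u: float, v: float, theta: float,
                        mean_interarrival_years: float = 1.0) -> float:
    """'AND' joint return period: expected time until both thresholds are exceeded."""
    p_both_exceed = 1.0 - u - v + gumbel_copula(u, v, theta)
    return mean_interarrival_years / p_both_exceed


if __name__ == "__main__":
    # Two 10-year marginal events (u = v = 0.9) under increasing dependence.
    for theta in (1.0, 1.5, 2.0, 3.0):
        print(theta, round(joint_return_period(0.9, 0.9, theta), 1))
```

Stronger upper-tail dependence (larger theta) shortens the joint return period, which is why ignoring dependence between droughts and heatwaves would understate the concurrent risk.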
Chen, Szu-Chia; Lin, Tsung-Hsien; Hsu, Po-Chao; Chang, Jer-Ming; Lee, Chee-Siong; Tsai, Wei-Chung; Su, Ho-Ming; Voon, Wen-Chol; Chen, Hung-Chun
2011-09-01
Heart failure and increased arterial stiffness are associated with declining renal function. Few studies have evaluated the association between left ventricular ejection fraction (LVEF) and brachial-ankle pulse-wave velocity (baPWV) and renal function progression. The aim of this study was to assess whether LVEF<40% and baPWV are associated with a decline in the estimated glomerular filtration rate (eGFR) and the progression to a renal end point of ≥25% decline in eGFR. This longitudinal study included 167 patients. The baPWV was measured with an ankle-brachial index-form device. The change in renal function was estimated by eGFR slope. The renal end point was defined as ≥25% decline in eGFR. Clinical and echocardiographic parameters were compared and analyzed. After a multivariate analysis, serum hematocrit was positively associated with eGFR slope, and diabetes mellitus, baPWV (P=0.031) and LVEF<40% (P=0.001) were negatively associated with eGFR slope. Forty patients reached the renal end point. Multivariate, forward Cox regression analysis found that lower serum albumin and hematocrit levels, higher triglyceride levels, higher baPWV (P=0.039) and LVEF<40% (P<0.001) were independently associated with progression to the renal end point. Our results show that LVEF<40% and increased baPWV are independently associated with renal function decline and progression to the renal end point.
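The two renal outcomes used in this study, the eGFR slope and the end point of a ≥25% decline in eGFR, can be computed as in the following minimal sketch. The ordinary least-squares slope and the choice of the first measurement as baseline are assumptions; the abstract does not specify how either was operationalized, and the function names are hypothetical.

```python
def egfr_slope(times_years, egfr_values):
    """Ordinary least-squares slope of eGFR against time (units per year)."""
    n = len(times_years)
    if n < 2 or n != len(egfr_values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times_years) / n
    mean_y = sum(egfr_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_years, egfr_values))
    return sxy / sxx


def reached_renal_end_point(egfr_values, threshold=0.25):
    """True if any follow-up eGFR is at least 25% below the first (baseline) value."""
    baseline = egfr_values[0]
    return any((baseline - y) / baseline >= threshold for y in egfr_values[1:])


if __name__ == "__main__":
    print(egfr_slope([0.0, 1.0, 2.0], [60.0, 54.0, 48.0]))
    print(reached_renal_end_point([60.0, 54.0, 48.0]))
```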
Dual adaptive control: Design principles and applications
NASA Technical Reports Server (NTRS)
Mookerjee, Purusottam
1988-01-01
The design of an actively adaptive dual controller based on an approximation of the stochastic dynamic programming equation for a multi-step horizon is presented. A dual controller that can enhance identification of the system while controlling it at the same time is derived for multi-dimensional problems. This dual controller uses sensitivity functions of the expected future cost with respect to the parameter uncertainties. A passively adaptive cautious controller and the actively adaptive dual controller are examined. In many instances, the cautious controller is seen to turn off while the latter avoids the turn-off of the control and the slow convergence of the parameter estimates, characteristic of the cautious controller. The algorithms have been applied to a multi-variable static model which represents a simplified linear version of the relationship between the vibration output and the higher harmonic control input for a helicopter. Monte Carlo comparisons based on parametric and nonparametric statistical analysis indicate the superiority of the dual controller over the baseline controller.
NASA Astrophysics Data System (ADS)
Thelen, Brian J.; Xique, Ismael J.; Burns, Joseph W.; Goley, G. Steven; Nolan, Adam R.; Benson, Jonathan W.
2017-04-01
In Bayesian decision theory, there has been a great amount of research into theoretical frameworks and information-theoretic quantities that can be used to provide lower and upper bounds for the Bayes error. These include well-known bounds such as Chernoff, Bhattacharyya, and J-divergence. Part of the challenge of utilizing these various metrics in practice is (i) whether they are "loose" or "tight" bounds, (ii) how they might be estimated via either parametric or non-parametric methods, and (iii) how accurate the estimates are for limited amounts of data. In general what is desired is a methodology for generating relatively tight lower and upper bounds, and then an approach to estimate these bounds efficiently from data. In this paper, we explore the so-called triangle divergence, which has been around for a while but was made more prominent in recent research on non-parametric estimation of information metrics. Part of this work is motivated by applications for quantifying fundamental information content in SAR/LIDAR data, and to help in this, we have developed a flexible multivariate modeling framework based on multivariate Gaussian copula models which can be combined with the triangle divergence framework to quantify this information and provide approximate bounds on Bayes error. In this paper we present an overview of the bounds, including those based on triangle divergence, and verify that under a number of multivariate models, the upper and lower bounds derived from triangle divergence are significantly tighter than the other common bounds, often dramatically so. We also propose some simple but effective means for computing the triangle divergence using Monte Carlo methods, and then discuss estimation of the triangle divergence from empirical data based on Gaussian copula models.
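As a point of reference for the "common bounds" mentioned above, the classical Bhattacharyya bounds on the two-class Bayes error can be sketched for univariate Gaussian classes. This illustrates only the well-known Bhattacharyya bounds, not the triangle-divergence bounds studied in the paper, whose formulas are not given in the abstract. The function names are hypothetical.

```python
import math


def bhattacharyya_coefficient(mu1, sigma1, mu2, sigma2):
    """Bhattacharyya coefficient between two univariate Gaussian densities."""
    var_sum = sigma1 ** 2 + sigma2 ** 2
    distance = ((mu1 - mu2) ** 2) / (4.0 * var_sum) \
        + 0.5 * math.log(var_sum / (2.0 * sigma1 * sigma2))
    return math.exp(-distance)


def bayes_error_bounds(bc, prior1=0.5):
    """Lower and upper bounds on the two-class Bayes error from the coefficient bc."""
    prior2 = 1.0 - prior1
    upper = math.sqrt(prior1 * prior2) * bc
    inner = max(0.0, 1.0 - 4.0 * prior1 * prior2 * bc ** 2)
    lower = 0.5 * (1.0 - math.sqrt(inner))
    return lower, upper


if __name__ == "__main__":
    bc = bhattacharyya_coefficient(0.0, 1.0, 2.0, 1.0)
    lower, upper = bayes_error_bounds(bc)
    # Exact Bayes error for equal priors and equal variances: Phi(-|mu1 - mu2| / (2 sigma)).
    exact = 0.5 * (1.0 + math.erf(-1.0 / math.sqrt(2.0)))
    print(lower, exact, upper)
```

In this equal-variance example the exact error sits well inside the interval, which shows how loose these bounds can be and why tighter alternatives are of interest.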
Oliveras, Anna; Armario, Pedro; Martell-Clarós, Nieves; Ruilope, Luis M; de la Sierra, Alejandro
2011-03-01
Microalbuminuria is a known marker of subclinical organ damage. Its prevalence is higher in patients with resistant hypertension than in subjects with blood pressure at goal. On the other hand, some patients with apparently well-controlled hypertension still have microalbuminuria. The current study aimed to determine the relationship between microalbuminuria and both office and 24-hour ambulatory blood pressure. A cohort of 356 patients (mean age 64 ± 11 years; 40.2% females) with resistant hypertension (blood pressure ≥ 140 and/or 90 mm Hg despite treatment with ≥ 3 drugs, diuretic included) were selected from Spanish hypertension units. Patients with estimated glomerular filtration rate <30 mL/min/1.73 m² were excluded. All patients underwent clinical and demographic evaluation, complete laboratory analyses, and good technical-quality 24-hour ambulatory blood pressure monitoring. Urinary albumin/creatinine ratio was averaged from 3 first-morning void urine samples. Microalbuminuria (urinary albumin/creatinine ratio ≥ 2.5 mg/mmol in males or ≥ 3.5 mg/mmol in females) was detected in 46.6%, and impaired renal function (estimated glomerular filtration rate <60 mL/min/1.73 m²) was detected in 26.8%. Bivariate analyses showed significant associations of microalbuminuria with older age, reduced estimated glomerular filtration rate, increased nighttime systolic blood pressure, and elevated daytime, nighttime, and 24-hour diastolic blood pressure. In a logistic regression analysis, after age and sex adjustment, elevated nighttime systolic blood pressure (multivariate odds ratio, 1.014 [95% CI, 1.001 to 1.026]; P=0.029) and reduced estimated glomerular filtration rate (multivariate odds ratio, 2.79 [95% CI, 1.57 to 4.96]; P=0.0005) were independently associated with the presence of microalbuminuria.
We conclude that microalbuminuria is better associated with increased nighttime systolic blood pressure than with any other office and 24-hour ambulatory blood pressure monitoring parameters.
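The microalbuminuria definition used in this study (albumin/creatinine ratio averaged over three first-morning samples, with sex-specific cut-offs of 2.5 and 3.5 mg/mmol) translates into a short classification rule. The sketch below is only an illustration of that definition; the function name and input layout are hypothetical.

```python
def has_microalbuminuria(acr_samples_mg_per_mmol, sex):
    """Apply the study's definition to the mean of the urine samples."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    if not acr_samples_mg_per_mmol:
        raise ValueError("at least one sample is required")
    mean_acr = sum(acr_samples_mg_per_mmol) / len(acr_samples_mg_per_mmol)
    threshold = 2.5 if sex == "male" else 3.5
    return mean_acr >= threshold


if __name__ == "__main__":
    samples = [2.0, 3.0, 2.8]  # hypothetical values in mg/mmol
    print(has_microalbuminuria(samples, "male"), has_microalbuminuria(samples, "female"))
```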
Riley, Richard D; Elia, Eleni G; Malin, Gemma; Hemming, Karla; Price, Malcolm P
2015-07-30
A prognostic factor is any measure that is associated with the risk of future health outcomes in those with existing disease. Often, the prognostic ability of a factor is evaluated in multiple studies. However, meta-analysis is difficult because primary studies often use different methods of measurement and/or different cut-points to dichotomise continuous factors into 'high' and 'low' groups; selective reporting is also common. We illustrate how multivariate random effects meta-analysis models can accommodate multiple prognostic effect estimates from the same study, relating to multiple cut-points and/or methods of measurement. The models account for within-study and between-study correlations, which utilises more information and reduces the impact of unreported cut-points and/or measurement methods in some studies. The applicability of the approach is improved with individual participant data and by assuming a functional relationship between prognostic effect and cut-point to reduce the number of unknown parameters. The models provide important inferential results for each cut-point and method of measurement, including the summary prognostic effect, the between-study variance and a 95% prediction interval for the prognostic effect in new populations. Two applications are presented. The first reveals that, in a multivariate meta-analysis using published results, the Apgar score is prognostic of neonatal mortality but effect sizes are smaller at most cut-points than previously thought. In the second, a multivariate meta-analysis of two methods of measurement provides weak evidence that microvessel density is prognostic of mortality in lung cancer, even when individual participant data are available so that a continuous prognostic trend is examined (rather than cut-points). © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Sampling effort affects multivariate comparisons of stream assemblages
Cao, Y.; Larsen, D.P.; Hughes, R.M.; Angermeier, P.L.; Patton, T.M.
2002-01-01
Multivariate analyses are used widely for determining patterns of assemblage structure, inferring species-environment relationships and assessing human impacts on ecosystems. The estimation of ecological patterns often depends on sampling effort, so the degree to which sampling effort affects the outcome of multivariate analyses is a concern. We examined the effect of sampling effort on site and group separation, which was measured using a mean similarity method. Two similarity measures, the Jaccard Coefficient and Bray-Curtis Index were investigated with 1 benthic macroinvertebrate and 2 fish data sets. Site separation was significantly improved with increased sampling effort because the similarity between replicate samples of a site increased more rapidly than between sites. Similarly, the faster increase in similarity between sites of the same group than between sites of different groups caused clearer separation between groups. The strength of site and group separation completely stabilized only when the mean similarity between replicates reached 1. These results are applicable to commonly used multivariate techniques such as cluster analysis and ordination because these multivariate techniques start with a similarity matrix. Completely stable outcomes of multivariate analyses are not feasible. Instead, we suggest 2 criteria for estimating the stability of multivariate analyses of assemblage data: 1) mean within-site similarity across all sites compared, indicating sample representativeness, and 2) the SD of within-site similarity across sites, measuring sample comparability.
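The two similarity measures and the mean-similarity criterion described above can be sketched as follows. Samples are represented as dictionaries mapping a taxon name to its count, which is an assumption made for illustration; the Bray-Curtis index is written in its similarity form, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard_coefficient(sample_a, sample_b):
    """Shared taxa divided by total taxa present in either sample."""
    present_a = {sp for sp, n in sample_a.items() if n > 0}
    present_b = {sp for sp, n in sample_b.items() if n > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty")
    return len(present_a & present_b) / len(union)


def bray_curtis_similarity(sample_a, sample_b):
    """Bray-Curtis similarity: 2 * sum of shared abundance / total abundance."""
    species = set(sample_a) | set(sample_b)
    shared = sum(min(sample_a.get(sp, 0), sample_b.get(sp, 0)) for sp in species)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 2.0 * shared / total


def mean_pairwise_similarity(samples, similarity):
    """Mean similarity over all pairs, e.g. replicate samples from one site."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    rep1 = {"taxon1": 4, "taxon2": 0, "taxon3": 6}
    rep2 = {"taxon1": 2, "taxon2": 3, "taxon3": 6}
    print(jaccard_coefficient(rep1, rep2), bray_curtis_similarity(rep1, rep2))
```

Applying `mean_pairwise_similarity` to the replicates of each site gives the within-site similarity whose mean and SD across sites the authors suggest as stability criteria.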
NASA Astrophysics Data System (ADS)
Roldán, J. B.; Miranda, E.; González-Cordero, G.; García-Fernández, P.; Romero-Zaliz, R.; González-Rodelas, P.; Aguilera, A. M.; González, M. B.; Jiménez-Molinos, F.
2018-01-01
A multivariate analysis of the parameters that characterize the reset process in Resistive Random Access Memory (RRAM) has been performed. The different correlations obtained can help to shed light on the current components that contribute in the Low Resistance State (LRS) of the technology considered. In addition, a screening method for the Quantum Point Contact (QPC) current component is presented. For this purpose, the second derivative of the current has been obtained using a novel numerical method which allows determining the QPC model parameters. Once the procedure is completed, a whole Resistive Switching (RS) series of thousands of curves is studied by means of a genetic algorithm. The extracted QPC parameter distributions are characterized in depth to get information about the filamentary pathways associated with LRS in the low voltage conduction regime.
Lee, Sang-Mok; Choi, Hyuk Jin; Choi, Heejin; Kim, Mee Kum; Wee, Won Ryang
2016-10-07
BACKGROUND: Though the development and fitting of scleral contact lenses are expanding steadily, there is no simple method to provide scleral metrics for scleral contact lens fitting yet. The aim of this study was to establish formulae for estimation of the axial radius of curvature (ARC) of the anterior sclera using ocular biometric parameters that can be easily obtained with conventional devices. A semi-automated stitching method and a computational analysis tool for calculating ARC were developed by using the ImageJ and MATLAB software. The ARC of all the ocular surface points were analyzed from the composite horizontal cross-sectional images of the right eyes of 24 volunteers; these measurements were obtained using anterior segment optical coherence tomography for a previous study (AS-OCT; Visante). Ocular biometric parameters were obtained from the same volunteers with slit-scanning topography and partial coherence interferometry. Correlation analysis was performed between the ARC at 8 mm to the axis line (ARC[8]) and other ocular parameters (including age). With ARC obtained on several nasal and temporal points (7.0, 7.5, 8.0, 8.5, and 9.0 mm from the axis line), univariate and multivariate linear regression analyses were performed to develop a model for estimating ARC with the help of ocular biometric parameters. Axial length, spherical equivalent, and angle kappa showed correlations with temporal ARC[8] (tARC[8]; Pearson's r = 0.653, -0.579, and -0.341; P = 0.001, 0.015, and 0.015, respectively). White-to-white corneal diameter (WTW) and anterior chamber depth (ACD) showed correlation with nasal ARC[8] (nARC[8]; Pearson's r = -0.492 and -0.461; P = 0.015 and 0.023, respectively). The formulae for estimating scleral curvatures (tARC, nARC, and average ARC) were developed as a function of axial length, ACD, WTW, and distance from the axis line, with good determinant power (72 - 80 %; SPSS ver. 22.0). 
Angle kappa showed strong correlation with axial length (Pearson's r = -0.813, P <0.001), and the different correlation patterns of nasal and temporal ARC with axial length can be explained by the ocular surface deviation represented by angle kappa. Axial length, ACD, and WTW are useful parameters for estimating the ARC of the anterior sclera, which is important for the haptic design of scleral contact lenses. Angle kappa affects the discrepancies between the nasal and temporal scleral curvature.
Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Hwang, Dea-Seok; Park, Soo-Byung; Son, Woo-Sung
2015-11-01
The purpose of this study was to establish multivariable regression models for the estimation of skeletal maturation status in Japanese boys and girls using the cone-beam computed tomography (CBCT)-based cervical vertebral maturation (CVM) assessment method and hand-wrist radiography. The analyzed sample consisted of hand-wrist radiographs and CBCT images from 47 boys and 57 girls. To quantitatively evaluate the correlation between the skeletal maturation status and measurement ratios, a CBCT-based CVM assessment method was applied to the second, third, and fourth cervical vertebrae. Pearson's correlation coefficient analysis and multivariable regression analysis were used to determine the ratios for each of the cervical vertebrae (p < 0.05). Four characteristic parameters ((OH2 + PH2)/W2, (OH2 + AH2)/W2, D2, AH3/W3), as independent variables, were used to build the multivariable regression models: for the Japanese boys, the skeletal maturation status according to the CBCT-based quantitative cervical vertebral maturation (QCVM) assessment was 5.90 + 99.11 × AH3/W3 - 14.88 × (OH2 + AH2)/W2 + 13.24 × D2; for the Japanese girls, it was 41.39 + 59.52 × AH3/W3 - 15.88 × (OH2 + PH2)/W2 + 10.93 × D2. The CBCT-generated CVM images proved very useful to the definition of the cervical vertebral body and the odontoid process. The newly developed CBCT-based QCVM assessment method showed a high correlation between the derived ratios from the second cervical vertebral body and odontoid process. There are high correlations between the skeletal maturation status and the ratios of the second cervical vertebra based on the remnant of dentocentral synchondrosis.
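The two regression equations quoted in this abstract can be written directly as functions. The coefficients are taken from the abstract as given; the function and argument names are hypothetical, and the scale of the resulting skeletal maturation status is not specified in the abstract, so the output should be read only as the fitted value of the model.

```python
def qcvm_boys(ah3_w3: float, oh2_ah2_w2: float, d2: float) -> float:
    """Fitted skeletal maturation status for Japanese boys (abstract's equation)."""
    return 5.90 + 99.11 * ah3_w3 - 14.88 * oh2_ah2_w2 + 13.24 * d2


def qcvm_girls(ah3_w3: float, oh2_ph2_w2: float, d2: float) -> float:
    """Fitted skeletal maturation status for Japanese girls (abstract's equation)."""
    return 41.39 + 59.52 * ah3_w3 - 15.88 * oh2_ph2_w2 + 10.93 * d2


if __name__ == "__main__":
    # Hypothetical measurement ratios, purely illustrative.
    print(qcvm_boys(ah3_w3=0.6, oh2_ah2_w2=2.5, d2=0.1))
    print(qcvm_girls(ah3_w3=0.6, oh2_ph2_w2=2.5, d2=0.1))
```

Note that the boys' model uses the ratio (OH2 + AH2)/W2 while the girls' model uses (OH2 + PH2)/W2, so the second argument is not interchangeable between the two functions.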
Real-time realizations of the Bayesian Infrasonic Source Localization Method
NASA Astrophysics Data System (ADS)
Pinsky, V.; Arrowsmith, S.; Hofstetter, A.; Nippress, A.
2015-12-01
The Bayesian Infrasonic Source Localization method (BISL), introduced by Modrak et al. (2010) and upgraded by Marcillo et al. (2014), is designed for accurate estimation of the origin of atmospheric events at local, regional and global scales using seismic and infrasonic networks and arrays. BISL is based on probabilistic models of the source-station infrasonic signal propagation time, picking time and azimuth estimate, merged with prior knowledge about the celerity distribution. At each hypothetical source location, it requires integration of the product of the corresponding source-station likelihood functions, multiplied by a prior probability density function of celerity, over the multivariate parameter space. The present BISL realization is a generally time-consuming procedure based on numerical integration. The proposed computational scheme simplifies the target function so that the integrals are taken exactly and are represented via standard functions. This makes the procedure much faster and realizable in real time without practical loss of accuracy. The procedure, executed as Python-Fortran code, demonstrates high performance on a set of model and real data.
Li, Haocheng; Zhang, Yukun; Carroll, Raymond J; Keadle, Sarah Kozey; Sampson, Joshua N; Matthews, Charles E
2017-11-10
A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the proposed model into a multivariate linear mixed model framework for estimation and inference. Via an extension to the EM approach, an efficient algorithm is developed to fit the model. The method is applied to physical activity data, which uses a wearable accelerometer device to measure daily movement and energy expenditure information. Our approach is also evaluated by a simulation study. Copyright © 2017 John Wiley & Sons, Ltd.
Bayesian estimation of a source term of radiation release with approximately known nuclide ratios
NASA Astrophysics Data System (ADS)
Tichý, Ondřej; Šmídl, Václav; Hofman, Radek
2016-04-01
We are concerned with estimation of a source term in the case of an accidental release from a known location, e.g. a power plant. Usually, the source term of an accidental release of radiation comprises a mixture of nuclides. The gamma dose rate measurements do not provide direct information on the source term composition. However, physical properties of the respective nuclides (deposition properties, decay half-life) can be used when uncertain information on nuclide ratios is available, e.g. from a known reactor inventory. The proposed method is based on a linear inverse model in which the observation vector y arises as the product y = Mx of a source-receptor-sensitivity (SRS) matrix M and the source term x. The task is to estimate the unknown source term x. The problem is ill-conditioned and further regularization is needed to obtain a reasonable solution. In this contribution, we assume that the nuclide ratios of the release are known with some degree of uncertainty. This knowledge is used to form the prior covariance matrix of the source term x. Due to uncertainty in the ratios, the diagonal elements of the covariance matrix are considered to be unknown. Positivity of the source term estimate is guaranteed by using a multivariate truncated Gaussian distribution. Following the Bayesian approach, we estimate all parameters of the model from the data so that y, M, and the known ratios are the only inputs of the method. Since inference of the model is intractable, we follow the Variational Bayes method, yielding an iterative algorithm for estimation of all model parameters. Performance of the method is studied on a simulated 6-hour power plant release in which 3 nuclides are released and 2 nuclide ratios are approximately known. A comparison with the method with unknown nuclide ratios will be given to demonstrate the usefulness of the proposed approach.
This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
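The linear inverse problem y = Mx with a positivity constraint, as described in the abstract above, can be illustrated with a much simpler stand-in. The sketch below is not the Variational Bayes algorithm of the abstract: it swaps in ridge-regularized non-negative least squares solved by projected gradient descent, and the matrix M, the true source term and the regularization weight `lam` are all synthetic assumptions.

```python
import numpy as np


def nonneg_ridge(M, y, lam=1e-2, n_iter=5000):
    """Minimise ||y - M x||^2 + lam * ||x||^2 subject to x >= 0.

    Projected gradient descent: take a gradient step, then clip
    negative entries to zero to keep the source term non-negative.
    """
    x = np.zeros(M.shape[1])
    # Lipschitz constant of the gradient gives a safe fixed step size.
    lipschitz = 2.0 * (np.linalg.norm(M, 2) ** 2 + lam)
    for _ in range(n_iter):
        grad = 2.0 * (M.T @ (M @ x - y)) + 2.0 * lam * x
        x = np.maximum(x - grad / lipschitz, 0.0)
    return x


# Synthetic example: 20 measurements, 6 unknown release rates.
rng = np.random.default_rng(0)
M = rng.random((20, 6))                      # stand-in for an SRS matrix
x_true = np.array([0.0, 2.0, 0.0, 1.0, 0.5, 0.0])
y = M @ x_true                               # noise-free observations
x_hat = nonneg_ridge(M, y, lam=1e-6)
```

The ridge term plays the role of the regularization that the ill-conditioned problem needs. In the abstract that role is taken by a prior covariance built from the nuclide ratios, which this sketch does not model.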
Friedrich, Reinhard E.; Schmidt, Kirsten; Treszl, András; Kersten, Jan F.
2016-01-01
Introduction: Surgical procedures require informed patient consent, which is mandatory prior to any procedure. These requirements apply in particular to elective surgical procedures. The communication with the patient about the procedure has to be comprehensive and based on mutual understanding. Furthermore, the informed consent has to take into account whether a patient is of legal age. As a result of large-scale migration, there may be patients scheduled for medical procedures whose chronological age cannot be assessed reliably by physical inspection alone. Age determination based on assessing wisdom tooth development stages can be used to help determine whether individuals involved in medical procedures are of legal age, i.e., responsible and accountable. At present, the assessment of wisdom tooth developmental stages barely allows a crude estimate of an individual’s age. This study explores possibilities for more precise predictions of the age of individuals with emphasis on the legal age threshold of 18 years. Material and Methods: 1,900 dental orthopantomograms (female 938, male 962, age: 15–24 years), taken between the years 2000 and 2013 for diagnosis and treatment of diseases of the jaws, were evaluated. 1,895 orthopantomograms (female 935, male 960) of 1,804 patients (female 872, male 932) met the inclusion criteria. The archives of the Department of Diagnostic Radiology in Dentistry, University Medical Center Hamburg-Eppendorf, and of an oral and maxillofacial office in Rostock, Germany, were used to collect a sufficient number of radiographs. An effort was made to achieve almost equal distribution of age categories in this study group; ‘age’ was given on a particular day. The radiological criteria of lower third molar investigation were: presence and extension of periodontal space, alveolar bone loss, emergence of tooth, and stage of tooth mineralization (according to Demirjian). Univariate and multivariate general linear models were calculated.
Using hierarchical multivariate analyses, a formula was derived quantifying the development of the four parameters of the wisdom tooth over time. This model took repeated measurements of the same persons into account and is only applicable when a person is assessed a second time. The second approach investigates a linear regression model in order to predict the age. In a third approach, a classification and regression tree (CART) was developed to derive cut-off values for the four parameters, resulting in a classification with estimates for sensitivity and specificity. Results: No statistically significant differences were found between parameters related to wisdom tooth localization (right or left side). In univariate analyses, being of legal age was associated with consecutive stages of wisdom tooth development, the obliteration of the periodontal space, and tooth emergence, as well as with alveolar bone loss; no association was found with tooth mineralization. Multivariate models without repeated measurements revealed imprecise estimates because of the unknown individual-related variability. The precision of these models is thus not very good, although it improves with advancing age. In the CART analysis, an area under the receiver operating characteristic curve of 78% was achieved; when maximizing both specificity and sensitivity, a Youden’s index of 47% was achieved (with 73% specificity and 74% sensitivity). Discussion: This study provides a basis to help determine whether a person is 18 years or older in individuals who are assumed to be between 15 and 24 years old. From repeated measurements, we found a linear effect of age on the four parameters in the individuals. However, this information cannot be used for prognosis, because of the large intra-individual variability.
Thus, although the development of the four parameters can be estimated over time, a direct conclusion with regard to age cannot be drawn from the parameters without previous biographic information about a person. While a single parameter is of limited value for calculating the target age of 18 years, combining several findings that can be determined on a standard radiograph may potentially be a more reliable diagnostic tool for estimating the target age in both sexes. However, a high degree of precision cannot be achieved. The reason for persistent uncertainty lies in the wide chronological range of wisdom tooth development, which stretches from well below to above the 18th year of life. The regression approach thus does not seem optimal. Although sensitivity and specificity of the CART model are moderately high, this model is still not reliable as a diagnostic tool. Our findings could have impact, e.g. on elective surgeries for young individuals with unknown biography. However, these results cannot replace social engagement, in particular thorough physical examination of patients and careful registration of their histories. Further studies on the use of this calculation method in different ethnic groups would be desirable. PMID:27975042
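The Youden's index reported in this abstract follows directly from the quoted sensitivity and specificity, since the index is defined as sensitivity + specificity - 1. A minimal check, with a function name chosen here for illustration:

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Values quoted in the abstract: 74% sensitivity, 73% specificity.
j = youden_index(0.74, 0.73)
print(round(j, 2))  # prints 0.47
```

An index of 0 corresponds to a test no better than chance and 1 to a perfect test, which is consistent with the authors' conclusion that the CART model is only moderately discriminating.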
Akumu, Angela Oloo; English, Mike; Scott, J Anthony G; Griffiths, Ulla K
2007-07-01
Haemophilus influenzae type b (Hib) vaccine was introduced into routine immunization services in Kenya in 2001. We aimed to estimate the cost-effectiveness of Hib vaccine delivery. A model was developed to follow the Kenyan 2004 birth cohort until death, with and without Hib vaccine. Incidence of invasive Hib disease was estimated at Kilifi District Hospital and in the surrounding demographic surveillance system in coastal Kenya. National Hib disease incidence was estimated by adjusting incidence observed by passive hospital surveillance using assumptions about access to care. Case fatality rates were also assumed dependent on access to care. A price of US$ 3.65 per dose of pentavalent diphtheria-tetanus-pertussis-hep B-Hib vaccine was used. Multivariate Monte Carlo simulations were performed in order to assess the impact on the cost-effectiveness ratios of uncertainty in parameter values. The introduction of Hib vaccine reduced the estimated incidence of Hib meningitis per 100,000 children aged < 5 years from 71 to 8; of Hib non-meningitic invasive disease from 61 to 7; and of non-bacteraemic Hib pneumonia from 296 to 34. The costs per discounted disability adjusted life year (DALY) and per discounted death averted were US$ 38 (95% confidence interval, CI: 26-63) and US$ 1197 (95% CI: 814-2021) respectively. Most of the uncertainty in the results was due to uncertain access to care parameters. The break-even pentavalent vaccine price--where incremental Hib vaccination costs equal treatment costs averted from Hib disease--was US$ 1.82 per dose. Hib vaccine is a highly cost-effective intervention in Kenya. It would be cost-saving if the vaccine price was below half of its present level.
Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model
Ellefsen, Karl J.; Smith, David
2016-01-01
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.
Ganzer, Roman; Bründl, Johannes; Koch, Daniel; Wieland, Wolf F; Burger, Maximilian; Blana, Andreas
2015-01-01
To determine which pretreatment clinical parameters were predictive of a low prostate-specific antigen (PSA) nadir following high-intensity focused ultrasound (HIFU) treatment. Retrospective study of patients with clinically localised prostate cancer undergoing HIFU at a single centre between December 1997 and September 2009. Whole-gland treatment was applied. Patients were also included if they had previously undergone transurethral resection of the prostate (TURP). TURP was also conducted simultaneously with HIFU. Biochemical failure was based on the Phoenix definition (PSA nadir + 2). Univariate and multivariate analyses of pretreatment clinical parameters were conducted to assess those factors predictive of a PSA nadir ≤0.2 and >0.2 ng/ml. Mean (SD) follow-up was 6.2 (2.8) years; median (range) was 6.3 (1.1-12.2) years. The Kaplan-Meier estimate of the biochemical disease-free survival rate at 8 years was 83% and 48% for patients achieving a PSA nadir of ≤0.2 and >0.2 ng/ml, respectively. Prostate volume and incidental finding of cancer were significant predictors of low PSA nadir (≤0.2 ng/ml). Prostate volume and incidental finding of cancer could be predictors for oncologic success of HIFU based on post-treatment PSA nadir.
Hasegawa, Takumi; Tachibana, Akira; Takeda, Daisuke; Iwata, Eiji; Arimoto, Satomi; Sakakibara, Akiko; Akashi, Masaya; Komori, Takahide
2016-12-01
The relationship between radiographic findings and the occurrence of oroantral perforation is controversial. Few studies have quantitatively analyzed the risk factors contributing to oroantral perforation, and no study has reported multivariate analysis of the relationship(s) between these various factors. This retrospective study aims to fill this void. Various risk factors for oroantral perforation during maxillary third molar extraction were investigated by univariate and multivariate analysis. The proximity of the roots to the maxillary sinus floor (root-sinus [RS] classification) was assessed using panoramic radiography and classified as types 1-5. The relationship between the maxillary second and third molars was classified according to a modified version of the Archer classification. The relative depth of the maxillary third molar in the bone was classified as class A-C, and its angulation relative to the long axis of the second molar was also recorded. Performance of an incision (OR 5.16), mesioangular tooth angulation (OR 6.05), and type 3 RS classification (i.e., significant superimposition of the roots of all posterior maxillary teeth with the sinus floor; OR 10.18) were all identified as risk factors with significant association to an outcome of oroantral perforation. To our knowledge, this is the first multivariate analysis of the risk factors for oroantral perforation during surgical extraction of the maxillary third molar. This RS classification may offer a new predictive parameter for estimating the risk of oroantral perforation.
Hot spots of multivariate extreme anomalies in Earth observations
NASA Astrophysics Data System (ADS)
Flach, M.; Sippel, S.; Bodesheim, P.; Brenning, A.; Denzler, J.; Gans, F.; Guanche, Y.; Reichstein, M.; Rodner, E.; Mahecha, M. D.
2016-12-01
Anomalies in Earth observations might indicate data quality issues, extremes or the change of underlying processes within a highly multivariate system. Thus, considering the multivariate constellation of variables for extreme detection yields crucial additional information over conventional univariate approaches. We highlight areas in which multivariate extreme anomalies are more likely to occur, i.e. hot spots of extremes in global atmospheric Earth observations that impact the Biosphere. In addition, we present the year of the most unusual multivariate extreme between 2001 and 2013 and show that these coincide with well-known high-impact extremes. Technically speaking, we account for multivariate extremes by using three sophisticated algorithms adapted from computer science applications, namely an ensemble of the k-nearest neighbours mean distance, a kernel density estimation and an approach based on recurrences. However, the impact of atmosphere extremes on the Biosphere might largely depend on what is considered to be normal, i.e. the shape of the mean seasonal cycle and its inter-annual variability. We identify regions with similar mean seasonality by means of dimensionality reduction in order to estimate in each region both the 'normal' variance and robust thresholds for detecting the extremes. In addition, we account for challenges like heteroscedasticity in Northern latitudes. Apart from hot spot areas, anomalies in the atmosphere time series that can be detected only by a multivariate approach, and not by a simple univariate approach, are of particular interest. Such an anomalous constellation of atmosphere variables is of interest if it impacts the Biosphere. The multivariate constellation of such an anomalous part of a time series is shown in one case study, indicating that multivariate anomaly detection can provide novel insights into Earth observations.
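One of the three detectors named above, the k-nearest neighbours mean distance, can be sketched in a few lines. This is a generic illustration on synthetic data and not the authors' ensemble: the function name, the choice k = 5 and the test data are assumptions.

```python
import numpy as np


def knn_mean_distance(X, k=5):
    """Anomaly score: mean Euclidean distance to the k nearest neighbours.

    X has shape (n_samples, n_variables); larger scores mean the sample
    sits in a sparser region of the multivariate space.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)          # exclude each point itself
    nearest = np.sort(dist, axis=1)[:, :k]  # k smallest distances per row
    return nearest.mean(axis=1)


# Synthetic example: 50 ordinary observations plus one extreme one.
rng = np.random.default_rng(0)
normal_points = rng.normal(size=(50, 3))
outlier = np.array([[10.0, 10.0, 10.0]])
X = np.vstack([normal_points, outlier])
scores = knn_mean_distance(X, k=5)
most_anomalous = int(np.argmax(scores))     # index 50, the appended outlier
```

Because the score uses distances in the joint variable space, a sample can score highly even when each of its variables is unremarkable on its own, which is the property the abstract emphasises over univariate detection.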
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data
ERIC Educational Resources Information Center
Wang, Lijuan
2010-01-01
This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
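For readers unfamiliar with the zero-inflated Poisson part of the model, the standard ZIP probability mass function mixes a point mass at zero (probability pi) with an ordinary Poisson count (mean lam). The sketch below shows only that standard distribution. How the IRT-ZIP model links pi and lam to item and person parameters is not stated in the truncated abstract and is not modelled here; the parameter names are illustrative.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(Y = k) under a zero-inflated Poisson with Poisson mean lam
    and structural-zero probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from the structural-zero process or the Poisson process.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Illustrative parameter values.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
total = sum(zip_pmf(k, lam=2.0, pi=0.3) for k in range(60))
```

The two processes visible in the `k == 0` branch are the reason two link functions are needed: one for the probability of a structural zero and one for the count mean.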
NASA Astrophysics Data System (ADS)
Mortuza, M. R.; Demissie, Y. K.
2015-12-01
In light of the recent and anticipated more severe and frequent drought incidences in the Yakima River Basin (YRB), a reliable and comprehensive drought assessment is deemed necessary to avoid major crop production loss and to better manage water rights issues in the region during years of low precipitation and/or snow accumulation. In this study, we have conducted a frequency analysis of hydrological droughts and quantified the associated uncertainty in the YRB under both historical and changing climate. The streamflow drought index (SDI) was employed to identify mutually correlated drought characteristics (e.g., severity, duration and peak). The historical and future characteristics of drought were estimated by applying trivariate copula probability distributions, which effectively describe the joint distribution and dependence of drought severity, duration, and peak. The associated prediction uncertainty, related to the parameters of the joint probability and to the climate projections, was evaluated using the Bayesian approach with bootstrap resampling. For the climate change scenarios, two future representative concentration pathways (RCP4.5 and RCP8.5) from the University of Idaho's Multivariate Adaptive Constructed Analogs (MACA) database were considered. The results of the study are expected to provide useful information for drought risk management in the YRB under anticipated climate changes.
The Contribution of Missed Clinic Visits to Disparities in HIV Viral Load Outcomes
Westfall, Andrew O.; Gardner, Lytt I.; Giordano, Thomas P.; Wilson, Tracey E.; Drainoni, Mari-Lynn; Keruly, Jeanne C.; Rodriguez, Allan E.; Malitz, Faye; Batey, D. Scott; Mugavero, Michael J.
2015-01-01
Objectives. We explored the contribution of missed primary HIV care visits (“no-show”) to observed disparities in virological failure (VF) among Black persons and persons with injection drug use (IDU) history. Methods. We used patient-level data from 6 academic clinics, before the Centers for Disease Control and Prevention and Health Resources and Services Administration Retention in Care intervention. We employed staged multivariable logistic regression and multivariable models stratified by no-show visit frequency to evaluate the association of sociodemographic factors with VF. We used multiple imputations to assign missing viral load values. Results. Among 10 053 patients (mean age = 46 years; 35% female; 64% Black; 15% with IDU history), 31% experienced VF. Although Black patients and patients with IDU history were significantly more likely to experience VF in initial analyses, race and IDU parameter estimates were attenuated after sequential addition of no-show frequency. In stratified models, race and IDU were not statistically significantly associated with VF at any no-show level. Conclusions. Because missed clinic visits contributed to observed differences in viral load outcomes among Black and IDU patients, achieving an improved understanding of differential visit attendance is imperative to reducing disparities in HIV. PMID:26270301
Determining Sala mango qualities with the use of RGB images captured by a mobile phone camera
NASA Astrophysics Data System (ADS)
Yahaya, Ommi Kalsom Mardziah; Jafri, Mohd Zubir Mat; Aziz, Azlan Abdul; Omar, Ahmad Fairuz
2015-04-01
Sala mango (Mangifera indica) is one of Malaysia's most popular tropical fruits and is widely marketed within the country. The degree of ripeness of mangoes has conventionally been evaluated manually on the basis of color parameters, but a simple non-destructive technique using the Samsung Galaxy Note 1 mobile phone camera is introduced to replace the destructive technique. In this research, color parameters in terms of RGB values acquired using the ENVI software system were related to Sala mango quality parameters. The features of the mango were extracted from the acquired images and then used to classify fruit skin color, which relates to the stages of ripening. A multivariate analysis method, multiple linear regression, was employed with the purpose of using RGB color parameters to estimate the pH, soluble solids content (SSC), and firmness. The relationship between these quality parameters of Sala mango and its mean pixel values in the RGB system is analyzed. Findings show that pH yields the highest accuracy, with a correlation coefficient R = 0.913 and root mean square error RMSE = 0.166 pH. Meanwhile, firmness has R = 0.875 and RMSE = 1.392 kgf, whereas soluble solids content shows the weakest correlation with the color parameters, with R = 0.814 and RMSE = 1.218 °Brix. Therefore, this non-invasive method can be used to determine the quality attributes of mangoes.
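The analysis described above (multiple linear regression of a quality attribute on mean R, G and B values, summarised by R and RMSE) can be sketched as follows. All data here are synthetic and the coefficients are invented for illustration, so the resulting numbers have no relation to the values reported in the abstract.

```python
import numpy as np

# Synthetic stand-in data: mean RGB pixel values for 40 fruits and a
# made-up linear relation to pH with added noise.
rng = np.random.default_rng(1)
rgb = rng.uniform(50, 200, size=(40, 3))
assumed_coef = np.array([0.010, -0.005, 0.002])   # hypothetical
ph = 3.0 + rgb @ assumed_coef + rng.normal(0.0, 0.1, size=40)

# Multiple linear regression: pH ~ intercept + R + G + B.
X = np.column_stack([np.ones(len(rgb)), rgb])
coef, *_ = np.linalg.lstsq(X, ph, rcond=None)
pred = X @ coef

# Correlation coefficient between measured and predicted values, and RMSE.
r = np.corrcoef(ph, pred)[0, 1]
rmse = np.sqrt(np.mean((ph - pred) ** 2))
```

The same fit would be repeated with SSC or firmness as the response to obtain the other two pairs of R and RMSE.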
Borrowing of strength and study weights in multivariate and network meta-analysis.
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2017-12-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).
Borrowing of strength and study weights in multivariate and network meta-analysis
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2016-01-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254
A refined method for multivariate meta-analysis and meta-regression
Jackson, Daniel; Riley, Richard D
2014-01-01
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
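The idea of applying a scaling factor to the standard error in a univariate random-effects meta-analysis can be sketched as below. This assumes the refinement takes the Hartung-Knapp form, in which the conventional variance is multiplied by a weighted residual mean square; the abstract does not name the method, so that identification, the DerSimonian-Laird estimate of the between-study variance, and the example data are all assumptions.

```python
import numpy as np


def dersimonian_laird_tau2(y, v):
    """Moment estimate of the between-study variance (truncated at zero)."""
    w = 1.0 / v
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (q - (len(y) - 1)) / c)


def random_effects_with_scaling(y, v):
    """Return the pooled mean, the conventional SE and the scaled SE."""
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    k = len(y)
    tau2 = dersimonian_laird_tau2(y, v)
    w = 1.0 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    se_conventional = np.sqrt(1.0 / np.sum(w))
    # Scaling factor: weighted residual mean square on k - 1 degrees of freedom.
    scale2 = np.sum(w * (y - mu) ** 2) / (k - 1)
    se_scaled = np.sqrt(scale2) * se_conventional
    return mu, se_conventional, se_scaled


# Hypothetical effect estimates and within-study variances from 4 studies.
mu, se_conv, se_scaled = random_effects_with_scaling(
    [0.1, 0.3, 0.5, 0.2], [0.04, 0.04, 0.04, 0.04]
)
```

In this form of the method the scaled standard error is paired with a t distribution on k - 1 degrees of freedom rather than a normal quantile, which is what widens intervals when the number of studies is small.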
Multivariate meta-analysis using individual participant data
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2016-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
Sato, Masashi; Yamashita, Okito; Sato, Masa-Aki; Miyawaki, Yoichi
2018-01-01
To understand information representation in human brain activity, it is important to investigate its fine spatial patterns at high temporal resolution. One possible approach is to use source estimation of magnetoencephalography (MEG) signals. Previous studies have mainly quantified accuracy of this technique according to positional deviations and dispersion of estimated sources, but it remains unclear how accurately MEG source estimation restores information content represented by spatial patterns of brain activity. In this study, using simulated MEG signals representing artificial experimental conditions, we performed MEG source estimation and multivariate pattern analysis to examine whether MEG source estimation can restore information content represented by patterns of cortical current in source brain areas. Classification analysis revealed that the corresponding artificial experimental conditions were predicted accurately from patterns of cortical current estimated in the source brain areas. However, accurate predictions were also possible from brain areas whose original sources were not defined. Searchlight decoding further revealed that this unexpected prediction was possible across wide brain areas beyond the original source locations, indicating that information contained in the original sources can spread through MEG source estimation. This phenomenon of "information spreading" may easily lead to false-positive interpretations when MEG source estimation and classification analysis are combined to identify brain areas that represent target information. Real MEG data analyses also showed that presented stimuli were able to be predicted in the higher visual cortex at the same latency as in the primary visual cortex, also suggesting that information spreading took place. These results indicate that careful inspection is necessary to avoid false-positive interpretations when MEG source estimation and multivariate pattern analysis are combined. PMID:29912968
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
The univariate meta-analysis (UM) procedure, a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least squares (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of the four new approaches, namely zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC), in terms of estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazards model coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest the use of the MMC procedure to overcome the lack of information needed for a complete covariance matrix of the coefficients.
Hybrid least squares multivariate spectral analysis methods
Haaland, David M.
2002-01-01
A set of hybrid least squares multivariate spectral analysis methods in which spectral shapes of components or effects not present in the original calibration step are added in a following estimation or calibration step to improve the accuracy of the estimation of the amount of the original components in the sampled mixture. The "hybrid" method herein means a combination of an initial classical least squares analysis calibration step with subsequent analysis by an inverse multivariate analysis method. A "spectral shape" herein means normally the spectral shape of a non-calibrated chemical component in the sample mixture but can also mean the spectral shapes of other sources of spectral variation, including temperature drift, shifts between spectrometers, spectrometer drift, etc. The "shape" can be continuous, discontinuous, or even discrete points illustrative of the particular effect.
Dajani, Hilmi R; Hosokawa, Kazuya; Ando, Shin-Ichi
2016-11-01
Lung-to-finger circulation time of oxygenated blood during nocturnal periodic breathing in heart failure patients measured using polysomnography correlates negatively with cardiac function but possesses limited accuracy for cardiac output (CO) estimation. CO was recalculated from lung-to-finger circulation time using a multivariable linear model with information on age and average overnight heart rate in 25 patients who underwent evaluation of heart failure. The multivariable model decreased the percentage error to 22.3% relative to invasive CO measured during cardiac catheterization. This improved automated noninvasive CO estimation using multiple variables meets a recently proposed performance criterion for clinical acceptability of noninvasive CO estimation, and compares very favorably with other available methods. Copyright © 2016 Elsevier Inc. All rights reserved.
Multivariate Meta-Analysis Using Individual Participant Data
ERIC Educational Resources Information Center
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2015-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is…
Multivariate η-μ fading distribution with arbitrary correlation model
NASA Astrophysics Data System (ADS)
Ghareeb, Ibrahim; Atiani, Amani
2018-03-01
An extensive analysis for the multivariate η-μ distribution with arbitrary correlation is presented, where novel analytical expressions for the multivariate probability density function, cumulative distribution function and moment generating function (MGF) of arbitrarily correlated and not necessarily identically distributed η-μ power random variables are derived. Also, this paper provides exact-form expression for the MGF of the instantaneous signal-to-noise ratio at the combiner output in a diversity reception system with maximal-ratio combining and post-detection equal-gain combining operating in slow frequency nonselective arbitrarily correlated not necessarily identically distributed η-μ fading channels. The average bit error probability of differentially detected quadrature phase shift keying signals with post-detection diversity reception system over arbitrarily correlated and not necessarily identical fading parameters η-μ fading channels is determined by using the MGF-based approach. The effect of fading correlation between diversity branches, fading severity parameters and diversity level is studied.
A model-based approach to wildland fire reconstruction using sediment charcoal records
Itter, Malcolm S.; Finley, Andrew O.; Hooten, Mevin B.; Higuera, Philip E.; Marlon, Jennifer R.; Kelly, Ryan; McLachlan, Jason S.
2017-01-01
Lake sediment charcoal records are used in paleoecological analyses to reconstruct fire history, including the identification of past wildland fires. One challenge of applying sediment charcoal records to infer fire history is the separation of charcoal associated with local fire occurrence and charcoal originating from regional fire activity. Despite a variety of methods to identify local fires from sediment charcoal records, an integrated statistical framework for fire reconstruction is lacking. We develop a Bayesian point process model to estimate the probability of fire associated with charcoal counts from individual-lake sediments and estimate mean fire return intervals. A multivariate extension of the model combines records from multiple lakes to reduce uncertainty in local fire identification and estimate a regional mean fire return interval. The univariate and multivariate models are applied to 13 lakes in the Yukon Flats region of Alaska. Both models resulted in similar mean fire return intervals (100–350 years) with reduced uncertainty under the multivariate model due to improved estimation of regional charcoal deposition. The point process model offers an integrated statistical framework for paleofire reconstruction and extends existing methods to infer regional fire history from multiple lake records with uncertainty following directly from posterior distributions.
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin
2013-10-15
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Magnetic resonance analysis of malignant transformation in recurrent glioma.
Jalbert, Llewellyn E; Neill, Evan; Phillips, Joanna J; Lupo, Janine M; Olson, Marram P; Molinaro, Annette M; Berger, Mitchel S; Chang, Susan M; Nelson, Sarah J
2016-08-01
Patients with low-grade glioma (LGG) have a relatively long survival, and a balance is often struck between treating the tumor and impacting quality of life. While lesions may remain stable for many years, they may also undergo malignant transformation (MT) at the time of recurrence and require more aggressive intervention. Here we report on a state-of-the-art multiparametric MRI study of patients with recurrent LGG. One hundred and eleven patients previously diagnosed with LGG were scanned at either 1.5 T or 3 T MR at the time of recurrence. Volumetric and intensity parameters were estimated from anatomic, diffusion, perfusion, and metabolic MR data. Direct comparisons of histopathological markers from image-guided tissue samples with metrics derived from the corresponding locations on the in vivo images were made. A bioinformatics approach was applied to visualize and interpret these results, which included imaging heatmaps and network analysis. Multivariate linear-regression modeling was utilized for predicting transformation. Many advanced imaging parameters were found to be significantly different for patients with tumors that had undergone MT versus those that had not. Imaging metrics calculated at the tissue sample locations highlighted the distinct biological significance of the imaging and the heterogeneity present in recurrent LGG, while multivariate modeling yielded a 76.04% accuracy in predicting MT. The acquisition and quantitative analysis of such multiparametric MR data may ultimately allow for improved clinical assessment and treatment stratification for patients with recurrent LGG. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Neuro-Oncology.
Wen, Jiahuai; Yang, Yanning; Ye, Feng; Huang, Xiaojia; Li, Shuaijie; Wang, Qiong; Xie, Xiaoming
2015-12-01
Previous studies have suggested that plasma fibrinogen contributes to tumor cell proliferation, progression and metastasis. The current study was performed to evaluate the prognostic relevance of preoperative plasma fibrinogen in breast cancer patients. Data of 2073 consecutive breast cancer patients, who underwent surgery between January 2002 and December 2008 at the Sun Yat-sen University Cancer Center, were retrospectively evaluated. Plasma fibrinogen levels were routinely measured before surgeries. Participants were grouped by the cutoff value estimated by the receiver operating characteristic (ROC) curve analysis. Overall survival (OS) was assessed using Kaplan-Meier analysis, and multivariate Cox proportional hazards regression model was performed to evaluate the independent prognostic value of plasma fibrinogen level. The optimal cutoff value of preoperative plasma fibrinogen was determined to be 2.83 g/L. The Kaplan-Meier analysis showed that patients with high fibrinogen levels had shorter OS than patients with low fibrinogen levels (p < 0.001). Multivariate analysis suggested preoperative plasma fibrinogen as an independent prognostic factor for OS in breast cancer patients (HR = 1.475, 95% confidence interval (CI): 1.177-1.848, p = 0.001). Subgroup analyses revealed that plasma fibrinogen level was an unfavorable prognostic parameter in stage II-III, Luminal subtypes and triple-negative breast cancer patients. Elevated preoperative plasma fibrinogen was independently associated with poor prognosis in breast cancer patients and may serve as a valuable parameter for risk assessment in breast cancer patients. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
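The Kaplan-Meier analysis mentioned in the abstract above can be illustrated with a short product-limit estimator. This is a generic sketch on made-up toy data, not the study's analysis; the function name `kaplan_meier` and the 0/1 event coding are assumptions chosen for the example.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t
        at_risk -= leaving
    return curve
```

In a two-group comparison such as high versus low fibrinogen, the estimator would be applied to each group separately and the resulting curves compared, typically with a log-rank test.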
Pereira, R J; Ayres, D R; El Faro, L; Verneque, R S; Vercesi Filho, A E; Albuquerque, L G
2013-09-27
We analyzed 46,161 monthly test-day records of milk production from 7453 first lactations of crossbred dairy Gyr (Bos indicus) x Holstein cows. The following seven models were compared: standard multivariate model (M10), three reduced rank models fitting the first 2, 3, or 4 genetic principal components, and three models considering a 2-, 3-, or 4-factor structure for the genetic covariance matrix. Full rank residual covariance matrices were considered for all models. The model fitting the first two principal components (PC2) was the best according to the model selection criteria. Similar phenotypic, genetic, and residual variances were obtained with models M10 and PC2. The heritability estimates ranged from 0.14 to 0.21 and from 0.13 to 0.21 for models M10 and PC2, respectively. The genetic correlations obtained with model PC2 were slightly higher than those estimated with model M10. PC2 markedly reduced the number of parameters estimated and the time spent to reach convergence. We concluded that two principal components are sufficient to model the structure of genetic covariances between test-day milk yields.
Spatio-temporal models of mental processes from fMRI.
Janoos, Firdaus; Machiraju, Raghu; Singh, Shantanu; Morocz, Istvan Ákos
2011-07-15
Understanding the highly complex, spatially distributed and temporally organized phenomena entailed by mental processes using functional MRI is an important research problem in cognitive and clinical neuroscience. Conventional analysis methods focus on the spatial dimension of the data discarding the information about brain function contained in the temporal dimension. This paper presents a fully spatio-temporal multivariate analysis method using a state-space model (SSM) for brain function that yields not only spatial maps of activity but also its temporal structure along with spatially varying estimates of the hemodynamic response. Efficient algorithms for estimating the parameters along with quantitative validations are given. A novel low-dimensional feature-space for representing the data, based on a formal definition of functional similarity, is derived. Quantitative validation of the model and the estimation algorithms is provided with a simulation study. Using a real fMRI study for mental arithmetic, the ability of this neurophysiologically inspired model to represent the spatio-temporal information corresponding to mental processes is demonstrated. Moreover, by comparing the models across multiple subjects, natural patterns in mental processes organized according to different mental abilities are revealed. Copyright © 2011 Elsevier Inc. All rights reserved.
Zhang, Peng; Luo, Dandan; Li, Pengfei; Sharpsten, Lucie; Medeiros, Felipe A.
2015-01-01
Glaucoma is a progressive disease due to damage in the optic nerve with associated functional losses. Although the relationship between structural and functional progression in glaucoma is well established, there is disagreement on how this association evolves over time. In addressing this issue, we propose a new class of non-Gaussian linear-mixed models to estimate the correlations among subject-specific effects in multivariate longitudinal studies with a skewed distribution of random effects, to be used in a study of glaucoma. This class provides an efficient estimation of subject-specific effects by modeling the skewed random effects through the log-gamma distribution. It also provides more reliable estimates of the correlations between the random effects. To validate the log-gamma assumption against the usual normality assumption of the random effects, we propose a lack-of-fit test using the profile likelihood function of the shape parameter. We apply this method to data from a prospective observation study, the Diagnostic Innovations in Glaucoma Study, to present a statistically significant association between structural and functional change rates that leads to a better understanding of the progression of glaucoma over time. PMID:26075565
Mirzaeinejad, Hossein; Mirzaei, Mehdi; Rafatnia, Sadra
2018-06-11
This study deals with enhancing the directional stability of a vehicle turning at high speed under various road conditions, using integrated active steering and differential braking systems. In this respect, minimal use of intentional asymmetric braking force is desired, so that the drawbacks of active steering control are compensated with only a small reduction of vehicle longitudinal speed. To this aim, a new optimal multivariable controller is analytically developed for integrated steering and braking systems based on the prediction of vehicle nonlinear responses. A fuzzy programming extracted from the nonlinear phase plane analysis is also used for managing the two control inputs in various driving conditions. With the proposed fuzzy programming, the weight factors of the control inputs are automatically tuned and softly changed. In order to simulate a real-world control system, required information about the system states and parameters that cannot be directly measured is estimated using the Unscented Kalman Filter (UKF). Finally, simulation studies are carried out using a validated vehicle model to show the effectiveness of the proposed integrated control system in the presence of model uncertainties and estimation errors. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murphy, Colin; Anderson, Penny R.; Li Tianyu
Purpose: We examined the impact of radiation tumor bed boost parameters in early-stage breast cancer on local control and cosmetic outcomes. Methods and Materials: A total of 3,186 women underwent postlumpectomy whole-breast radiation with a tumor bed boost for Tis to T2 breast cancer from 1970 to 2008. Boost parameters analyzed included size, energy, dose, and technique. Endpoints were local control, cosmesis, and fibrosis. The Kaplan-Meier method was used to estimate actuarial incidence, and a Cox proportional hazard model was used to determine independent predictors of outcomes on multivariate analysis (MVA). The median follow-up was 78 months (range, 1-305 months). Results: The crude cosmetic results were excellent in 54%, good in 41%, and fair/poor in 5% of patients. The 10-year estimate of an excellent cosmesis was 66%. On MVA, independent predictors for excellent cosmesis were use of electron boost, lower electron energy, adjuvant systemic therapy, and whole-breast IMRT. Fibrosis was reported in 8.4% of patients. The actuarial incidence of fibrosis was 11% at 5 years and 17% at 10 years. On MVA, independent predictors of fibrosis were larger cup size and higher boost energy. The 10-year actuarial local failure was 6.3%. There was no significant difference in local control by boost method, cut-out size, dose, or energy. Conclusions: Likelihood of excellent cosmesis or fibrosis are associated with boost technique, electron energy, and cup size. However, because of high local control and rare incidence of fair/poor cosmesis with a boost, the anatomy of the patient and tumor cavity should ultimately determine the necessary boost parameters.
Prediction of mortality rates using a model with stochastic parameters
NASA Astrophysics Data System (ADS)
Tan, Chon Sern; Pooi, Ah Hin
2016-10-01
Prediction of future mortality rates is crucial to insurance companies because they face longevity risks while providing retirement benefits to a population whose life expectancy is increasing. In the past literature, a time series model based on multivariate power-normal distribution has been applied on mortality data from the United States for the years 1933 till 2000 to forecast the future mortality rates for the years 2001 till 2010. In this paper, a more dynamic approach based on the multivariate time series will be proposed where the model uses stochastic parameters that vary with time. The resulting prediction intervals obtained using the model with stochastic parameters perform better because apart from having good ability in covering the observed future mortality rates, they also tend to have distinctly shorter interval lengths.
Amaral, Larissa S; Azevedo, Eduardo B; Perussi, Janice R
2018-06-01
Antimicrobial Photodynamic Inactivation (a-PDI) is based on the oxidative destruction of biological molecules by reactive oxygen species generated by the photo-excitation of a photosensitive molecule. When a-PDT is performed with the use of mathematical models, the optimal conditions for maximum inactivation are found. Experimental designs allow a multivariate analysis of the experimental parameters. This is usually made using a univariate approach, which demands a large number of experiments, being time and money consuming. This paper presents the use of the response surface methodology for improving the search for the best conditions to reduce E. coli survival levels by a-PDT using methylene blue (MB) and toluidine blue (TB) as photosensitizers and white light. The goal was achieved by analyzing the effects and interactions of the three main parameters involved in the process: incubation time (IT), photosensitizer concentration (C_PS), and light dose (LD). The optimization procedure began with a full 2³ factorial design, followed by a central composite one, in which the optimal conditions were estimated. For MB, C_PS was the most important parameter followed by LD and IT whereas, for TB, the main parameter was LD followed by C_PS and IT. Using the estimated optimal conditions for inactivation, MB was able to inactivate 99.999999% CFU mL⁻¹ of E. coli with IT of 28 min, LD of 31 J cm⁻², and C_PS of 32 μmol L⁻¹, while TB required 18 min, 39 J cm⁻², and 37 μmol L⁻¹. The feasibility of using the response surface methodology with a-PDT was demonstrated, enabling enhanced photoinactivation efficiency and fast results with a minimal number of experiments. Copyright © 2018 Elsevier B.V. All rights reserved.
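The design sequence described above, a full two-level factorial in three factors followed by a central composite design, can be sketched in coded units as below. The abstract does not report the axial distance, the number of center replicates, or the factor ranges, so the rotatable axial distance, the single center point, and the helper names `central_composite` and `to_real` are all assumptions made for illustration.

```python
from itertools import product

def central_composite(n_factors, alpha=None):
    """Return (factorial, axial, center) design points in coded units.

    alpha defaults to the rotatable choice (2**n_factors) ** 0.25, which is
    an assumption and not a value taken from the abstract.
    """
    if alpha is None:
        alpha = (2 ** n_factors) ** 0.25
    # Full two-level factorial: every combination of -1 and +1
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial (star) points: one factor at -alpha or +alpha, the rest at 0
    axial = []
    for j in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[j] = sign * alpha
            axial.append(point)
    # A single center point; real studies usually replicate it
    center = [[0.0] * n_factors]
    return factorial, axial, center

def to_real(coded, low, high):
    """Map a coded level (-1 = low, +1 = high) to real units."""
    return (low + high) / 2.0 + coded * (high - low) / 2.0
```

For three factors such as incubation time, photosensitizer concentration, and light dose, this gives 8 factorial runs, 6 axial runs, and the center point; a second-order response surface would then be fitted to the measured survival at those points.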
Masci, Ilaria; Vannozzi, Giuseppe; Bergamini, Elena; Pesce, Caterina; Getchell, Nancy; Cappozzo, Aurelio
2013-04-01
Objective quantitative evaluation of motor skill development is of increasing importance to carefully drive physical exercise programs in childhood. Running is a fundamental motor skill humans adopt to accomplish locomotion, which is linked to physical activity levels, although the assessment is traditionally carried out using qualitative evaluation tests. The present study aimed at investigating the feasibility of using inertial sensors to quantify developmental differences in the running pattern of young children. Qualitative and quantitative assessment tools were adopted to identify a skill-sensitive set of biomechanical parameters for running and to further our understanding of the factors that determine progression to skilled running performance. Running performances of 54 children between the ages of 2 and 12 years were submitted to both qualitative and quantitative analysis, the former using sequences of developmental level, the latter estimating temporal and kinematic parameters from inertial sensor measurements. Discriminant analysis with running developmental level as the dependent variable made it possible to identify a set of temporal and kinematic parameters, within those obtained with the sensor, that best classified children into the qualitative developmental levels (accuracy higher than 67%). Multivariate analysis of variance with the quantitative parameters as dependent variables made it possible to identify whether and which specific parameters or parameter subsets were differentially sensitive to specific transitions between contiguous developmental levels. The findings showed that different sets of temporal and kinematic parameters are able to tap all steps of the transitional process in running skill described through qualitative observation and can be prospectively used for applied diagnostic and sport training purposes. Copyright © 2012 Elsevier B.V. All rights reserved.
A Bayesian approach to multivariate measurement system assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamada, Michael Scott
2016-07-01
This article considers system assessment for multivariate measurements and presents a Bayesian approach to analyzing gauge R&R study data. The evaluation of variances for univariate measurement becomes the evaluation of covariance matrices for multivariate measurements. The Bayesian approach ensures positive definite estimates of the covariance matrices and easily provides their uncertainty. Furthermore, various measurement system assessment criteria are easily evaluated. The approach is illustrated with data from a real gauge R&R study as well as simulated data.
Univariate Analysis of Multivariate Outcomes in Educational Psychology.
ERIC Educational Resources Information Center
Hubble, L. M.
1984-01-01
The author examined the prevalence of multiple operational definitions of outcome constructs and an estimate of the incidence of Type I error rates when univariate procedures were applied to multiple variables in educational psychology. Multiple operational definitions of constructs were advocated and wider use of multivariate analysis was…
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace
ERIC Educational Resources Information Center
Culpepper, Steven Andrew; Park, Trevor
2017-01-01
A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
System and Method for Outlier Detection via Estimating Clusters
NASA Technical Reports Server (NTRS)
Iverson, David J. (Inventor)
2016-01-01
An efficient method and system for real-time or offline analysis of multivariate sensor data for use in anomaly detection, fault detection, and system health monitoring is provided. Models automatically derived from training data, typically nominal system data acquired from sensors in normally operating conditions or from detailed simulations, are used to identify unusual, out of family data samples (outliers) that indicate possible system failure or degradation. Outliers are determined through analyzing a degree of deviation of current system behavior from the models formed from the nominal system data. The deviation of current system behavior is presented as an easy to interpret numerical score along with a measure of the relative contribution of each system parameter to any off-nominal deviation. The techniques described herein may also be used to "clean" the training data.
MacNab, Ying C
2016-08-01
This paper is concerned with multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattices and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.
Multivariate analysis of gamma spectra to characterize used nuclear fuel
Coble, Jamie; Orton, Christopher; Schwantes, Jon
2017-01-17
The Multi-Isotope Process (MIP) Monitor provides an efficient means to monitor the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of key stages in the reprocessing stream in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor; PWR and BWR, respectively), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used in this paper to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type for the three PWR and three BWR reactor designs studied. Locally weighted PLS models were fitted on-the-fly to estimate the remaining fuel characteristics. For the simulated gamma spectra considered, burn up was predicted with 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment with approximately 2% RMSPE. Finally, this approach to automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and to inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters that may indicate issues with operational control or malicious activities.
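The RMSPE figures quoted above can be made concrete with a short sketch. It assumes the common definition of RMSPE (the square root of the mean squared relative error, expressed in percent); the abstract does not spell out the formula, and the function name and sample values below are hypothetical.

```python
import math


def rmspe(true_values, predicted_values):
    """Root mean squared percent error, in percent (assumed definition)."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative error of each prediction, squared.
    squared_relative_errors = [
        ((predicted - true) / true) ** 2
        for true, predicted in zip(true_values, predicted_values)
    ]
    mean_squared = sum(squared_relative_errors) / len(squared_relative_errors)
    return 100.0 * math.sqrt(mean_squared)


# Hypothetical burn-up values (arbitrary units), each off by 1 percent.
example_error = rmspe([100.0, 200.0], [101.0, 198.0])
```

Because the error is relative, the metric is undefined when a true value is zero; a real pipeline would need to guard against that case.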
Marques, Pedro; Leite, Valeriano; Bugalho, Maria João
2014-12-01
Papillary thyroid carcinoma (PTC) is the most common thyroid cancer. The widespread use of neck ultrasound (US) and US-guided fine-needle aspiration cytology is triggering an overdiagnosis of PTC. This study aimed to evaluate the clinical behavior and outcomes of patients with PTCs ≤2 cm and to identify possible prognostic factors. Clinical records of cases with histological diagnosis of PTC ≤2 cm followed at the Endocrine Department of Instituto Português de Oncologia, Lisbon between 2002 and 2006 were analyzed retrospectively. We identified 255 PTCs, of which 111 were microcarcinomas. Most patients underwent near-total thyroidectomy, with lymph node dissections in 55 cases (21.6%). Radioiodine therapy was administered in 184 patients. At the last evaluation, 38 (14.9%) had evidence of disease. Two deaths were attributed to PTC. Median (±SD) follow-up was 74 (±23) months. Multivariate analysis identified vascular invasion, lymph node metastases and systemic metastases as significantly associated with recurrence/persistence of disease. In addition, lymph node involvement was significantly associated with extrathyroidal extension and angioinvasion. Median (±SD) disease-free survival (DFS) was estimated as 106 (±3) months and the 5-year DFS rate was 87.5%. Univariate Cox analysis identified some relevant parameters for DFS, but multivariate regression only identified lymph node and systemic metastases as significant independent factors. The median DFS estimated for lymph node and systemic metastases was 75 and 0 months, respectively. In the setting of small PTCs, vascular invasion, extrathyroidal extension and lymph node and/or systemic metastases may confer worse prognosis, perhaps justifying more aggressive therapeutic and follow-up approaches in such cases.
Multivariate analysis of gamma spectra to characterize used nuclear fuel
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coble, Jamie; Orton, Christopher; Schwantes, Jon
The Multi-Isotope Process (MIP) Monitor provides an efficient means to monitor the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of key stages in the reprocessing stream in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor; PWR and BWR, respectively), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used in this paper to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type for the three PWR and three BWR reactor designs studied. Locally weighted PLS models were fitted on-the-fly to estimate the remaining fuel characteristics. For the simulated gamma spectra considered, burn up was predicted with 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment with approximately 2% RMSPE. Finally, this approach to automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and to inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters that may indicate issues with operational control or malicious activities.
Catlett, Kierstin K; Schwartz, Gary T; Godfrey, Laurie R; Jungers, William L
2010-07-01
Studies of primate life history variation are constrained by the fact that all large-bodied extant primates are haplorhines. However, large-bodied strepsirrhines recently existed. If we can extract life history information from their skeletons, these species can contribute to our understanding of primate life history variation. This is particularly important in light of new critiques of the classic "fast-slow continuum" as a descriptor of variation in life history profiles across mammals in general. We use established dental histological methods to estimate gestation length and age at weaning for five extinct lemur species. On the basis of these estimates, we reconstruct minimum interbirth intervals and maximum reproductive rates. We utilize principal components analysis to create a multivariate "life history space" that captures the relationships among reproductive parameters and brain and body size in extinct and extant lemurs. Our data show that, whereas large-bodied extinct lemurs can be described as "slow" in some fashion, they also varied greatly in their life history profiles. Those with relatively large brains also weaned their offspring late and had long interbirth intervals. These were not the largest of extinct lemurs. Thus, we distinguish size-related life history variation from variation that linked more strongly to ecological factors. Because all lemur species larger than 10 kg, regardless of life history profile, succumbed to extinction after humans arrived in Madagascar, we argue that large body size increased the probability of extinction independently of reproductive rate. We also provide some evidence that, among lemurs, brain size predicts reproductive rate better than body size. (c) 2010 Wiley-Liss, Inc.
Jin, Lei; Gao, Yufeng; Ye, Jun; Zou, Guizhou; Li, Xu
2017-09-01
The red blood cell distribution width (RDW) is increased in chronic liver disease, but its clinical significance in hepatitis B virus-related acute-on-chronic liver failure (HBV-ACLF) is still unclear. The aim of the present study was to investigate the clinical significance of RDW in HBV-ACLF patients. The medical records of HBV-ACLF patients who were admitted to The Second Affiliated Hospital of Anhui Medical University between April 2012 and December 2015 were retrospectively reviewed. Correlations between RDW, neutrophil lymphocyte ratio (NLR), and the model for end-stage liver disease (MELD) scores were analyzed using Spearman's approach. Multivariable stepwise logistic regression was used to evaluate independent clinical parameters predicting 3-month mortality of HBV-ACLF patients. The association between RDW and hospitalization outcome was estimated by receiver operating characteristic (ROC) curve analysis. Patient survival was estimated by Kaplan-Meier analysis and subsequently compared by log-rank test. Sixty-two HBV-ACLF patients and sixty CHB patients were enrolled. RDW was increased in HBV-ACLF patients and positively correlated with the NLR as well as MELD scores. Multivariate analysis demonstrated that RDW value was an independent predictor for mortality. RDW had an area under the ROC curve of 0.799 in predicting 3-month mortality of HBV-ACLF patients. Patients with HBV-ACLF who had RDW > 17% showed significantly poorer survival than those who had RDW ≤ 17%. RDW values are significantly increased in patients with HBV-ACLF. Moreover, RDW values are an independent predictor of in-hospital mortality in patients with HBV-ACLF.
NASA Astrophysics Data System (ADS)
Chan, C. H.; Brown, G.; Rikvold, P. A.
2017-05-01
A generalized approach to Wang-Landau simulations, macroscopically constrained Wang-Landau, is proposed to simulate the density of states of a system with multiple macroscopic order parameters. The method breaks a multidimensional random-walk process in phase space into many separate, one-dimensional random-walk processes in well-defined subspaces. Each of these random walks is constrained to a different set of values of the macroscopic order parameters. When the multivariable density of states is obtained for one set of values of fieldlike model parameters, the density of states for any other values of these parameters can be obtained by a simple transformation of the total system energy. All thermodynamic quantities of the system can then be rapidly calculated at any point in the phase diagram. We demonstrate how to use the multivariable density of states to draw the phase diagram, as well as order-parameter probability distributions at specific phase points, for a model spin-crossover material: an antiferromagnetic Ising model with ferromagnetic long-range interactions. The fieldlike parameters in this model are an effective magnetic field and the strength of the long-range interaction.
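The reweighting step described above (one joint density of states reused at any field value through a shift of the total energy) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the field couples to the magnetization as E_total = E - H*M in units where k_B = 1, and the function name and toy two-spin system are hypothetical.

```python
import math


def thermal_average_magnetization(joint_dos, field, temperature):
    """Average magnetization from a joint density of states g(E, M).

    joint_dos maps (zero_field_energy, magnetization) -> number of states.
    Assumed field coupling: E_total = E - field * M, with k_B = 1.
    """
    partition = 0.0
    weighted_magnetization = 0.0
    for (energy, magnetization), count in joint_dos.items():
        # Shift the stored zero-field energy to the requested field value.
        shifted_energy = energy - field * magnetization
        weight = count * math.exp(-shifted_energy / temperature)
        partition += weight
        weighted_magnetization += magnetization * weight
    return weighted_magnetization / partition


# Toy system: two non-interacting spins, so every state has zero-field energy 0.
two_spin_dos = {(0.0, -2): 1, (0.0, 0): 2, (0.0, 2): 1}
average_m = thermal_average_magnetization(two_spin_dos, field=0.5, temperature=1.0)
```

For this toy system the result can be checked by hand: the average magnetization equals 2*tanh(H/T). For realistic system sizes the density of states spans many orders of magnitude, so one would work with log g and subtract the largest exponent before exponentiating.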
Regional regression models of watershed suspended-sediment discharge for the eastern United States
NASA Astrophysics Data System (ADS)
Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.
2012-11-01
Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models, summarized for major US water resources regions 1-8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and, to a lesser extent, topography.
Regional regression models of watershed suspended-sediment discharge for the eastern United States
Roman, David C.; Vogel, Richard M.; Schwarz, Gregory E.
2012-01-01
Estimates of mean annual watershed sediment discharge, derived from long-term measurements of suspended-sediment concentration and streamflow, often are not available at locations of interest. The goal of this study was to develop multivariate regression models to enable prediction of mean annual suspended-sediment discharge from available basin characteristics useful for most ungaged river locations in the eastern United States. The models are based on long-term mean sediment discharge estimates and explanatory variables obtained from a combined dataset of 1201 US Geological Survey (USGS) stations derived from a SPAtially Referenced Regression on Watershed attributes (SPARROW) study and the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES) database. The resulting regional regression models, summarized for major US water resources regions 1–8, exhibited prediction R2 values ranging from 76.9% to 92.7% and corresponding average model prediction errors ranging from 56.5% to 124.3%. Results from cross-validation experiments suggest that a majority of the models will perform similarly to calibration runs. The 36-parameter regional regression models also outperformed a 16-parameter national SPARROW model of suspended-sediment discharge and indicate that mean annual sediment loads in the eastern United States generally correlate with a combination of basin area, land use patterns, seasonal precipitation, soil composition, hydrologic modification, and, to a lesser extent, topography.
Genetic parameter and breeding value estimation of donkeys' problem-focused coping styles.
Navas González, Francisco Javier; Jordana Vidal, Jordi; León Jurado, José Manuel; Arando Arbulu, Ander; McLean, Amy Katherine; Delgado Bermejo, Juan Vicente
2018-05-12
Donkeys are recognized therapy or leisure-riding animals. Anecdotal evidence has suggested that more reactive donkeys or those more easily engaging flight mechanisms tend to be easier to train compared to those displaying the natural donkey behaviour of fight. This context brings together the need to quantify such traits and to genetically select donkeys displaying a neutral reaction during training, because of its implications for handler/rider safety and trainability. We analysed the scores for coping style traits from 300 Andalusian donkeys from 2013 to 2015. Three scales were applied to describe donkeys' response to 12 stimuli. Genetic parameters were estimated using multivariate models with year, sex, husbandry system and stimulus as fixed effects and age as a linear and quadratic covariable. Heritabilities were moderate, 0.18 ± 0.020 to 0.21 ± 0.021. Phenotypic correlations between intensity and mood/emotion or response type were negative and moderate (-0.21 and -0.25, respectively). Genetic correlations between the same variables were negative and moderately high (-0.46 and -0.53, respectively). Phenotypic and genetic correlations between mood/emotion and response type were positive and high (0.92 and 0.95, respectively). Breeding values enable selection methods that could contribute to the preservation of endangered breeds and to genetically selecting donkeys for the uses for which they may be most suitable. Copyright © 2018 Elsevier B.V. All rights reserved.
Kim, Eunji; Ivanov, Ivan; Hua, Jianping; Lampe, Johanna W; Hullar, Meredith Aj; Chapkin, Robert S; Dougherty, Edward R
2017-01-01
Ranking feature sets for phenotype classification based on gene expression is a challenging issue in cancer bioinformatics. When the number of samples is small, all feature selection algorithms are known to be unreliable, producing significant error, and error estimators suffer from different degrees of imprecision. The problem is compounded by the fact that the accuracy of classification depends on the manner in which the phenomena are transformed into data by the measurement technology. Because next-generation sequencing technologies amount to a nonlinear transformation of the actual gene or RNA concentrations, they can potentially produce less discriminative data relative to the actual gene expression levels. In this study, we compare the performance of ranking feature sets derived from a model of RNA-Seq data with that of a multivariate normal model of gene concentrations using 3 measures: (1) ranking power, (2) length of extensions, and (3) Bayes features. This is the model-based study to examine the effectiveness of reporting lists of small feature sets using RNA-Seq data and the effects of different model parameters and error estimators. The results demonstrate that the general trends of the parameter effects on the ranking power of the underlying gene concentrations are preserved in the RNA-Seq data, whereas the power of finding a good feature set becomes weaker when gene concentrations are transformed by the sequencing machine.
Metocean design parameter estimation for fixed platform based on copula functions
NASA Astrophysics Data System (ADS)
Zhai, Jinjin; Yin, Qilin; Dong, Sheng
2017-08-01
Considering the dependence among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. A total of 30 years of wave height, wind speed, and current velocity data in the Bohai Sea are hindcast and sampled for a case study. Four kinds of distributions, namely, Gumbel distribution, lognormal distribution, Weibull distribution, and Pearson Type III distribution, are candidate models for marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely, Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models can maximize marginal information and the dependence among the three variables. The design return values of these three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those by univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters that are closer to the actual sea state for ocean platform design.
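How a copula turns marginal probabilities into joint return periods can be illustrated with a bivariate sketch. It uses the Gumbel-Hougaard copula named above, but reduces the trivariate problem to two variables for brevity. The "OR" and "AND" return-period definitions are the ones commonly used in the copula literature and are an assumption here, as are the function names and the parameter values; the marginal non-exceedance probabilities u and v would come from fitted marginals such as the Pearson Type III distribution.

```python
import math


def gumbel_hougaard_copula(u, v, theta):
    """Bivariate Gumbel-Hougaard copula C(u, v); theta = 1 means independence."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    if not (0.0 < u < 1.0 and 0.0 < v < 1.0):
        raise ValueError("u and v must lie strictly between 0 and 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mean_interarrival_years=1.0):
    """Return (T_or, T_and) in years for marginal non-exceedance levels u, v.

    T_or:  either variable exceeds its design value.
    T_and: both variables exceed their design values together.
    """
    c = gumbel_hougaard_copula(u, v, theta)
    t_or = mean_interarrival_years / (1.0 - c)
    t_and = mean_interarrival_years / (1.0 - u - v + c)
    return t_or, t_and


# Hypothetical case: both variables at their univariate 100-year level
# (annual maxima, so u = v = 0.99), with an assumed dependence theta = 2.
t_or, t_and = joint_return_periods(0.99, 0.99, theta=2.0)
```

With positive dependence (theta > 1) the joint "AND" return period is far shorter than under independence, which is why ignoring dependence can misstate the severity of combined loads.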
FREQ: A computational package for multivariable system loop-shaping procedures
NASA Technical Reports Server (NTRS)
Giesy, Daniel P.; Armstrong, Ernest S.
1989-01-01
Many approaches in the field of linear, multivariable time-invariant systems analysis and controller synthesis employ loop-shaping procedures wherein design parameters are chosen to shape frequency-response singular value plots of selected transfer matrices. A software package, FREQ, is documented for computing within one unified framework many of the most used multivariable transfer matrices for both continuous and discrete systems. The matrices are evaluated at user-selected frequency-response values, and their singular values are plotted against frequency. Example computations are presented to demonstrate the use of the FREQ code.
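The core computation behind a singular value plot can be sketched in a few lines. This is not the FREQ code: it is a minimal illustration for a continuous-time 2x2 transfer matrix evaluated at s = j*omega, using the closed-form singular values of a 2x2 matrix so that only the standard library is needed. The example transfer matrix and all function names are hypothetical.

```python
import math


def singular_values_2x2(g):
    """Singular values (largest first) of a 2x2 complex matrix as nested lists."""
    (a, b), (c, d) = g
    # Trace and determinant of the Hermitian matrix G^H G.
    trace = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c) ** 2
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    sigma_max = math.sqrt((trace + disc) / 2.0)
    sigma_min = math.sqrt(max((trace - disc) / 2.0, 0.0))
    return sigma_max, sigma_min


def example_transfer_matrix(s):
    """Hypothetical 2x2 transfer matrix G(s), chosen only for illustration."""
    return [[1.0 / (s + 1.0), 0.5 / (s + 2.0)],
            [0.0, 2.0 / (s + 1.0)]]


def singular_value_response(transfer_matrix, frequencies):
    """Rows of (omega, sigma_max, sigma_min) for G evaluated at s = j*omega."""
    return [(omega, *singular_values_2x2(transfer_matrix(complex(0.0, omega))))
            for omega in frequencies]


response = singular_value_response(example_transfer_matrix, [0.1, 1.0, 10.0])
```

A discrete-time system would be evaluated on the unit circle instead (z = exp(j*omega*T) for sample period T), and larger matrices would need a general SVD routine rather than the 2x2 closed form.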
Ghisi, Nédia C; Oliveira, Elton C; Mendonça Mota, Thais F; Vanzetto, Guilherme V; Roque, Aliciane A; Godinho, Jayson P; Bettim, Franciele Lima; Silva de Assis, Helena Cristina da; Prioli, Alberto J
2016-10-01
Aquatic pollutants produce multiple consequences in organisms, populations, communities and ecosystems, affecting the function of organs, reproductive state, population size, species survival and even biodiversity. In order to monitor the health of aquatic organisms, biomarkers have been used as effective tools in environmental risk assessment. The aim of this study is to evaluate, through a multivariate and integrative analysis, the response of the native species Hypostomus ancistroides over a pollution gradient in the main water supply body of northwestern Paraná state (Brazil). The condition factor, micronucleus test and erythrocyte nuclear abnormalities (ENA), comet assay, measurement of the cerebral and muscular enzyme acetylcholinesterase (AChE), and histopathological analysis of liver and gill were evaluated in fishes from three sites of the Pirapó River during the dry and rainy seasons. The multivariate general result showed that the interaction between the seasons and the sites was significant: there are variations in the rates of alterations in the biological parameters, depending on the time of year researched at each site. In general, the best results were observed for the site nearest the spring, and alterations in the parameters at the intermediate and downstream sites. In sum, the results of this study showed the necessity of a multivariate analysis, evaluating several biological parameters, to obtain an integrated response to the effects of the environmental pollutants on the organisms. Copyright © 2016 Elsevier Ltd. All rights reserved.
Osorio-Yáñez, Citlalli; Ayllon-Vergara, Julio C; Arreola-Mendoza, Laura; Aguilar-Madrid, Guadalupe; Hernández-Castellanos, Erika; Sánchez-Peña, Luz C; Del Razo, Luz M
2015-06-01
Inorganic arsenic (iAs) is a ubiquitous element present in the groundwater worldwide. Cardiovascular effects related to iAs exposure have been studied extensively in adult populations. Few epidemiological studies have been focused on iAs exposure-related cardiovascular disease in children. In this study we investigated the association between iAs exposure, blood pressure (BP), and functional and anatomical echocardiographic parameters in children. A cross-sectional study of 161 children between 3 and 8 years was conducted in Central Mexico. The total concentration of arsenic (As) species in urine (U-tAs) was determined by hydride generation-cryotrapping-atomic absorption spectrometry and lifetime iAs exposure was estimated by multiplying As concentrations measured in drinking water by the duration of water consumption in years (LAsE). BP was measured by standard protocols, and M-mode echocardiographic parameters were determined by ultrasonography. U-tAs concentration and LAsE were significantly associated with diastolic (DBP) and systolic blood pressure (SBP) in multivariable linear regression models: DBP and SBP were 0.013 (95% CI: 0.002, 0.024) and 0.021 (95% CI: 0.004, 0.037) mmHg higher in association with each 1-ng/mL increase in U-tAs (p < 0.025), respectively. Left ventricular mass (LVM) was significantly associated with LAsE [5.5 g higher (95% CI: 0.65, 10.26) in children with LAsE > 620 compared with < 382 μg/L-year; p = 0.03] in an adjusted multivariable model. The systolic function parameters left ventricular ejection fraction (EF) and shortening fraction were 3.67% (95% CI: -7.14, -0.20) and 3.41% (95% CI: -6.44, -0.37) lower, respectively, in children with U-tAs > 70 ng/mL compared with < 35 ng/mL. Early-life exposure to iAs was significantly associated with higher BP and LVM and with lower EF in our study population of Mexican children.
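The lifetime exposure index used above is a simple product, sketched here for concreteness. The function names are hypothetical, and the handling of values between the two reported cut-offs (382 and 620 μg/L-year) is an assumption; the abstract only reports the comparison of the highest against the lowest group.

```python
def lifetime_arsenic_exposure(water_arsenic_ug_per_l, years_of_consumption):
    """LAsE in ug/L-year: drinking-water As concentration times years consumed."""
    if water_arsenic_ug_per_l < 0 or years_of_consumption < 0:
        raise ValueError("inputs must be non-negative")
    return water_arsenic_ug_per_l * years_of_consumption


def exposure_group(lase, low_cut=382.0, high_cut=620.0):
    """Assign an exposure group; the 'middle' band is an assumed convention."""
    if lase < low_cut:
        return "low"
    if lase > high_cut:
        return "high"
    return "middle"


# Hypothetical child: 50 ug/L in drinking water for 8 years.
example_lase = lifetime_arsenic_exposure(50.0, 8.0)
example_group = exposure_group(example_lase)
```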
The significance of serum urea and renal function in patients with heart failure.
Gotsman, Israel; Zwas, Donna; Planer, David; Admon, Dan; Lotan, Chaim; Keren, Andre
2010-07-01
Renal function and urea are frequently abnormal in patients with heart failure (HF) and are predictive of increased mortality. The relative importance of each parameter is less clear. We prospectively compared the predictive value of renal function and serum urea on clinical outcome in patients with HF. Patients hospitalized with definite clinical diagnosis of HF (n = 355) were followed for short-term (1 yr) and long-term (mean, 6.5 yr) survival and HF rehospitalization. Increasing tertiles of discharge estimated glomerular filtration rate (eGFR) were an independent predictor of increased long-term survival (hazard ratio [HR], 0.65; 95% confidence interval [CI], 0.47-0.91; p = 0.01) but not short-term survival. Admission and discharge serum urea and blood urea nitrogen (BUN)/creatinine ratio were predictors of reduced short- and long-term survival on multivariate Cox regression analysis. Increasing tertiles of discharge urea were a predictor of reduced 1-year survival (HR, 2.13; 95% CI, 1.21-3.73; p = 0.009) and long-term survival (HR, 1.93; 95% CI, 1.37-2.71; p < 0.0001). Multivariate analysis including discharge eGFR and serum urea demonstrated that only serum urea remained a significant predictor of long-term survival; however, eGFR and BUN/creatinine ratio were both independently predictive of survival. Urea was more discriminative than eGFR in predicting long-term survival by area under the receiver operating characteristic curve (0.803 vs. 0.787; p = 0.01). Increasing tertiles of discharge serum urea and BUN/creatinine were independent predictors of HF rehospitalization and combined death and HF rehospitalization. This study suggests that serum urea is a more powerful predictor of survival than eGFR in patients with HF. This may be due to urea's relation to key biological parameters including renal, hemodynamic, and neurohormonal parameters pertaining to the overall clinical status of the patient with chronic HF.
NASA Technical Reports Server (NTRS)
Borovikov, Anna; Rienecker, Michele M.; Keppenne, Christian; Johnson, Gregory C.
2004-01-01
One of the most difficult aspects of ocean state estimation is the prescription of the model forecast error covariances. The paucity of ocean observations limits our ability to estimate the covariance structures from model-observation differences. In most practical applications, simple covariances are usually prescribed. Rarely are cross-covariances between different model variables used. Here a comparison is made between a univariate Optimal Interpolation (UOI) scheme and a multivariate OI algorithm (MvOI) in the assimilation of ocean temperature. In the UOI case only temperature is updated using a Gaussian covariance function and in the MvOI salinity, zonal and meridional velocities as well as temperature, are updated using an empirically estimated multivariate covariance matrix. Earlier studies have shown that a univariate OI has a detrimental effect on the salinity and velocity fields of the model. Apparently, in a sequential framework it is important to analyze temperature and salinity together. For the MvOI an estimation of the model error statistics is made by Monte-Carlo techniques from an ensemble of model integrations. An important advantage of using an ensemble of ocean states is that it provides a natural way to estimate cross-covariances between the fields of different physical variables constituting the model state vector, at the same time incorporating the model's dynamical and thermodynamical constraints as well as the effects of physical boundaries. Only temperature observations from the Tropical Atmosphere-Ocean array have been assimilated in this study. In order to investigate the efficacy of the multivariate scheme two data assimilation experiments are validated with a large independent set of recently published subsurface observations of salinity, zonal velocity and temperature. For reference, a third control run with no data assimilation is used to check how the data assimilation affects systematic model errors. 
While the performance of the UOI and MvOI is similar with respect to the temperature field, the salinity and velocity fields are greatly improved when multivariate correction is used, as evident from the analyses of the rms differences of these fields and independent observations. The MvOI assimilation is found to improve upon the control run in generating the water masses with properties close to the observed, while the UOI failed to maintain the temperature and salinity structure.
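The role of the ensemble-derived cross-covariance can be illustrated with a minimal sketch, assuming a toy two-variable state [temperature, salinity] and a single temperature observation. The function names, the synthetic ensemble, and the zeroed-cross-term "univariate" analogue are illustrative assumptions and are not taken from the study, whose UOI scheme uses a Gaussian covariance function.

```python
import random


def ensemble_covariance(members):
    """Sample covariance matrix of an ensemble of state vectors."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / n for i in range(dim)]
    anomalies = [[m[i] - mean[i] for i in range(dim)] for m in members]
    return [[sum(a[i] * a[j] for a in anomalies) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def update_from_temperature(state, cov, t_obs, obs_var):
    """Update every state component from one temperature observation.

    Only component 0 (temperature) is observed, so the gain is the first
    column of the covariance divided by (forecast variance + obs variance).
    """
    innovation = t_obs - state[0]
    denom = cov[0][0] + obs_var
    return [x + cov[i][0] / denom * innovation for i, x in enumerate(state)]


# Hypothetical ensemble: salinity co-varies with temperature by construction.
rng = random.Random(0)
ensemble = []
for _ in range(200):
    t = 20.0 + rng.gauss(0.0, 1.0)
    s = 35.0 + 0.3 * (t - 20.0) + rng.gauss(0.0, 0.05)
    ensemble.append([t, s])

cov = ensemble_covariance(ensemble)
forecast = [20.0, 35.0]
multivariate = update_from_temperature(forecast, cov, t_obs=21.0, obs_var=0.25)

# Univariate analogue: drop the cross-covariance so only temperature changes.
cov_uni = [[cov[0][0], 0.0], [0.0, cov[1][1]]]
univariate = update_from_temperature(forecast, cov_uni, t_obs=21.0, obs_var=0.25)
```

In this toy setting both updates give the same temperature analysis, but only the multivariate update moves salinity, which is the qualitative behaviour the abstract attributes to the MvOI.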
NASA Astrophysics Data System (ADS)
Bril, A.; Oshchepkov, S.; Yokota, T.; Yoshida, Y.; Morino, I.; Uchino, O.; Belikov, D. A.; Maksyutov, S. S.
2014-12-01
We retrieved the column-averaged dry air mole fraction of atmospheric carbon dioxide (XCO2) and methane (XCH4) from the radiance spectra measured by the Greenhouse gases Observing SATellite (GOSAT) for 48 months of the satellite operation from June 2009. A recent version of the Photon path-length Probability Density Function (PPDF)-based algorithm was used to estimate XCO2 and optical path modifications in terms of PPDF parameters. We also present results of numerical simulations for over-land observations and "sharp edge" tests for sun-glint mode to discuss the algorithm accuracy under conditions of strong optical path modification. For the methane abundance retrieved from the 1.67-µm absorption band we applied an optical path correction based on PPDF parameters from the 1.6-µm carbon dioxide (CO2) absorption band. Similarly to the CO2-proxy technique, this correction assumes identical light path modifications in the 1.67-µm and 1.6-µm bands. However, the proxy approach needs pre-defined XCO2 values to compute XCH4, whilst the PPDF-based approach does not use prior assumptions on CO2 concentrations. Post-processing data correction for XCO2 and XCH4 over-land observations was performed using a regression matrix based on multivariate analysis of variance (MANOVA). The MANOVA statistics were applied to the GOSAT retrievals using reference collocated measurements of the Total Carbon Column Observing Network (TCCON). The regression matrix was constructed using the parameters that were found to correlate with GOSAT-TCCON discrepancies: PPDF parameters α and ρ, which are mainly responsible for shortening and lengthening of the optical path due to atmospheric light scattering; solar and satellite zenith angles; surface pressure; surface albedo in three GOSAT short wave infrared (SWIR) bands. 
Application of the post-correction generally improves statistical characteristics of the GOSAT-TCCON correlation diagrams for individual stations as well as for aggregated data. In addition to the analysis of the observations over 12 TCCON stations we estimated temporal and spatial trends (interannual XCO2 and XCH4 variations, seasonal cycles, latitudinal gradients) and compared them with modeled results as well as with similar estimates from other GOSAT retrievals.
NASA Astrophysics Data System (ADS)
Ghotbi, Saba; Sotoudeheian, Saeed; Arhami, Mohammad
2016-09-01
Satellite remote sensing products of AOD from MODIS along with appropriate meteorological parameters were used to develop statistical models and estimate ground-level PM10. Most previous studies obtained meteorological data from synoptic weather stations, with rather sparse spatial distribution, and used it along with the 10 km AOD product to develop statistical models applicable to PM variations at the regional scale (resolution of ≥10 km). In the current study, meteorological parameters were simulated at 3 km resolution using the WRF model and used along with the rather new 3 km AOD product (launched in 2014). The resulting PM statistical models were assessed for a polluted and highly variable urban area, Tehran, Iran. Despite the critical particulate pollution problem, very few PM studies have been conducted in this area. Direct PM-AOD associations were rather poor, owing to factors such as variations in particle optical properties and the bright-background issue for satellite data, as the studied area is located in the semi-arid region of the Middle East. A linear mixed effect (LME) statistical approach was used, and three types of statistical models were examined: a single-variable LME model (using AOD as the independent variable) and multiple-variable LME models using meteorological data from two sources, the WRF model and synoptic stations. Meteorological simulations were performed using a multiscale approach and a physics configuration appropriate for the studied region, and the results showed rather good agreement with recordings of the synoptic stations. The single-variable LME model was able to explain about 61%-73% of daily PM10 variations, reflecting a rather acceptable performance. Model performance improved with the multivariable LME model and the incorporation of meteorological data as auxiliary variables, particularly with the fine-resolution outputs from WRF (R2 = 0.73-0.81). 
In addition, PM estimates were mapped at rather fine resolution for the studied city, and the resulting concentration maps were consistent with PM recordings at the existing stations.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
2016-06-01
Multivariate Analysis of High Through-Put Adhesively Bonded Single Lap Joints: Experimental and Workflow Protocols, by Robert E Jensen, Daniel C DeSchepper, and David P Flanagan. US Army Research Laboratory, ...TR-7696, June 2016. List of Tables: Table 1 Single-lap-joint experimental parameters; Table 2 Survey ...
On Some Multiple Decision Problems
1976-08-01
parameter space. Some recent results in the area of subset selection formulation are Gnanadesikan and Gupta [28], Gupta and Studden [43], Gupta and...York, pp. 363-376. [27] Gnanadesikan, M. (1966). Some Selection and Ranking Procedures for Multivariate Normal Populations. Ph.D. Thesis. Dept. of...Statist., Purdue Univ., West Lafayette, Indiana 47907. [28] Gnanadesikan, M. and Gupta, S. S. (1970). Selection procedures for multivariate normal
Maione, Camila; Barbosa, Rommel Melgaço
2018-01-24
Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T² test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM out-performs other state-of-the-art methods. PMID:26527844
Ensemble-Based Parameter Estimation in a Coupled General Circulation Model
Liu, Y.; Liu, Z.; Zhang, S.; ...
2014-09-10
Parameter estimation provides a potentially powerful approach to reduce model bias for complex climate models. Here, in a twin experiment framework, the authors perform the first parameter estimation in a fully coupled ocean–atmosphere general circulation model using an ensemble coupled data assimilation system facilitated with parameter estimation. The authors first perform single-parameter estimation and then multiple-parameter estimation. In the case of the single-parameter estimation, the error of the parameter [solar penetration depth (SPD)] is reduced by over 90% after ~40 years of assimilation of the conventional observations of monthly sea surface temperature (SST) and salinity (SSS). The results of multiple-parameter estimation are less reliable than those of single-parameter estimation when only the monthly SST and SSS are assimilated. Assimilating additional observations of atmospheric data of temperature and wind improves the reliability of multiple-parameter estimation. The errors of the parameters are reduced by 90% in ~8 years of assimilation. Finally, the improved parameters also improve the model climatology. With the optimized parameters, the bias of the climatology of SST is reduced by ~90%. Altogether, this study suggests the feasibility of ensemble-based parameter estimation in a fully coupled general circulation model.
Bayesian Estimation of Random Coefficient Dynamic Factor Models
ERIC Educational Resources Information Center
Song, Hairong; Ferrer, Emilio
2012-01-01
Dynamic factor models (DFMs) have typically been applied to multivariate time series data collected from a single unit of study, such as a single individual or dyad. The goal of DFMs application is to capture dynamics of multivariate systems. When multiple units are available, however, DFMs are not suited to capture variations in dynamics across…
Parametric Cost Models for Space Telescopes
NASA Technical Reports Server (NTRS)
Stahl, H. Philip
2010-01-01
A study is in-process to develop a multivariable parametric cost model for space telescopes. Cost and engineering parametric data has been collected on 30 different space telescopes. Statistical correlations have been developed between 19 variables of 59 variables sampled. Single Variable and Multi-Variable Cost Estimating Relationships have been developed. Results are being published.
A Simplified, General Approach to Simulating from Multivariate Copula Functions
Barry Goodwin
2012-01-01
Copulas have become an important analytic tool for characterizing multivariate distributions and dependence. One is often interested in simulating data from copula estimates. The process can be analytically and computationally complex and usually involves steps that are unique to a given parametric copula. We describe an alternative approach that uses \\probability{...
ERIC Educational Resources Information Center
Lix, Lisa M.; Algina, James; Keselman, H. J.
2003-01-01
The approximate degrees of freedom Welch-James (WJ) and Brown-Forsythe (BF) procedures for testing within-subjects effects in multivariate groups by trials repeated measures designs were investigated under departures from covariance homogeneity and normality. Empirical Type I error and power rates were obtained for least-squares estimators and…
Controlled Multivariate Evaluation of Open Education: Application of a Critical Model.
ERIC Educational Resources Information Center
Sewell, Alan F.; And Others
This paper continues previous reports of a controlled multivariate evaluation of a junior high school open-education program. A new method of estimating program objectives and implementation is presented, together with the nature and degree of obtained student outcomes. Open-program students were found to approve more highly of their learning…
Model transformations for state-space self-tuning control of multivariable stochastic systems
NASA Technical Reports Server (NTRS)
Shieh, Leang S.; Bao, Yuan L.; Coleman, Norman P.
1988-01-01
The design of self-tuning controllers for multivariable stochastic systems is considered analytically. A long-division technique for finding the similarity transformation matrix and transforming the estimated left MFD to the right MFD is developed; the derivation is given in detail, and the procedures involved are briefly characterized.
Multivariate meta-analysis using individual participant data.
Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R
2015-06-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
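As a rough illustration of how a within-study correlation between two effect estimates can be obtained from IPD by bootstrapping, the sketch below resamples participants, recomputes both treatment effects on each resample, and correlates them. Everything here is an assumption made for illustration: the simulated trial, the two continuous blood-pressure-like outcomes, the difference-in-means effect estimate, and all function names. It is not the example software code that the abstract refers to.

```python
import random
from statistics import mean


def treatment_effects(rows):
    """Difference in means (treated minus control) for each of two outcomes."""
    treated = [r for r in rows if r[0] == 1]
    control = [r for r in rows if r[0] == 0]
    return tuple(mean(r[k] for r in treated) - mean(r[k] for r in control)
                 for k in (1, 2))


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_within_study_correlation(rows, n_boot, rng):
    """Correlation of the two effect estimates across bootstrap resamples."""
    effects = []
    while len(effects) < n_boot:
        sample = [rng.choice(rows) for _ in rows]
        if {r[0] for r in sample} == {0, 1}:  # skip a resample missing an arm
            effects.append(treatment_effects(sample))
    first, second = zip(*effects)
    return pearson(first, second)


# Hypothetical single trial: rows are (treatment, outcome 1, outcome 2),
# with the two outcomes sharing a participant-level component.
rng = random.Random(1)
rows = []
for i in range(200):
    treat = i % 2
    shared = rng.gauss(0.0, 1.0)
    y1 = 150.0 - 10.0 * treat + 8.0 * shared + rng.gauss(0.0, 4.0)
    y2 = 90.0 - 5.0 * treat + 5.0 * shared + rng.gauss(0.0, 3.0)
    rows.append((treat, y1, y2))

rho = bootstrap_within_study_correlation(rows, n_boot=300, rng=rng)
```

Resampling whole participants keeps each person's two outcomes together, which is what lets the correlation between the effect estimates be recovered.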
Winterer, G; Androsova, G; Bender, O; Boraschi, D; Borchers, F; Dschietzig, T B; Feinkohl, I; Fletcher, P; Gallinat, J; Hadzidiakos, D; Haynes, J D; Heppner, F; Hetzer, S; Hendrikse, J; Ittermann, B; Kant, I M J; Kraft, A; Krannich, A; Krause, R; Kühn, S; Lachmann, G; van Montfort, S J T; Müller, A; Nürnberg, P; Ofosu, K; Pietsch, M; Pischon, T; Preller, J; Renzulli, E; Scheurer, K; Schneider, R; Slooter, A J C; Spies, C; Stamatakis, E; Volk, H D; Weber, S; Wolf, A; Yürek, F; Zacharias, N
2018-04-01
Postoperative cognitive impairment is among the most common medical complications associated with surgical interventions - particularly in elderly patients. In our aging society, it is an urgent medical need to determine preoperative individual risk prediction to allow more accurate cost-benefit decisions prior to elective surgeries. So far, risk prediction is mainly based on clinical parameters. However, these parameters only give a rough estimate of the individual risk. At present, there are no molecular or neuroimaging biomarkers available to improve risk prediction and little is known about the etiology and pathophysiology of this clinical condition. In this short review, we summarize the current state of knowledge and briefly present the recently started BioCog project (Biomarker Development for Postoperative Cognitive Impairment in the Elderly), which is funded by the European Union. It is the goal of this research and development (R&D) project, which involves academic and industry partners throughout Europe, to deliver a multivariate algorithm based on clinical assessments as well as molecular and neuroimaging biomarkers to overcome the currently unsatisfying situation. Copyright © 2017. Published by Elsevier Masson SAS.
A semi-parametric within-subject mixture approach to the analyses of responses and response times.
Molenaar, Dylan; Bolsinova, Maria; Vermunt, Jeroen K
2018-05-01
In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to hardly produce false positives and parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach. © 2017 The British Psychological Society.
On construction of stochastic genetic networks based on gene expression sequences.
Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya
2005-08-01
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
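One way to read the O(n^2) parameter count is that a multivariate Markov model keeps one transition matrix per ordered pair of genes, n × n in total. The sketch below estimates such pairwise matrices by counting transitions. The binary states, the toy sequences, and the function names are hypothetical, and the weights that would combine the matrices into a full model are not estimated here.

```python
def transition_matrix(source, target, n_states=2):
    """Column-stochastic estimate of P(target[t+1] = i | source[t] = j)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(source) - 1):
        counts[target[t + 1]][source[t]] += 1
    matrix = []
    for i in range(n_states):
        row = []
        for j in range(n_states):
            col_total = sum(counts[k][j] for k in range(n_states))
            # Fall back to a uniform column if state j never occurs in source.
            row.append(counts[i][j] / col_total if col_total else 1.0 / n_states)
        matrix.append(row)
    return matrix


def pairwise_transitions(sequences, n_states=2):
    """One matrix per ordered pair (target gene j, source gene k)."""
    n = len(sequences)
    return {(j, k): transition_matrix(sequences[k], sequences[j], n_states)
            for j in range(n) for k in range(n)}


# Hypothetical binarized expression sequences for three genes.
genes = [
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0, 1],
]
matrices = pairwise_transitions(genes)
```

With three genes the dictionary holds nine matrices, so the storage grows quadratically in the number of genes rather than exponentially.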
ERIC Educational Resources Information Center
Zu, Jiyun; Yuan, Ke-Hai
2012-01-01
In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…
Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K
2017-01-01
The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology-sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
Geostatistics and petroleum geology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hohn, M.E.
1988-01-01
This book examines the purpose and use of geostatistics in exploration and development of oil and gas with an emphasis on appropriate and pertinent case studies. It presents an overview of geostatistics. Topics covered include: The semivariogram; Linear estimation; Multivariate geostatistics; Nonlinear estimation; From indicator variables to nonparametric estimation; and More detail, less certainty; conditional simulation.
Different methods of hilar clamping during partial nephrectomy: Impact on renal function.
Lee, Jeong Woo; Kim, Hwanik; Choo, Minsoo; Park, Yong Hyun; Ku, Ja Hyeon; Kim, Hyeon Hoe; Kwak, Cheol
2014-03-01
To evaluate the impact of different hilar clamping methods on changes in renal function after partial nephrectomy. We analyzed the clinical data of 369 patients who underwent partial nephrectomy for a single renal tumor of size ≤4.0 cm and a normal contralateral kidney. Patients were separated into three groups depending on hilar clamping method: non-clamping, cold ischemia and warm ischemia. Estimated glomerular filtration rate was examined at preoperative, nadir and 1 year postoperatively. Percent change in estimated glomerular filtration rate was used as the parameter to assess the renal functional outcome. Percent change in nadir estimated glomerular filtration rate in the non-clamping group was significantly less compared with the cold ischemia and warm ischemia groups (P < 0.001). However, no significant differences among the groups were noted in percent change of estimated glomerular filtration rate at 1 year (P = 0.348). The cold ischemia group had a similar serial change of postoperative renal function compared with the warm ischemia group. Percent change in 1-year estimated glomerular filtration rate increased with increasing ischemia time in the cold ischemia (P for trend = 0.073) and warm ischemia groups (P for trend = 0.010). On multivariate analysis, hilar clamping (both warm ischemia and cold ischemia) were significantly associated with percent change in nadir estimated glomerular filtration rate, but not in 1-year estimated glomerular filtration rate. Non-clamping partial nephrectomy results in a lower percent change in nadir estimated glomerular filtration rate, whereas it carries an estimated glomerular filtration rate change at 1 year that is similar to partial nephrectomy with cold ischemia and warm ischemia. Cold ischemia and warm ischemia provide a similar effect on renal function. Therefore, when hilar clamping is required, minimization of ischemia time is necessary. © 2013 The Japanese Urological Association.
Yue, Chen; Chen, Shaojie; Sair, Haris I; Airan, Raag; Caffo, Brian S
2015-09-01
Data reproducibility is a critical issue in all scientific experiments. In this manuscript, the problem of quantifying the reproducibility of graphical measurements is considered. The image intra-class correlation coefficient (I2C2) is generalized and the graphical intra-class correlation coefficient (GICC) is proposed for this purpose. The concept for GICC is based on multivariate probit-linear mixed effect models. A Markov Chain Monte Carlo EM (mcmcEM) algorithm is used for estimating the GICC. Simulation results with varied settings are demonstrated and our method is applied to the KIRBY21 test-retest dataset.
Pätzug, Konrad; Friedrich, Nele; Kische, Hanna; Hannemann, Anke; Völzke, Henry; Nauck, Matthias; Keevil, Brian G; Haring, Robin
2017-12-01
The present study investigates potential associations between liquid chromatography-mass spectrometry (LC-MS) measured sex hormones, dehydroepiandrosterone sulphate, sex hormone-binding globulin (SHBG) and bone ultrasound parameters at the heel in men and women from the general population. Data from 502 women and 425 men from the population-based Study of Health in Pomerania (SHIP-TREND) were used. Cross-sectional associations of sex hormones including testosterone (TT), calculated free testosterone (FT), dehydroepiandrosterone sulphate (DHEAS), androstenedione (ASD), estrone (E1) and SHBG with quantitative ultrasound (QUS) parameters at the heel, including broadband ultrasound attenuation (BUA), speed of sound (SOS) and stiffness index (SI) were examined by analysis of variance (ANOVA) and multivariable quantile regression models. Multivariable regression analysis showed a sex-specific inverse association of DHEAS with SI in men (Beta per SI unit = - 3.08, standard error (SE) = 0.88), but not in women (Beta = - 0.01, SE = 2.09). Furthermore, FT was positively associated with BUA in men (Beta per BUA unit = 29.0, SE = 10.1). None of the other sex hormones (ASD, E1) or SHBG was associated with QUS parameters after multivariable adjustment. This cross-sectional population-based study revealed independent associations of DHEAS and FT with QUS parameters in men, suggesting a potential influence on male bone metabolism. The predictive role of DHEAS and FT as a marker for osteoporosis in men warrants further investigation in clinical trials and large-scale observational studies.
NASA Astrophysics Data System (ADS)
Bagán, H.; Tarancón, A.; Rauret, G.; García, J. F.
2008-07-01
The quenching parameters used to model detection efficiency variations in scintillation measurements have not evolved since the 1970s. Meanwhile, computer capabilities have increased enormously and ionization quenching has appeared in practical measurements using plastic scintillation. This study compares the results obtained in activity quantification by plastic scintillation of 14C samples that contain colour and ionization quenchers, using classical (SIS, SCR-limited, SCR-non-limited, SIS(ext), SQP(E)) and evolved (MWA-SCR and WDW) parameters and following three calibration approaches: single step, which does not take into account the quenching mechanism; two steps, which takes into account the quenching phenomena; and multivariate calibration. Two-step calibration (ionization followed by colour) yielded the lowest relative errors, which means that each quenching phenomenon must be specifically modelled. In addition, the sample activity was quantified more accurately when the evolved parameters were used. Multivariate calibration-PLS also yielded better results than those obtained using classical parameters, which confirms that the quenching phenomena must be taken into account. The detection limits for each calibration method and each parameter were close to those obtained theoretically using the Currie approach.
Estimation of Genetic Parameters from Longitudinal Records of Body Weight of Berkshire Pigs
Lee, Dong-Hee; Do, Chang-Hee
2012-01-01
Direct and maternal genetic heritabilities and their correlations with body weight at 5 stages in the life span of purebred Berkshire pigs, from birth to harvest, were estimated to scrutinize body weight development with the records for 5,088 purebred Berkshire pigs in a Korean farm, using the REML based on an animal model. Body weights were measured at birth (Birth), at weaning (Weaning: mean 22.9 d), at the beginning of a performance test (On: mean 72.7 d), at the end of a performance test (Off: mean 152.4 d), and at harvest (Finish: mean 174.3 d). Ordinary polynomials and Legendre with order 1, 2, and 3 were adopted to adjust body weight with age in the multivariate animal models. Legendre with order 3 fitted best concerning prediction error deviation (PED) and yielded the lowest AIC for multivariate analysis of longitudinal body weights. Direct genetic correlations between body weight at Birth and body weight at Weaning, On, Off, and Finish were 0.48, 0.36, 0.10, and 0.10, respectively. The estimated maternal genetic correlations of body weight at Finish with body weight at Birth, Weaning, On, and Off were 0.39, 0.49, 0.65, and 0.90, respectively. Direct genetic heritabilities progressively increased from birth to harvest and were 0.09, 0.11, 0.20, 0.31, and 0.43 for body weight at Birth, Weaning, On, Off, and Finish, respectively. Maternal genetic heritabilities generally decreased and were 0.26, 0.34, 0.15, 0.10, and 0.10 for body weight at Birth, Weaning, On, Off, and Finish, respectively. As pigs age, maternal genetic effects on growth are reduced and pigs begin to rely more on the expression of their own genes. Although maternal genetic effects on body weight may not be large, they are sustained through life. PMID:25049624
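To make the Legendre age covariates concrete, the sketch below evaluates Legendre polynomials up to order 3 at the five mean measurement ages after rescaling age to [-1, 1]. The rescaling step, the use of day 0 for Birth, and the function names are assumptions for illustration; the abstract does not describe how the covariates were constructed or normalized.

```python
def legendre_basis(x, order):
    """Values P_0(x), ..., P_order(x) computed with Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def standardize(age, age_min, age_max):
    """Map an age in [age_min, age_max] linearly onto [-1, 1]."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)


# Mean ages in days at Birth, Weaning, On, Off, and Finish (Birth taken as 0).
mean_ages = [0.0, 22.9, 72.7, 152.4, 174.3]
design = [
    legendre_basis(standardize(age, mean_ages[0], mean_ages[-1]), order=3)
    for age in mean_ages
]
```

Each row of `design` holds the four covariate values for one measurement stage, which is the form in which such polynomials usually enter a regression on age.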
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g. simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence, plotable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from a MGLMM, provide a good approximation and visual representation of these latent association analyses using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients that were determined by naively assuming normality of empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing outcomes that are interrelated with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage results in different mean and covariance parameter estimates from the maximum likelihood estimates generated by a MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. 
Thus if computable, scatterplots of the conditionally independent empirical Bayes predictors from a MGLMM are always preferable to scatterplots of empirical Bayes predictors generated by separate models, unless the true association between outcomes is zero.
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, labile trace element contents (hence thermal histories) of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Estimation of railroad capacity using parametric methods.
DOT National Transportation Integrated Search
2013-12-01
This paper reviews different methodologies used for railroad capacity estimation and presents a user-friendly method to measure capacity. The objective of this paper is to use multivariate regression analysis to develop a continuous relation of the d...
Fusion of AIRSAR and TM Data for Parameter Classification and Estimation in Dense and Hilly Forests
NASA Technical Reports Server (NTRS)
Moghaddam, Mahta; Dungan, J. L.; Coughlan, J. C.
2000-01-01
The expanded remotely sensed data space consisting of coincident radar backscatter and optical reflectance data provides for a more complete description of the Earth surface. This is especially useful where many parameters are needed to describe a certain scene, such as in the presence of dense and complex-structured vegetation or where there is considerable underlying topography. The goal of this paper is to use a combination of radar and optical data to develop a methodology for parameter classification for dense and hilly forests, and further, class-specific parameter estimation. The area to be used in this study is the H. J. Andrews Forest in Oregon, one of the Long-Term Ecological Research (LTER) sites in the US. This area consists of various dense old-growth conifer stands, and contains significant topographic relief. The Andrews forest has been the subject of many ecological studies over several decades, resulting in an abundance of ground measurements. Recently, biomass and leaf-area index (LAI) values for approximately 30 reference stands have also become available which span a large range of those parameters. The remote sensing data types to be used are the C-, L-, and P-band polarimetric radar data from the JPL airborne SAR (AIRSAR), the C-band single-polarization data from the JPL topographic SAR (TOPSAR), and the Thematic Mapper (TM) data from Landsat, all acquired in late April 1998. The total number of useful independent data channels from the AIRSAR is 15 (three frequencies, each with three unique polarizations and amplitude and phase of the like-polarized correlation), from the TOPSAR is 2 (amplitude and phase of the interferometric correlation), and from the TM is 6 (the thermal band is not used). The range pixel spacing of the AIRSAR is 3.3m for C- and L-bands and 6.6m for P-band. The TOPSAR pixel spacing is 10m, and the TM pixel size is 30m. 
To achieve parameter classification, first a number of parameters are defined which are of interest to ecologists for forest process modeling. These parameters include total biomass, leaf biomass, LAI, and tree height. The remote sensing data from radar and TM are used to formulate a multivariate analysis problem given the ground measurements of the parameters. Each class of each parameter is defined by a probability density function (pdf), the spread of which defines the range of that class. High classification accuracy results from situations in which little overlap occurs between pdfs. Classification results provide the basis for the future work of class-specific parameter estimation using radar and optical data. This work was performed in part by the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, and in part by the NASA Ames Research Center, Moffett Field, CA, both under contract from the National Aeronautics and Space Administration.
Norby, Richard J.; Gu, Lianhong; Haworth, Ivan C.; ...
2016-11-21
Here, our objective was to analyze and summarize data describing photosynthetic parameters and foliar nutrient concentrations from tropical forests in Panama to inform model representation of phosphorus (P) limitation of tropical forest productivity. Gas exchange and nutrient content data were collected from 144 observations of upper canopy leaves from at least 65 species at two forest sites in Panama, differing in species composition, rainfall and soil fertility. Photosynthetic parameters were derived from analysis of assimilation rate vs internal CO2 concentration curves (A/Ci), and relationships with foliar nitrogen (N) and P content were developed. The relationships between area-based photosynthetic parameters and nutrients were of similar strength for N and P and robust across diverse species and site conditions. The strongest relationship expressed maximum electron transport rate (Jmax) as a multivariate function of both N and P, and this relationship was improved with the inclusion of independent data on wood density. Models that estimate photosynthesis from foliar N would be improved only modestly by including additional data on foliar P, but doing so may increase the capability of models to predict future conditions in P-limited tropical forests, especially when combined with data on edaphic conditions and other environmental drivers.
Haller, Florian; Zhang, Jitao David; Moskalev, Evgeny A; Braun, Alexander; Otto, Claudia; Geddert, Helene; Riazalhosseini, Yasser; Ward, Aoife; Balwierz, Aleksandra; Schaefer, Inga-Marie; Cameron, Silke; Ghadimi, B Michael; Agaimy, Abbas; Fletcher, Jonathan A; Hoheisel, Jörg; Hartmann, Arndt; Werner, Martin; Wiemann, Stefan; Sahin, Ozgür
2015-03-01
Gastrointestinal stromal tumors (GISTs) have distinct gene expression patterns according to localization, genotype and aggressiveness. DNA methylation at CpG dinucleotides is an important mechanism for regulation of gene expression. We performed targeted DNA methylation analysis of 1,505 CpG loci in 807 cancer-related genes in a cohort of 76 GISTs, combined with genome-wide mRNA expression analysis in 22 GISTs, to identify signatures associated with clinicopathological parameters and prognosis. Principal component analysis revealed distinct DNA methylation patterns associated with anatomical localization, genotype, mitotic counts and clinical follow-up. Methylation of a single CpG dinucleotide in the non-CpG island promoter of SPP1 was significantly correlated with shorter disease-free survival. Hypomethylation of this CpG was an independent prognostic parameter in a multivariate analysis compared to anatomical localization, genotype, tumor size and mitotic counts in a cohort of 141 GISTs with clinical follow-up. The epigenetic regulation of SPP1 was confirmed in vitro, and the functional impact of SPP1 protein on tumorigenesis-related signaling pathways was demonstrated. In summary, SPP1 promoter methylation is a novel and independent prognostic parameter in GISTs, and might be helpful in estimating the aggressiveness of GISTs from the intermediate-risk category. © 2014 UICC.
NASA Technical Reports Server (NTRS)
Hague, D. S.; Merz, A. W.
1975-01-01
Multivariable search techniques are applied to a particular class of airfoil optimization problems. These are the maximization of lift and the minimization of disturbance pressure magnitude in an inviscid nonlinear flow field. A variety of multivariable search techniques contained in an existing nonlinear optimization code, AESOP, are applied to this design problem. These techniques include elementary single parameter perturbation methods, organized search such as steepest-descent, quadratic, and Davidon methods, randomized procedures, and a generalized search acceleration technique. Airfoil design variables are seven in number and define perturbations to the profile of an existing NACA airfoil. The relative efficiency of the techniques is compared. It is shown that elementary one-parameter-at-a-time and random techniques compare favorably with organized searches in the class of problems considered. It is also shown that significant reductions in disturbance pressure magnitude can be made while retaining reasonable lift coefficient values at low free stream Mach numbers.
Fast computation of the multivariable stability margin for real interrelated uncertain parameters
NASA Technical Reports Server (NTRS)
Sideris, Athanasios; Sanchez Pena, Ricardo S.
1988-01-01
A novel algorithm for computing the multivariable stability margin for checking the robust stability of feedback systems with real parametric uncertainty is proposed. This method eliminates the need for the frequency search involved in another given algorithm by reducing it to checking a finite number of conditions. These conditions have a special structure, which allows a significant improvement on the speed of computations.
Fixed order dynamic compensation for multivariable linear systems
NASA Technical Reports Server (NTRS)
Kramer, F. S.; Calise, A. J.
1986-01-01
This paper considers the design of fixed order dynamic compensators for multivariable time invariant linear systems, minimizing a linear quadratic performance cost functional. Attention is given to robustness issues in terms of multivariable frequency domain specifications. An output feedback formulation is adopted by suitably augmenting the system description to include the compensator states. Either a controller or observer canonical form is imposed on the compensator description to reduce the number of free parameters to its minimal number. The internal structure of the compensator is prespecified by assigning a set of ascending feedback invariant indices, thus forming a Brunovsky structure for the nominal compensator.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Y.; Liu, Z.; Zhang, S.
Parameter estimation provides a potentially powerful approach to reduce model bias for complex climate models. Here, in a twin experiment framework, the authors perform the first parameter estimation in a fully coupled ocean–atmosphere general circulation model using an ensemble coupled data assimilation system facilitated with parameter estimation. The authors first perform single-parameter estimation and then multiple-parameter estimation. In the case of the single-parameter estimation, the error of the parameter [solar penetration depth (SPD)] is reduced by over 90% after ~40 years of assimilation of the conventional observations of monthly sea surface temperature (SST) and salinity (SSS). The results of multiple-parameter estimation are less reliable than those of single-parameter estimation when only the monthly SST and SSS are assimilated. Assimilating additional observations of atmospheric data of temperature and wind improves the reliability of multiple-parameter estimation. The errors of the parameters are reduced by 90% in ~8 years of assimilation. Finally, the improved parameters also improve the model climatology. With the optimized parameters, the bias of the climatology of SST is reduced by ~90%. Altogether, this study suggests the feasibility of ensemble-based parameter estimation in a fully coupled general circulation model.
A Self-Organizing Maps approach to assess the wave climate of the Adriatic Sea
NASA Astrophysics Data System (ADS)
Barbariol, Francesco; Marcello Falcieri, Francesco; Scotton, Carlotta; Benetazzo, Alvise; Bergamasco, Andrea; Bergamasco, Filippo; Bonaldo, Davide; Carniel, Sandro; Sclavo, Mauro
2015-04-01
The assessment of wave conditions at sea is fruitful for many research fields in marine and atmospheric sciences and for human activities in the marine environment. To this end, in the last decades the observational network, which mostly relies on buoys, satellites and other probes from fixed platforms, has been integrated with numerical model outputs, which allow computation of the parameters of sea states (e.g. the significant wave height, the mean and peak wave periods, the mean and peak wave directions) over wider regions. Apart from the collection of wave parameters observed at specific sites or modeled on arbitrary domains, the data processing performed to infer the wave climate at those sites is a crucial step in order to provide high quality data and information to the community. In this context, several statistical techniques have been used to model the randomness of wave parameters. While univariate and bivariate probability distribution functions (pdf) are routinely used, multivariate pdfs that model the probability structure of more than two wave parameters are hardly managed. Recently, the Self-Organizing Maps (SOM) technique has been successfully applied to represent the multivariate random wave climate at sites around the Iberian peninsula and the South American continent. Indeed, the visualization properties offered by this technique allow the dependencies between the different parameters to be identified by visual inspection. In this study, carried out in the frame of the Italian National Flagship Project "RITMARE", we take advantage of the SOM technique to assess the multivariate wave climate over the Adriatic Sea, a semi-enclosed basin in the north-eastern Mediterranean Sea, where winds from the North-East (called "Bora") and South-East (called "Sirocco") mainly blow, causing sea storms. By means of the SOM technique we can observe the multivariate character of the typical Bora and Sirocco wave features in the Adriatic Sea. 
To this end, we used both observed and modeled wave parameters. The "Acqua Alta" oceanographic tower in the northern Adriatic Sea (ISMAR-CNR) and the Italian Data Buoy Network (RON, managed by ISPRA) off the western Adriatic coasts furnished the wave parameters at specific sites of interest. Widespread wave parameters were obtained by means of a numerical SWAN wave model that was implemented on the whole Adriatic Sea with a 6x6 km2 resolution and forced by the high resolution COSMO-I7 atmospheric model for the period 2007-2013.
Nguyen, Nguyen H.; Hamzah, Azhar; Thoa, Ngo P.
2017-01-01
The extent to which genetic gain achieved from selection programs under strictly controlled environments in the nucleus can be expressed in commercial production systems is not well-documented in aquaculture species. The main aim of this paper was to assess the effects of genotype by environment interaction on genetic response and genetic parameters for four body traits (harvest weight, standard length, body depth, body width) and survival in Red tilapia (Oreochromis spp.). The growth and survival data were recorded on 19,916 individual fish from a pedigreed population undergoing three generations of selection for increased harvest weight in earthen ponds from 2010 to 2012 at the Aquaculture Extension Center, Department of Fisheries, Jitra in Kedah, Malaysia. The pedigree comprised a total of 224 sires and 262 dams, tracing back to the base population in 2009. A multivariate animal model was used to measure genetic response and estimate variance and covariance components. When the homologous body traits in freshwater pond and cage were treated as genetically distinct traits, the genetic correlations between the two environments were high (0.85–0.90) for harvest weight and square root of harvest weight but the estimates were of lower magnitudes for length, width and depth (0.63–0.79). The heritabilities estimated for the five traits studied differed between pond (0.02 to 0.22) and cage (0.07 to 0.68). The common full-sib effects were large, ranging from 0.23 to 0.59 in pond and 0.11 to 0.31 in cage across all traits. The direct and correlated responses for four body traits were generally greater in pond than in cage environments (0.011–1.561 vs. −0.033–0.567 genetic standard deviation units, respectively). Selection for increased harvest body weight resulted in positive genetic changes in survival rate in both pond and cage culture. 
In conclusion, the reduced selection response and the magnitude of the genetic parameter estimates in the production environment (i.e., cage) relative to those achieved in the nucleus (pond) were a result of the genotype by environment interaction and this effect should be taken into consideration in the future breeding program for Red tilapia. PMID:28659970
The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples
ERIC Educational Resources Information Center
Avetisyan, Marianna; Fox, Jean-Paul
2012-01-01
In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…
Denis Valle; Benjamin Baiser; Christopher W. Woodall; Robin Chazdon; Jerome Chave
2014-01-01
We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates...
2011-01-01
where r << P. The use of PCA for finding outliers in multivariate data is surveyed by Gnanadesikan and Kettenring [16] and Rao [17]. As alluded to earlier...1984. 16. Gnanadesikan R and Kettenring JR. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 1972; 28: 81–124
NASA Astrophysics Data System (ADS)
Chen, Yi-Ying; Chu, Chia-Ren; Li, Ming-Hsu
2012-10-01
In this paper we present a semi-parametric multivariate gap-filling model for tower-based measurement of latent heat flux (LE). Two statistical techniques, principal component analysis (PCA) and a nonlinear interpolation approach, were integrated into this LE gap-filling model. The PCA was first used to resolve the multicollinearity relationships among various environmental variables, including radiation, soil moisture deficit, leaf area index, wind speed, etc. Two nonlinear interpolation methods, multiple regressions (MRS) and K-nearest neighbors (KNN), were examined with randomly selected flux gaps for both clear sky and nighttime/cloudy data to incorporate into this LE gap-filling model. Experimental results indicated that the KNN interpolation approach is able to provide consistent LE estimations while MRS overestimates during nighttime/cloudy conditions. Rather than using empirical regression parameters, the KNN approach resolves the nonlinear relationship between the gap-filled LE flux and principal components with adaptive K values under different atmospheric states. The developed LE gap-filling model (PCA with KNN) works with a RMSE of 2.4 W m⁻² (~0.09 mm day⁻¹) at a weekly time scale when 40% artificial flux gaps are added to the original dataset. Annual evapotranspiration at this study site was estimated at 736 mm (1803 MJ) and 728 mm (1785 MJ) for 2008 and 2009, respectively.
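The KNN interpolation step described in this abstract can be illustrated with a minimal sketch. It assumes the predictors have already been reduced to principal-component scores and uses a fixed k, whereas the abstract describes adaptive K values; the function name `knn_fill` and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def knn_fill(train_scores, train_le, query_scores, k=3):
    """Estimate a missing LE value as the mean LE of the k nearest
    complete records, measured by Euclidean distance in PC-score space.
    Illustrative only: fixed k, unweighted mean."""
    ranked = sorted(
        (math.dist(scores, query_scores), le)
        for scores, le in zip(train_scores, train_le)
    )
    nearest = ranked[:k]
    return sum(le for _, le in nearest) / len(nearest)

# Toy data: four complete records (two PC scores each) and their LE values.
pc_scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
le_values = [10.0, 20.0, 30.0, 1000.0]
filled = knn_fill(pc_scores, le_values, (0.1, 0.1), k=3)
```

The distant fourth record does not influence the estimate here, which is the property that lets a nearest-neighbour fill adapt to the local atmospheric state rather than to a single global regression.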
Kalman filter for statistical monitoring of forest cover across sub-continental regions
Raymond L. Czaplewski
1991-01-01
The Kalman filter is a multivariate generalization of the composite estimator which recursively combines a current direct estimate with a past estimate that is updated for expected change over time with a prediction model. The Kalman filter can estimate proportions of different cover types for sub-continental regions each year. A random sample of high-resolution...
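The recursive combination of a past estimate with a current direct estimate can be sketched for a single cover-type proportion. This is a scalar illustration under assumed variances, not the multivariate estimator of the report; the function names and all numbers are hypothetical.

```python
def kalman_predict(estimate, variance, expected_change, process_variance):
    """Update last year's estimate for expected change; the uncertainty
    grows by the prediction-model (process) variance."""
    return estimate + expected_change, variance + process_variance

def kalman_update(prior_estimate, prior_variance, direct_estimate, direct_variance):
    """Combine the predicted estimate with this year's direct estimate,
    weighting each inversely to its variance (composite estimator)."""
    gain = prior_variance / (prior_variance + direct_variance)
    estimate = prior_estimate + gain * (direct_estimate - prior_estimate)
    variance = (1.0 - gain) * prior_variance
    return estimate, variance

# Hypothetical forest-cover proportion: prior 0.50, direct sample estimate 0.60,
# both with variance 0.04, so the two sources receive equal weight.
combined, combined_var = kalman_update(0.50, 0.04, 0.60, 0.04)
```

With equal variances the result is the simple average of the two estimates and the variance is halved, which is the sense in which the filter generalizes the composite estimator.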
Cerebral metastases in metastatic breast cancer: disease-specific risk factors and survival.
Heitz, F; Rochon, J; Harter, P; Lueck, H-J; Fisseler-Eckhoff, A; Barinoff, J; Traut, A; Lorenz-Salehi, F; du Bois, A
2011-07-01
Survival of patients suffering from cerebral metastases (CM) is limited. Identification of patients with a high risk for CM is warranted to adjust follow-up care and to evaluate preventive strategies. Exploratory analysis of disease-specific parameters in patients with metastatic breast cancer (MBC) treated between 1998 and 2008 was performed using cumulative incidences and Fine and Gray's multivariable regression analyses. After a median follow-up of 4.0 years, 66 patients (10.5%) developed CM. The estimated probability for CM was 5%, 12% and 15% at 1, 5 and 10 years; in contrast, the probability of death without CM was 21%, 61% and 76%, respectively. A small tumor size, ER status, ductal histology, lung and lymph node metastases, human epidermal growth factor receptor 2 positive (HER2+) tumors, younger age and M0 were associated with CM in univariate analyses, the latter three being risk factors in the multivariable model. Survival was shortened in patients developing CM (24.0 months) compared with patients with no CM (33.6 months) in the course of MBC. Young patients, patients with primary non-metastatic disease and patients with HER2+ tumors have a high risk of developing CM in MBC. Survival of patients developing CM in the course of MBC is impaired compared with patients without CM.
Factors Controlling Sediment Load in The Central Anatolia Region of Turkey: Ankara River Basin.
Duru, Umit; Wohl, Ellen; Ahmadi, Mehdi
2017-05-01
Better understanding of the factors controlling sediment load at a catchment scale can facilitate estimation of soil erosion and sediment transport rates. The research summarized here enhances understanding of correlations between potential control variables on suspended sediment loads. The Soil and Water Assessment Tool was used to simulate flow and sediment at the Ankara River basin. Multivariable regression analysis and principal component analysis were then performed between sediment load and controlling variables. The physical variables were either directly derived from a Digital Elevation Model or from field maps or computed using established equations. Mean observed sediment rate is 6697 ton/year and mean sediment yield is 21 ton/y/km² from the gage. The Soil and Water Assessment Tool satisfactorily simulated observed sediment load with Nash-Sutcliffe efficiency, relative error, and coefficient of determination (R²) values of 0.81, -1.55, and 0.93, respectively, in the catchment. Therefore, parameter values from the physically based model were applied to the multivariable regression analysis as well as principal component analysis. The results indicate that stream flow, drainage area, and channel width explain most of the variability in sediment load among the catchments. The results imply that efficient siltation management practices in the catchment should target stream flow, drainage area, and channel width.
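The goodness-of-fit measures quoted in this abstract can be computed as in the sketch below. The Nash-Sutcliffe efficiency follows its standard definition; taking R² as the squared Pearson correlation between observed and simulated values is an assumption about the authors' usage, and the function names and sample loads are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).
    1.0 is a perfect fit; 0.0 means no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / total

def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)

# Hypothetical sediment loads: a simulation that is perfectly correlated
# with the observations but biased by a factor of two.
obs = [1.0, 2.0, 3.0]
sim = [2.0, 4.0, 6.0]
nse_biased = nash_sutcliffe(obs, sim)
r2_biased = r_squared(obs, sim)
```

The biased simulation has R² equal to 1 but a strongly negative NSE, which is why the two statistics are usually reported together.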
Factors Controlling Sediment Load in The Central Anatolia Region of Turkey: Ankara River Basin
NASA Astrophysics Data System (ADS)
Duru, Umit; Wohl, Ellen; Ahmadi, Mehdi
2017-05-01
Better understanding of the factors controlling sediment load at a catchment scale can facilitate estimation of soil erosion and sediment transport rates. The research summarized here enhances understanding of correlations between potential control variables on suspended sediment loads. The Soil and Water Assessment Tool was used to simulate flow and sediment at the Ankara River basin. Multivariable regression analysis and principal component analysis were then performed between sediment load and controlling variables. The physical variables were either directly derived from a Digital Elevation Model or from field maps or computed using established equations. Mean observed sediment rate is 6697 ton/year and mean sediment yield is 21 ton/y/km² from the gage. The Soil and Water Assessment Tool satisfactorily simulated observed sediment load with Nash-Sutcliffe efficiency, relative error, and coefficient of determination (R²) values of 0.81, -1.55, and 0.93, respectively, in the catchment. Therefore, parameter values from the physically based model were applied to the multivariable regression analysis as well as principal component analysis. The results indicate that stream flow, drainage area, and channel width explain most of the variability in sediment load among the catchments. The results imply that efficient siltation management practices in the catchment should target stream flow, drainage area, and channel width.
Multivariable harmonic balance analysis of the neuronal oscillator for leech swimming.
Chen, Zhiyong; Zheng, Min; Friesen, W Otto; Iwasaki, Tetsuya
2008-12-01
Biological systems, and particularly neuronal circuits, embody a very high level of complexity. Mathematical modeling is therefore essential for understanding how large sets of neurons with complex multiple interconnections work as a functional system. With the increase in computing power, it is now possible to numerically integrate a model with many variables to simulate behavior. However, such analysis can be time-consuming and may not reveal the mechanisms underlying the observed phenomena. An alternative, complementary approach is mathematical analysis, which can demonstrate direct and explicit relationships between a property of interest and system parameters. This paper introduces a mathematical tool for analyzing neuronal oscillator circuits based on multivariable harmonic balance (MHB). The tool is applied to a model of the central pattern generator (CPG) for leech swimming, which comprises a chain of weakly coupled segmental oscillators. The results demonstrate the effectiveness of the MHB method and provide analytical explanations for some CPG properties. In particular, the intersegmental phase lag is estimated to be the sum of a nominal value and a perturbation, where the former depends on the structure and span of the neuronal connections and the latter is roughly proportional to the period gradient, communication delay, and the reciprocal of the intersegmental coupling strength.
Wang, Ming; Li, Zheng; Lee, Eun Young; Lewis, Mechelle M; Zhang, Lijun; Sterling, Nicholas W; Wagner, Daymond; Eslinger, Paul; Du, Guangwei; Huang, Xuemei
2017-09-25
It is challenging for current statistical models to predict clinical progression of Parkinson's disease (PD) because of the involvement of multi-domains and longitudinal data. Past univariate longitudinal or multivariate analyses from cross-sectional trials have limited power to predict individual outcomes or a single moment. The multivariate generalized linear mixed-effect model (GLMM) under the Bayesian framework was proposed to study multi-domain longitudinal outcomes obtained at baseline, 18, and 36 months. The outcomes included motor, non-motor, and postural instability scores from the MDS-UPDRS, and demographic and standardized clinical data were utilized as covariates. The dynamic prediction was performed for both internal and external subjects using the samples from the posterior distributions of the parameter estimates and random effects, and the predictive accuracy was evaluated based on the root of mean square error (RMSE), absolute bias (AB) and the area under the receiver operating characteristic (ROC) curve. First, our prediction model identified clinical data that were differentially associated with motor, non-motor, and postural stability scores. Second, the predictive accuracy of our model for the training data was assessed, and improved prediction was gained particularly for non-motor scores (RMSE and AB: 2.89 and 2.20) compared to univariate analysis (RMSE and AB: 3.04 and 2.35). Third, the individual-level predictions of longitudinal trajectories for the testing data were performed, with ~80% of observed values falling within the 95% credible intervals. Multivariate general mixed models hold promise to predict clinical progression of individual outcomes in PD. The data were obtained from Dr. Xuemei Huang's NIH grant R01 NS060722, part of the NINDS PD Biomarker Program (PDBP). All data were entered within 24 h of collection to the Data Management Repository (DMR), which is publicly available (https://pdbp.ninds.nih.gov/data-management).
Application of two tests of multivariate discordancy to fisheries data sets
Stapanian, M.A.; Kocovsky, P.M.; Garner, F.C.
2008-01-01
The generalized (Mahalanobis) distance and multivariate kurtosis are two powerful tests of multivariate discordancies (outliers). Unlike the generalized distance test, the multivariate kurtosis test has not been applied as a test of discordancy to fisheries data heretofore. We applied both tests, along with published algorithms for identifying suspected causal variable(s) of discordant observations, to two fisheries data sets from Lake Erie: total length, mass, and age from 1,234 burbot, Lota lota; and 22 combinations of unique subsets of 10 morphometrics taken from 119 yellow perch, Perca flavescens. For the burbot data set, the generalized distance test identified six discordant observations and the multivariate kurtosis test identified 24 discordant observations. In contrast with the multivariate tests, the univariate generalized distance test identified no discordancies when applied separately to each variable. Removing discordancies had a substantial effect on length-versus-mass regression equations. For 500-mm burbot, the percent difference in estimated mass after removing discordancies in our study was greater than the percent difference in masses estimated for burbot of the same length in lakes that differed substantially in productivity. The number of discordant yellow perch detected ranged from 0 to 2 with the multivariate generalized distance test and from 6 to 11 with the multivariate kurtosis test. With the kurtosis test, 108 yellow perch (90.7%) were identified as discordant in zero to two combinations, and five (4.2%) were identified as discordant in either all or 21 of the 22 combinations. The relationship among the variables included in each combination determined which variables were identified as causal. The generalized distance test identified between zero and six discordancies when applied separately to each variable. 
Removing the discordancies found in at least one-half of the combinations (k=5) had a marked effect on a principal components analysis. In particular, the percent of the total variation explained by second and third principal components, which explain shape, increased by 52 and 44% respectively when the discordancies were removed. Multivariate applications of the tests have numerous ecological advantages over univariate applications, including improved management of fish stocks and interpretation of multivariate morphometric data. © 2007 Springer Science+Business Media B.V.
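The generalized (Mahalanobis) distance underlying the first test can be sketched for two variables. This illustration only computes the squared distances from the sample mean and covariance; it does not reproduce the critical values or the causal-variable algorithms used in the study, and the function name and the five data points are hypothetical.

```python
def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample
    mean, using the sample covariance matrix (n - 1 denominator)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d' S^-1 d written out with the closed-form 2x2 inverse
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical (length, mass) style pairs; the last point sits apart from the rest.
sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (10.0, 0.0)]
d2 = mahalanobis_sq_2d(sample)
most_discordant = d2.index(max(d2))
```

Because the distance accounts for the correlation between variables, a point can be flagged by the bivariate measure even when neither coordinate is extreme on its own, which is consistent with the contrast between multivariate and univariate results reported above.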
Tracking the time-varying cortical connectivity patterns by adaptive multivariate estimators.
Astolfi, L; Cincotti, F; Mattia, D; De Vico Fallani, F; Tocci, A; Colosimo, A; Salinari, S; Marciani, M G; Hesse, W; Witte, H; Ursino, M; Zavaglia, M; Babiloni, F
2008-03-01
The directed transfer function (DTF) and the partial directed coherence (PDC) are frequency-domain estimators that are able to describe interactions between cortical areas in terms of the concept of Granger causality. However, the classical estimation of these methods is based on the multivariate autoregressive modelling (MVAR) of time series, which requires the stationarity of the signals. In this way, transient pathways of information transfer remain hidden. The objective of this study is to test a time-varying multivariate method for the estimation of rapidly changing connectivity relationships between cortical areas of the human brain, based on DTF/PDC and on the use of adaptive MVAR modelling (AMVAR), and to apply it to a set of real high resolution EEG data. This approach will allow the observation of rapidly changing influences between the cortical areas during the execution of a task. The simulation results indicated that time-varying DTF and PDC are able to estimate correctly the imposed connectivity patterns under reasonable operative conditions of signal-to-noise ratio (SNR) and number of trials. An SNR of five and a number of trials of at least 20 provide a good accuracy in the estimation. After testing the method by the simulation study, we provide an application to the cortical estimations obtained from high resolution EEG data recorded from a group of healthy subjects during a combined foot-lips movement and present the time-varying connectivity patterns resulting from the application of both DTF and PDC. Two different cortical networks were detected with the proposed methods, one constant across the task and the other evolving during the preparation of the joint movement.
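How the DTF is obtained from MVAR coefficients can be sketched for two channels. The example uses a stationary model with fixed coefficients and the usual row-normalized definition of the DTF; it does not implement the adaptive (time-varying) estimation studied in the paper, and the function name and coefficient values are hypothetical.

```python
import cmath

def dtf_2ch(ar_coeffs, freq):
    """Normalized DTF at one frequency (cycles per sample) for a 2-channel
    model x(t) = sum_k A_k x(t-k) + e(t). Entry [i][j] is the influence of
    channel j on channel i: |H_ij|^2 / sum_m |H_im|^2, where H = A(f)^-1
    and A(f) = I - sum_k A_k exp(-2*pi*i*f*k)."""
    a = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for lag, mat in enumerate(ar_coeffs, start=1):
        phase = cmath.exp(-2j * cmath.pi * freq * lag)
        for i in range(2):
            for j in range(2):
                a[i][j] -= mat[i][j] * phase
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # Closed-form inverse of the 2x2 matrix A(f) gives the transfer matrix H(f).
    h = [[a[1][1] / det, -a[0][1] / det],
         [-a[1][0] / det, a[0][0] / det]]
    dtf = []
    for i in range(2):
        row_power = sum(abs(h[i][m]) ** 2 for m in range(2))
        dtf.append([abs(h[i][j]) ** 2 / row_power for j in range(2)])
    return dtf

# One-lag model in which channel 0 drives channel 1 but not the reverse.
coeffs = [[[0.5, 0.0], [0.4, 0.5]]]
dtf = dtf_2ch(coeffs, 0.1)
```

In this toy model the entry for the influence of channel 1 on channel 0 is zero while the reverse entry is positive, which is the directional asymmetry that the DTF is designed to expose.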
Probabilistic images (PBIS): A concise image representation technique for multiple parameters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, L.C.; Yeh, S.H.; Chen, Z.
1984-01-01
Based on m parametric images (PIs) derived from a dynamic series (DS), each pixel of DS is regarded as an m-dimensional vector. Given one set of normal samples (pixels) N and another of abnormal samples A, probability density functions (pdfs) of both sets are estimated. Any unknown sample is classified into N or A by calculating the probability of its being in the abnormal set using the Bayes' theorem. Instead of estimating the multivariate pdfs, a distance ratio transformation is introduced to map the m-dimensional sample space to one dimensional Euclidean space. Consequently, the image that localizes the regional abnormalities is characterized by the probability of being abnormal. This leads to the new representation scheme of PBIs. Tc-99m HIDA study for detecting intrahepatic lithiasis (IL) was chosen as an example of constructing PBI from 3 parameters derived from DS and such a PBI was compared with those 3 PIs, namely, retention ratio image (RRI), peak time image (TNMAX) and excretion mean transit time image (EMTT). 32 normal subjects and 20 patients with proved IL were collected and analyzed. The resultant sensitivity and specificity of PBI were 97% and 98% respectively. They were superior to those of any of the 3 PIs: RRI (94/97), TMAX (86/88) and EMTT (94/97). Furthermore, the contrast of PBI was much better than that of any other image. This new image formation technique, based on multiple parameters, shows the functional abnormalities in a structural way. Its good contrast makes the interpretation easy. This technique is powerful compared to the existing parametric image method.
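The Bayes' theorem step described above can be sketched for a single one-dimensional feature. This is only an illustration under stated assumptions: the Gaussian class densities, their means and standard deviations, and the equal prior are all hypothetical, and the distance ratio transformation from the abstract is not reproduced.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_abnormal(x, normal, abnormal, prior_abnormal=0.5):
    """Posterior probability that a pixel value x belongs to the abnormal set.

    normal, abnormal: (mean, sd) pairs for the assumed class densities.
    """
    like_a = gaussian_pdf(x, *abnormal)
    like_n = gaussian_pdf(x, *normal)
    numerator = like_a * prior_abnormal
    return numerator / (numerator + like_n * (1.0 - prior_abnormal))


# Hypothetical class densities: normal pixels near 0, abnormal pixels near 2.
p_mid = prob_abnormal(1.0, normal=(0.0, 1.0), abnormal=(2.0, 1.0))
p_high = prob_abnormal(2.0, normal=(0.0, 1.0), abnormal=(2.0, 1.0))
```

Mapping every pixel through `prob_abnormal` would give an image whose grey level is the posterior probability of abnormality, which is the idea behind a probabilistic image.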
NASA Astrophysics Data System (ADS)
Wang, C.; Rubin, Y.
2014-12-01
Spatial distribution of important geotechnical parameter named compression modulus Es contributes considerably to the understanding of the underlying geological processes and the adequate assessment of the Es mechanics effects for differential settlement of large continuous structure foundation. These analyses should be derived using an assimilating approach that combines in-situ static cone penetration test (CPT) with borehole experiments. To achieve such a task, the Es distribution of stratum of silty clay in region A of China Expo Center (Shanghai) is studied using the Bayesian-maximum entropy method. This method integrates rigorously and efficiently multi-precision of different geotechnical investigations and sources of uncertainty. Single CPT samplings were modeled as a rational probability density curve by maximum entropy theory. Spatial prior multivariate probability density function (PDF) and likelihood PDF of the CPT positions were built by borehole experiments and the potential value of the prediction point, then, preceding numerical integration on the CPT probability density curves, the posterior probability density curve of the prediction point would be calculated by the Bayesian reverse interpolation framework. The results were compared between Gaussian Sequential Stochastic Simulation and Bayesian methods. The differences were also discussed between single CPT samplings of normal distribution and simulated probability density curve based on maximum entropy theory. It is shown that the study of Es spatial distributions can be improved by properly incorporating CPT sampling variation into interpolation process, whereas more informative estimations are generated by considering CPT Uncertainty for the estimation points. Calculation illustrates the significance of stochastic Es characterization in a stratum, and identifies limitations associated with inadequate geostatistical interpolation techniques. 
These characterization results will provide a multi-precision information assimilation method for other geotechnical parameters.
Is the bronchodilator test an useful tool to measure asthma control?
Ferrer Galván, Marta; Javier Alvarez Gutiérrez, Francisco; Romero Falcón, Auxiliadora; Romero Romero, Beatriz; Sáez, Antonia; Medina Gallardo, Juan Francisco
2017-05-01
Asthma control includes the control of symptoms and future risk. We sought to evaluate the usefulness of the degree of spirometric reversibility of the forced expiratory volume in one second (FEV 1 ) as the target parameter of control. Patients with bronchial asthma were followed up for one year. The clinical, functional, inflammatory and control parameters of the asthma were collected. The area under the curve (AUC) was estimated to establish the cutoff point of the post-bronchodilator FEV 1 reversibility in relation to non-control asthma. In the univariate analysis, the differences between groups were studied based on the degree of estimated reversibility. Factors with a significance <0.1 were included in the multivariate analysis by binary logistic regression. A total of 407 patients with a mean age of 38.1 ± 16.7 years were included. When the patients were grouped into controlled and non-controlled groups, compared with post-bronchodilator FEV 1 reversibility, the cutoff point obtained for the non-controlled group was ≥10% (sensitivity: 65.8%, specificity: 48.4%, positive predictive value: 69.5%, and AUC: 0.619 [0.533-0.700], p < 0.01). In the year-long follow-up of this group (post-bronchodilator FEV 1 ≥10), an increased use of relief medication was observed, along with a significantly progressive drop in post-bronchodilator FEV 1 and post-bronchodilator FEV 1 /FVC (forced expiratory volume in one second/forced vital capacity). Spirometric reversibility can be useful in assessing control in asthmatic patients and can predict future risk parameters. The cutoff point related to the non-control of asthma found in our work was ≥10%. Copyright © 2017 Elsevier Ltd. All rights reserved.
Alós, Josep; Palmer, Miquel; Balle, Salvador; Arlinghaus, Robert
2016-01-01
State-space models (SSM) are increasingly applied in studies involving biotelemetry-generated positional data because they are able to estimate movement parameters from positions that are unobserved or have been observed with non-negligible observational error. Popular telemetry systems in marine coastal fish consist of arrays of omnidirectional acoustic receivers, which generate a multivariate time-series of detection events across the tracking period. Here we report a novel Bayesian fitting of a SSM application that couples mechanistic movement properties within a home range (a specific case of random walk weighted by an Ornstein-Uhlenbeck process) with a model of observational error typical for data obtained from acoustic receiver arrays. We explored the performance and accuracy of the approach through simulation modelling and extensive sensitivity analyses of the effects of various configurations of movement properties and time-steps among positions. Model results show an accurate and unbiased estimation of the movement parameters, and in most cases the simulated movement parameters were properly retrieved. Only in extreme situations (when fast swimming speeds are combined with pooling the number of detections over long time-steps) the model produced some bias that needs to be accounted for in field applications. Our method was subsequently applied to real acoustic tracking data collected from a small marine coastal fish species, the pearly razorfish, Xyrichtys novacula. The Bayesian SSM we present here constitutes an alternative for those used to the Bayesian way of reasoning. Our Bayesian SSM can be easily adapted and generalized to any species, thereby allowing studies in freely roaming animals on the ecological and evolutionary consequences of home ranges and territory establishment, both in fishes and in other taxa. PMID:27119718
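The movement component mentioned above, a random walk pulled toward a home-range centre by an Ornstein-Uhlenbeck process, can be simulated with the exact discrete-time OU update. The sketch below is a minimal illustration, not the authors' model: the parameter names (`k` for the attraction rate, `sigma` for the diffusion scale) and all values are assumptions, and the acoustic-receiver observation model is left out.

```python
import math
import random


def simulate_ou(start, center, k, sigma, dt, steps, seed=0):
    """Simulate a 2D Ornstein-Uhlenbeck track around a home-range centre.

    Each coordinate follows dx = -k * (x - center) dt + sigma dW, using the
    exact transition over a time-step dt.
    Returns a list of (x, y) positions of length steps + 1.
    """
    rng = random.Random(seed)
    decay = math.exp(-k * dt)
    # Standard deviation of the exact OU transition over one time-step.
    step_sd = sigma * math.sqrt((1.0 - math.exp(-2.0 * k * dt)) / (2.0 * k))
    path = [tuple(start)]
    for _ in range(steps):
        prev = path[-1]
        nxt = tuple(
            c + (p - c) * decay + rng.gauss(0.0, step_sd)
            for p, c in zip(prev, center)
        )
        path.append(nxt)
    return path


# Hypothetical track: a fish released 10 units from its home-range centre.
track = simulate_ou(start=(10.0, 0.0), center=(0.0, 0.0),
                    k=0.5, sigma=1.0, dt=1.0, steps=50, seed=1)
```

With `sigma` set to zero the track decays deterministically toward the centre, which is a convenient check that the attraction term behaves as intended.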
Liu, Dungang; Liu, Regina; Xie, Minge
2014-01-01
Meta-analysis has been widely used to synthesize evidence from multiple studies for common hypotheses or parameters of interest. However, it has not yet been fully developed for incorporating heterogeneous studies, which arise often in applications due to different study designs, populations or outcomes. For heterogeneous studies, the parameter of interest may not be estimable for certain studies, and in such a case, these studies are typically excluded from conventional meta-analysis. The exclusion of part of the studies can lead to a non-negligible loss of information. This paper introduces a meta-analysis for heterogeneous studies by combining the confidence density functions derived from the summary statistics of individual studies, hence referred to as the CD approach. It includes all the studies in the analysis and makes use of all information, direct as well as indirect. Under a general likelihood inference framework, this new approach is shown to have several desirable properties, including: i) it is asymptotically as efficient as the maximum likelihood approach using individual participant data (IPD) from all studies; ii) unlike the IPD analysis, it suffices to use summary statistics to carry out the CD approach. Individual-level data are not required; and iii) it is robust against misspecification of the working covariance structure of the parameter estimates. Besides its own theoretical significance, the last property also substantially broadens the applicability of the CD approach. All the properties of the CD approach are further confirmed by data simulated from a randomized clinical trials setting as well as by real data on aircraft landing performance. Overall, one obtains a unifying approach for combining summary statistics, subsuming many of the existing meta-analysis methods as special cases. PMID:26190875
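To make the idea of combining summary statistics concrete, the sketch below treats the simplest univariate case, assuming each study reports an estimate and a standard error and that its confidence density is normal. Multiplying normal densities yields another normal density whose mean is the precision-weighted average, i.e. the familiar inverse-variance estimate. This is an illustrative special case only, not the general multivariate CD approach of the paper; the function name and inputs are hypothetical.

```python
import math


def combine_normal_densities(studies):
    """Combine per-study (estimate, standard_error) pairs.

    Multiplying normal confidence densities gives a normal density with
    precision equal to the sum of precisions and a precision-weighted mean.
    Returns (combined_estimate, combined_standard_error).
    """
    precisions = [1.0 / (se * se) for _, se in studies]
    total_precision = sum(precisions)
    estimate = sum(w * est for w, (est, _) in zip(precisions, studies)) / total_precision
    return estimate, math.sqrt(1.0 / total_precision)


# Two hypothetical studies with equal precision.
combined_est, combined_se = combine_normal_densities([(1.0, 1.0), (3.0, 1.0)])
```

Studies with smaller standard errors pull the combined estimate toward their own value, and the combined standard error is never larger than that of the most precise study.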
Fibroblast growth factor 23 and renal function among young and healthy individuals.
Bernasconi, Raffaele; Aeschbacher, Stefanie; Blum, Steffen; Mongiat, Michel; Girod, Marc; Todd, John; Estis, Joel; Nolan, Niamh; Renz, Harald; Risch, Lorenz; Conen, David; Risch, Martin
2018-05-01
Fibroblast growth factor 23 (FGF-23), an osteocyte hormone involved in the regulation of phosphate metabolism, is associated with incident and progressive chronic kidney disease. We aimed to assess the association of FGF-23 with renal parameters, vascular function and phosphate metabolism in a large cohort of young and healthy individuals. Healthy individuals aged 25-41 years were included in a prospective population-based study. Fasting venous blood and morning urinary samples were used to measure plasma creatinine, cystatin C, endothelin-1, phosphate and plasma FGF-23 as well as urinary creatinine and phosphate. Multivariable regression models were constructed to assess the relationship of FGF-23 with parameters of renal function, endothelin-1 and fractional phosphate excretion. The median age of 2077 participants was 37 years, 46% were males. The mean estimated glomerular filtration rate (eGFR - CKD-EPI creatinine-cystatin C equation) and fractional phosphate excretion were 110 mL/min/1.73 m2 and 8.7%, respectively. After multivariable adjustment, there was a significant inverse relationship of FGF-23 with eGFR (β per 1 log-unit increase -3.81; 95% CI [-5.42; -2.20]; p<0.0001). Furthermore, we found a linear association between FGF-23 and endothelin-1 (β per 1 log-unit increase 0.06; [0.01, 0.11]; p=0.01). In addition, we established a significant relationship of FGF-23 with fractional phosphate excretion (β per 1 log-unit increase 0.62; [0.08, 1.16]; p=0.03). Increasing plasma FGF-23 levels are strongly associated with decreasing eGFR and increasing urinary phosphate excretion, suggesting an important role of FGF-23 in the regulation of kidney function in young and healthy adults.
Self-injurious behavior among Greek male prisoners: prevalence and risk factors.
Sakelliadis, E I; Papadodima, S A; Sergentanis, T N; Giotakos, O; Spiliopoulou, C A
2010-04-01
Self-harm among prisoners is a common phenomenon. This study aims to estimate the prevalence of self-injurious behavior (SIB) among Greek male prisoners, record their motives and determine independent risk factors. A self-administered, anonymous questionnaire was administered to 173 male prisoners in the Chalkida prison, Greece. The questionnaire included items on self-harm/SIB, demographic parameters, childhood history, family history, physical and mental disease, lifestyle and smoking habits, alcohol dependence (CAGE questionnaire), illicit substance use, aggression (Buss-Perry Aggression Questionnaire [BPAQ] and Lifetime History of Aggression [LTHA]), impulsivity (Barrat Impulsivity Scale-11) and suicidal ideation (Spectrum of Suicidal Behavior Scale). Univariate nonparametric statistics and multivariate ordinal logistic regression were performed. Of all the participants, 49.4% (95% CI: 41.5-57.3%) disclosed self-harm (direct or indirect). The prevalence of SIB was equal to 34.8% (95% CI: 27.5-42.6%). Most frequently, SIB coexisted with indirect self-harm (80.7%). The most common underlying motives were to obtain emotional release (31.6%) and to release anger (21.1%). At the univariate analysis, SIB was positively associated with a host of closely related factors: low education, physical/sexual abuse in childhood, parental neglect, parental divorce, alcoholism in family, psychiatric condition in family, recidivism, age, sentence already served, impulsivity, aggression, alcohol dependence, self-reported diagnosed psychiatric condition and illicit substance use. Childhood variables were particularly associated with the presence of diagnosed psychiatric condition. At the multivariate analysis, however, only three parameters were proven independent risk factors: self-reported diagnosed psychiatric condition, illicit substance use and aggression (BPAQ scale). The prevalence of SIB is particularly high. 
Psychiatric condition, illicit substance use and aggression seem to be the most meaningful risk factors; childhood events seem only to act indirectly. Copyright (c) 2009 Elsevier Masson SAS. All rights reserved.
Melgarejo, Jesús D; Lee, Joseph H; Petitto, Michele; Yépez, Juan B; Murati, Felipe A; Jin, Zhezhen; Chávez, Carlos A; Pirela, Rosa V; Calmón, Gustavo E; Lee, Winston; Johnson, Matthew P; Mena, Luis J; Al-Aswad, Lama A; Terwilliger, Joseph D; Allikmets, Rando; Maestre, Gladys E; De Moraes, C Gustavo
2018-06-01
To determine which nocturnal blood pressure (BP) parameters (low levels or extreme dipper status) are associated with an increased risk of glaucomatous damage in Hispanics. Observational cross-sectional study. A subset (n = 93) of the participants from the Maracaibo Aging Study (MAS) who met the study eligibility criteria were included. These participants, who were at least 40 years of age, had measurements for optical coherence tomography, visual field (VF) tests, 24-hour BP, office BP, and intraocular pressure <22 mmHg. Univariate and multivariate logistic regression analyses under the generalized estimating equations (GEE) framework were used to examine the relationships between glaucomatous damage and BP parameters, with particular attention to decreases in nocturnal BP. Glaucomatous optic neuropathy (GON) based on the presence of optic nerve damage and VF defects. The mean age was 61.9 years, and 87.1% were women. Of 185 eyes evaluated, 19 (26.5%) had signs of GON. Individuals with GON had significantly lower 24-hour and nighttime diastolic BP levels than those without. However, results of the multivariate GEE models indicated that the glaucomatous damage was not related to the average systolic or diastolic BP levels measured over 24 hours, daytime, or nighttime. In contrast, extreme decreases in nighttime systolic and diastolic BP (>20% compared with daytime BP) were significant risk factors for glaucomatous damage (odds ratio, 19.78 and 5.55, respectively). In this population, the link between nocturnal BP and GON is determined by extreme dipping effects rather than low nocturnal BP levels alone. Further studies considering extreme decreases in nocturnal BP in individuals at high risk of glaucoma are warranted. Copyright © 2018 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
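The extreme-dipping criterion quoted above (a nighttime decrease of more than 20% relative to daytime BP) reduces to a one-line calculation. The sketch below is only an illustration of that arithmetic; the function names and the example readings are hypothetical, and how the study averaged daytime and nighttime readings is not specified here.

```python
def nocturnal_dip_percent(daytime_bp, nighttime_bp):
    """Percentage fall of nighttime BP relative to daytime BP."""
    return (daytime_bp - nighttime_bp) / daytime_bp * 100.0


def is_extreme_dipper(daytime_bp, nighttime_bp, threshold=20.0):
    """True when the nocturnal fall exceeds the threshold (default 20%)."""
    return nocturnal_dip_percent(daytime_bp, nighttime_bp) > threshold


# Hypothetical mean systolic readings in mmHg.
dip_a = nocturnal_dip_percent(130.0, 100.0)   # about 23%
flag_a = is_extreme_dipper(130.0, 100.0)
flag_b = is_extreme_dipper(130.0, 115.0)
```

The same function applies to systolic and diastolic values separately, which matches the abstract reporting a separate odds ratio for each.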
DOE Office of Scientific and Technical Information (OSTI.GOV)
Showalter, Timothy N.; Winter, Kathryn A.; Berger, Adam C., E-mail: adam.berger@jefferson.edu
2011-12-01
Purpose: Lymph node status is an important predictor of survival in pancreatic cancer. We performed a secondary analysis of Radiation Therapy Oncology Group (RTOG) 9704, an adjuvant chemotherapy and chemoradiation trial, to determine the influence of lymph node factors (number of positive nodes [NPN], total nodes examined [TNE], and lymph node ratio [LNR, the ratio of NPN to TNE]) on overall survival (OS) and disease-free survival (DFS). Patient and Methods: Eligible patients from RTOG 9704 form the basis of this secondary analysis of lymph node parameters. Actuarial estimates for OS and DFS were calculated using Kaplan-Meier methods. Cox proportional hazards models were performed to evaluate associations of NPN, TNE, and LNR with OS and DFS. Multivariate Cox proportional hazards models were also performed. Results: There were 538 patients enrolled in the RTOG 9704 trial. Of these, 445 patients were eligible with lymph nodes removed. Overall median NPN was 1 (min-max, 0-18). Increased NPN was associated with worse OS (HR = 1.06, p = 0.001) and DFS (HR = 1.05, p = 0.01). In multivariate analyses, both NPN and TNE were associated with OS and DFS. TNE > 12, and >15 were associated with increased OS for all patients, but not for node-negative patients (n = 142). Increased LNR was associated with worse OS (HR = 1.01, p < 0.0001) and DFS (HR = 1.006, p = 0.002). Conclusion: In patients who undergo surgical resection followed by adjuvant chemoradiation, TNE, NPN, and LNR are associated with OS and DFS. This secondary analysis of a prospective, cooperative group trial supports the influence of these lymph node parameters on outcomes after surgery and adjuvant therapy using contemporary techniques.
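For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in a few lines. This is a generic textbook version, not the trial's analysis code; the function name and the four-patient data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in months; the second patient is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate at month 3 uses two patients at risk rather than three.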
Evaluation of respiratory parameters in finswimmers regarding gender, swimming style and distance.
Stavrou, V; Vavougios, G; Karetsi, E; Adam, G; Daniil, Z; Gourgoulianis, K I
2018-04-13
The purpose of our study was to investigate the differences in the finswimmers' physiological characteristics, as far as gender, the swimming style and the different swimming distance are concerned. 52 finswimmers participated in our study (Age: 17.4 ± 2.1yrs, BMI: 21.8 ± 2.3, body fat: 12.2 ± 4.7%) and were allocated into groups [Gender: Female vs. Male, swimming style: Bifin vs. Surface, and swimming distance: <200 m vs. ≥200 m]. Anthropometric characteristics, handgrip, estimated strength of inspiratory muscles (PI max ) and pulmonary function parameters (FEV 1 , FVC and PEF) were measured. The Independent T-test was used for statistical comparisons between groups. Multivariate analyses were performed via binary logistic regression. The results showed differences between groups in gender in PEF (p < 0.05), PI max (p < 0.05) and handgrip (p < 0.001) in swimming style in handgrip (p < 0.05), FEV 1 (p < 0.05) and FVC (p < 0.05) and in swimming distance (p < 0.05) in hours/day spent at the gym (p < 0.05) and FVC (p < 0.05). In multivariate analyses handgrip remained an independent predictor of style (OR: 1.154; 95%CI: 1.022-1.303, p = .021), and hours/day spent at the gym was retained as an independent predictor of distance (OR: 131.607; 95%CI: 3.655-4739.441, p = .008). The data from the present study reveal that handgrip was associated with style, and hours per day spent at the gym were associated with distance. Copyright © 2018 Elsevier B.V. All rights reserved.
Keserci, Bilgin; Duc, Nguyen Minh
2018-03-07
We aimed to investigate the role of magnetic resonance imaging parameters in predicting the treatment outcome of high-intensity focused ultrasound (HIFU) ablation of uterine fibroids with a nonperfused volume (NPV) ratio of at least 90%. A total of 120 women who underwent HIFU treatment were divided into groups 1 (n = 72) and 2 (n = 48), comprising patients with an NPV ratio of at least 90% and less than 90%, respectively. Multivariate logistic regression analyses were carried out to investigate the potential predictors of the NPV ratio of at least 90%. The NPV ratios immediately post-treatment, therapeutic efficacy at 6 months' follow-up, and safety in terms of adverse effects and changes in anti-Mullerian hormone level were assessed. By introducing multiple predictors obtained from multivariate analyses into a generalized estimating equation model, the results showed that the thickness of the subcutaneous fat layer in the anterior abdominal wall, peak enhancement of fibroid, time to peak of fibroid, and the ratio of area under the curve of fibroid to myometrium were statistically significant, except T2 signal intensity ratio of fibroid to myometrium, hence predicting an NPV ratio of at least 90%. No serious adverse effects and no significant difference between the anti-Mullerian hormone levels before or 6 months post-treatment were reported. The findings in this study suggest that the achievement of NPV ratio of at least 90% in magnetic resonance imaging-guided HIFU treatment of uterine fibroids based on prediction model appears clinically possible without compromising the safety of patients. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Carvajal, Guido; Roser, David J; Sisson, Scott A; Keegan, Alexandra; Khan, Stuart J
2017-02-01
Chlorine disinfection of biologically treated wastewater is practiced in many locations prior to environmental discharge or beneficial reuse. The effectiveness of chlorine disinfection processes may be influenced by several factors, such as pH, temperature, ionic strength, organic carbon concentration, and suspended solids. We investigated the use of Bayesian multilayer perceptron (BMLP) models as efficient and practical tools for compiling and analysing free chlorine and monochloramine virus disinfection performance as a multivariate problem. Corresponding to their relative susceptibility, Adenovirus 2 was used to assess disinfection by monochloramine and Coxsackievirus B5 was used for free chlorine. A BMLP model was constructed to relate key disinfection conditions (CT, pH, turbidity) to observed Log Reduction Values (LRVs) for these viruses at constant temperature. The models proved to be valuable for incorporating uncertainty in the chlor(am)ination performance estimation and interpolating between operating conditions. Various types of queries could be performed with this model including the identification of target CT for a particular combination of LRV, pH and turbidity. Similarly, it was possible to derive achievable LRVs for combinations of CT, pH and turbidity. These queries yielded probability density functions for the target variable reflecting the uncertainty in the model parameters and variability of the input variables. The disinfection efficacy was greatly impacted by pH and to a lesser extent by turbidity for both types of disinfections. Non-linear relationships were observed between pH and target CT, and turbidity and target CT, with compound effects on target CT also evidenced. This work demonstrated that the use of BMLP models had considerable ability to improve the resolution and understanding of the multivariate relationships between operational parameters and disinfection outcomes for wastewater treatment. Copyright © 2016 Elsevier Ltd. 
All rights reserved.
Bedside risk estimation of morbidly adherent placenta using simple calculator.
Maymon, R; Melcer, Y; Pekar-Zlotin, M; Shaked, O; Cuckle, H; Tovbin, J
2018-03-01
To construct a calculator for 'bedside' estimation of morbidly adherent placenta (MAP) risk based on ultrasound (US) findings. This retrospective study included all pregnant women with at least one previous cesarean delivery attending in our US unit between December 2013 and January 2017. The examination was based on a scoring system which determines the probability for MAP. The study population included 471 pregnant women, and 41 of whom (8.7%) were diagnosed with MAP. Based on ROC curve, the most effective US criteria for detection of MAP were the presence of the placental lacunae, obliteration of the utero-placental demarcation, and placenta previa. On the multivariate logistic regression analysis, US findings of placental lacunae (OR = 3.5; 95% CI, 1.2-9.5; P = 0.01), obliteration of the utero-placental demarcation (OR = 12.4; 95% CI, 3.7-41.6; P < 0.0001), and placenta previa (OR = 10.5; 95% CI, 3.5-31.3; P < 0.0001) were associated with MAP. By combining these three parameters, the receiver operating characteristic curve was calculated, yielding an area under the curve of 0.93 (95% CI, 0.87-0.97). Accordingly, we have constructed a simple calculator for 'bedside' estimation of MAP risk. The calculator is mounted on the hospital's internet website ( http://www.assafh.org/Pages/PPCalc/index.html ). The risk estimation of MAP varies between 1.5 and 87%. The present calculator enables a simple 'bedside' MAP estimation, facilitating accurate and adequate antenatal risk assessment.
Rajabioun, Mehdi; Nasrabadi, Ali Motie; Shamsollahi, Mohammad Bagher
2017-09-01
Effective connectivity is one of the most important considerations in brain functional mapping via EEG. It describes the effect of a particular active brain region on others. In this paper, a new method based on the dual Kalman filter is proposed. First, a brain source localization method (standardized low resolution brain electromagnetic tomography) is applied to the EEG signal to extract the active regions, and an appropriate time model (multivariate autoregressive model) is fitted to the extracted active sources to evaluate the activity and time dependence between sources. Then, the dual Kalman filter is used to estimate the model parameters, that is, the effective connectivity between active regions. The advantage of this method is that the activity of different brain parts is estimated simultaneously with the effective connectivity between active regions. By combining the dual Kalman filter with brain source localization methods, source activity is updated over time in addition to the connectivity estimation between parts. The performance of the proposed method was first evaluated by applying it to simulated EEG signals with simulated interacting connectivity between active parts. Noisy simulated signals with different signal-to-noise ratios were used to evaluate the method's sensitivity to noise and to compare its performance with other methods. The method was then applied to real signals, and the estimation error during a sweeping window was calculated. Across both simulated and real signals, the proposed method gives acceptable results with the least mean square error in noisy or real conditions.
Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.
Kelly, Brendan J; Gross, Robert; Bittinger, Kyle; Sherrill-Mix, Scott; Lewis, James D; Collman, Ronald G; Bushman, Frederic D; Li, Hongzhe
2015-08-01
The variation in community composition between microbiome samples, termed beta diversity, can be measured by pairwise distance based on either presence-absence or quantitative species abundance data. PERMANOVA, a permutation-based extension of multivariate analysis of variance to a matrix of pairwise distances, partitions within-group and between-group distances to permit assessment of the effect of an exposure or intervention (grouping factor) upon the sampled microbiome. Within-group distance and exposure/intervention effect size must be accurately modeled to estimate statistical power for a microbiome study that will be analyzed with pairwise distances and PERMANOVA. We present a framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed by pairwise distances, which includes: (i) a novel method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared (ω2). From simulated distance matrices, available PERMANOVA power or necessary sample size can be estimated for a planned microbiome study. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
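The partitioning of within-group and between-group distances described above can be sketched as a pseudo-F statistic with a permutation p-value. This is a minimal pure-Python illustration of the general PERMANOVA calculation, not the R package the authors describe; the function names, the four-sample distance matrix and the group labels are hypothetical.

```python
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a symmetric matrix of pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova_p(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: share of label shuffles with pseudo-F >= observed."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical samples at positions 0, 1, 10, 11 on a line, two per group.
points = [0.0, 1.0, 10.0, 11.0]
dist_matrix = [[abs(p - q) for q in points] for p in points]
group_labels = ["a", "a", "b", "b"]
f_stat = pseudo_f(dist_matrix, group_labels)
p_value = permanova_p(dist_matrix, group_labels)
```

Power for a planned study could then be approximated by repeating this test on many simulated distance matrices and counting how often the p-value falls below the chosen significance level, which is the simulation-based idea outlined in the abstract.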
Kanno, Hiroko; Kanda, Eiichiro; Sato, Asako; Sakamoto, Kaori; Kanno, Yoshihiko
2016-04-01
Determination of daily protein intake in the management of chronic kidney disease (CKD) requires precision. Inaccuracies in recording dietary intake occur, and estimation from total urea excretion presents hurdles owing to the difficulty of collecting whole urine for 24 h. Spot urine has been used for measuring daily sodium intake and urinary protein excretion. In this cross-sectional study, we investigated whether urea nitrogen (UN) concentration in spot urine can be used to predict daily protein intake instead of the 24-h urine collection in 193 Japanese CKD patients (Stages G1-G5). After patient randomization into 2 datasets for the development and validation of models, bootstrapping was used to develop protein intake estimation models. The parameters for the candidate multivariate regression models were male gender, age, body mass index (BMI), diabetes mellitus, dyslipidemia, proteinuria, estimated glomerular filtration rate, serum albumin level, spot urinary UN and creatinine level, and spot urinary UN/creatinine levels. The final model contained BMI and spot urinary UN level. The final model was selected because of the higher correlation between the predicted and measured protein intakes r = 0.558 (95 % confidence interval 0.400, 0.683), and the smaller distribution of the difference between the measured and predicted protein intakes than those of the other models. The results suggest that UN concentration in spot urine may be used to estimate daily protein intake and that a prediction formula would be useful for nutritional control in CKD patients.
NASA Technical Reports Server (NTRS)
Waszak, Martin R.
1992-01-01
The application of a sector-based stability theory approach to the formulation of useful uncertainty descriptions for linear, time-invariant, multivariable systems is explored. A review of basic sector properties and sector-based approach are presented first. The sector-based approach is then applied to several general forms of parameter uncertainty to investigate its advantages and limitations. The results indicate that the sector uncertainty bound can be used effectively to evaluate the impact of parameter uncertainties on the frequency response of the design model. Inherent conservatism is a potential limitation of the sector-based approach, especially for highly dependent uncertain parameters. In addition, the representation of the system dynamics can affect the amount of conservatism reflected in the sector bound. Careful application of the model can help to reduce this conservatism, however, and the solution approach has some degrees of freedom that may be further exploited to reduce the conservatism.
Using explanatory crop models to develop simple tools for Advanced Life Support system studies
NASA Technical Reports Server (NTRS)
Cavazzoni, J.
2004-01-01
System-level analyses for Advanced Life Support require mathematical models for various processes, such as for biomass production and waste management, which would ideally be integrated into overall system models. Explanatory models (also referred to as mechanistic or process models) would provide the basis for a more robust system model, as these would be based on an understanding of specific processes. However, implementing such models at the system level may not always be practicable because of their complexity. For the area of biomass production, explanatory models were used to generate parameters and multivariable polynomial equations for basic models that are suitable for estimating the direction and magnitude of daily changes in canopy gas-exchange, harvest index, and production scheduling for both nominal and off-nominal growing conditions. © 2004 COSPAR. Published by Elsevier Ltd. All rights reserved.
BrainAGE score indicates accelerated brain aging in schizophrenia, but not bipolar disorder.
Nenadić, Igor; Dietzek, Maren; Langbein, Kerstin; Sauer, Heinrich; Gaser, Christian
2017-08-30
BrainAGE (brain age gap estimation) is a novel morphometric parameter providing a univariate score derived from multivariate voxel-wise analyses. It uses a machine learning approach and can be used to analyse deviation from physiological developmental or aging-related trajectories. Using structural MRI data and BrainAGE quantification of acceleration or deceleration of individual aging, we analysed data from 45 schizophrenia patients, 22 bipolar I disorder patients (mostly with previous psychotic symptoms / episodes), and 70 healthy controls. We found significantly higher BrainAGE scores in schizophrenia, but not bipolar disorder patients. Our findings indicate significantly accelerated brain structural aging in schizophrenia. This suggests that, despite the conceptualisation of schizophrenia as a neurodevelopmental disorder, there might be an additional progressive pathogenic component. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Vallières, M.; Freeman, C. R.; Skamene, S. R.; El Naqa, I.
2015-07-01
This study aims at developing a joint FDG-PET and MRI texture-based model for the early evaluation of lung metastasis risk in soft-tissue sarcomas (STSs). We investigate if the creation of new composite textures from the combination of FDG-PET and MR imaging information could better identify aggressive tumours. Towards this goal, a cohort of 51 patients with histologically proven STSs of the extremities was retrospectively evaluated. All patients had pre-treatment FDG-PET and MRI scans comprised of T1-weighted and T2-weighted fat-suppression sequences (T2FS). Nine non-texture features (SUV metrics and shape features) and forty-one texture features were extracted from the tumour region of separate (FDG-PET, T1 and T2FS) and fused (FDG-PET/T1 and FDG-PET/T2FS) scans. Volume fusion of the FDG-PET and MRI scans was implemented using the wavelet transform. The influence of six different extraction parameters on the predictive value of textures was investigated. The incorporation of features into multivariable models was performed using logistic regression. The multivariable modeling strategy involved imbalance-adjusted bootstrap resampling in the following four steps leading to final prediction model construction: (1) feature set reduction; (2) feature selection; (3) prediction performance estimation; and (4) computation of model coefficients. Univariate analysis showed that the isotropic voxel size at which texture features were extracted had the most impact on predictive value. In multivariable analysis, texture features extracted from fused scans significantly outperformed those from separate scans in terms of lung metastases prediction estimates. The best performance was obtained using a combination of four texture features extracted from FDG-PET/T1 and FDG-PET/T2FS scans. This model reached an area under the receiver-operating characteristic curve of 0.984 ± 0.002, a sensitivity of 0.955 ± 0.006, and a specificity of 0.926 ± 0.004 in bootstrapping evaluations. 
Ultimately, lung metastasis risk assessment at diagnosis of STSs could improve patient outcomes by allowing better treatment adaptation.
Ouma, Paul O; Agutu, Nathan O; Snow, Robert W; Noor, Abdisalan M
2017-09-18
Precise quantification of health service utilisation is important for the estimation of disease burden and allocation of health resources. Current approaches to mapping health facility utilisation rely on spatial accessibility alone as the predictor. However, other spatially varying social, demographic and economic factors may affect the use of health services. The exclusion of these factors can lead to the inaccurate estimation of health facility utilisation. Here, we compare the accuracy of a univariate spatial model, developed only from estimated travel time, to a multivariate model that also includes relevant social, demographic and economic factors. A theoretical surface of travel time to the nearest public health facility was developed. Travel times were assigned to each child reported to have had fever in the Kenya demographic and health survey of 2014 (KDHS 2014). The relationship of child treatment seeking for fever with travel time, household and individual factors from the KDHS 2014 was determined using multilevel mixed modelling. Bayesian information criterion (BIC) and likelihood ratio tests (LRT) were carried out to measure how selected factors improve parsimony and goodness of fit of the time model. Using the mixed model, a univariate spatial model of health facility utilisation was fitted using travel time as the predictor. The mixed model was also used to compute a multivariate spatial model of utilisation, using travel time and modelled surfaces of selected household and individual factors as predictors. The univariate and multivariate spatial models were then compared using the area under the receiver operating characteristic curve (AUC) and a percent correct prediction (PCP) test. The best fitting multivariate model had travel time, household wealth index and number of children in household as the predictors. These factors reduced BIC of the time model from 4008 to 2959, a change which was confirmed by the LRT. Although there was a high correlation of the two modelled probability surfaces (Adj R² = 88%), the multivariate model had better AUC and PCP values than the univariate model: AUC 0.83 versus 0.73 and PCP 0.61 versus 0.45. Our study shows that a model that uses travel time, as well as household and individual-level socio-demographic factors, results in a more accurate estimation of use of health facilities for the treatment of childhood fever, compared to one that relies on only travel time.
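To make the AUC comparison in the abstract above concrete, here is a minimal sketch of how an area under the receiver operating characteristic curve can be computed from predicted probabilities and observed outcomes. The function name and the toy data are hypothetical; nothing here reproduces the study's models or data, and the study's PCP test is not covered.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of treatment seeking and observed outcomes.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for survey-sized data.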
A General Approach for Estimating Scale Score Reliability for Panel Survey Data
ERIC Educational Resources Information Center
Biemer, Paul P.; Christ, Sharon L.; Wiesen, Christopher A.
2009-01-01
Scale score measures are ubiquitous in the psychological literature and can be used as both dependent and independent variables in data analysis. Poor reliability of scale score measures leads to inflated standard errors and/or biased estimates, particularly in multivariate analysis. Reliability estimation is usually an integral step to assess…
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for the nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimation) for the fitting of a nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Lemmens, Louise; Kos, Snjezana; Beijer, Cornelis; Brinkman, Jacoline W; van der Horst, Frans A L; van den Hoven, Leonie; Kieslinger, Dorit C; van Trooyen-van Vrouwerff, Netty J; Wolthuis, Albert; Hendriks, Jan C M; Wetzels, Alex M M
2016-06-01
Objective: To investigate the value of sperm parameters to predict an ongoing pregnancy outcome in couples treated with intrauterine insemination (IUI), during a methodologically stable period of time. Design: Retrospective, observational study with logistic regression analyses. Setting: University hospital. Patients: A total of 1,166 couples visiting the fertility laboratory for their first IUI episode, including 4,251 IUI cycles. Interventions: None. Main outcome measures: Sperm morphology, total progressively motile sperm count (TPMSC), and number of inseminated progressively motile spermatozoa (NIPMS); odds ratios (ORs) of the sperm parameters after the first IUI cycle and the first finished IUI episode; discriminatory accuracy of the multivariable model. Results: None of the sperm parameters was of predictive value for pregnancy after the first IUI cycle. In the first finished IUI episode, a positive relationship was found for ≤4% of morphologically normal spermatozoa (OR 1.39) and a moderate NIPMS (5-10 million; OR 1.73). Low NIPMS showed a negative relation (≤1 million; OR 0.42). The TPMSC had no predictive value. The multivariable model (i.e., sperm morphology, NIPMS, female age, male age, and the number of cycles in the episode) had a moderate discriminatory accuracy (area under the curve 0.73). Conclusions: Intrauterine insemination is especially relevant for couples with moderate male factor infertility (sperm morphology ≤4%, NIPMS 5-10 million). In the multivariable model, however, the predictive power of these sperm parameters is rather low. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
The Emotional Components of Jealousy: A Multivariate Investigation.
ERIC Educational Resources Information Center
Ray, Lisa
To investigate the emotional parameters of jealousy and to examine the differences between male and female labeling of jealousy, 288 college undergraduates completed the Emotional Parameters of Jealousy Questionnaire. The questionnaire consists of 59 statements that refer to emotions experienced by a person in a situation in which he/she feels…
Dynamic Factor Analysis Models with Time-Varying Parameters
ERIC Educational Resources Information Center
Chow, Sy-Miin; Zu, Jiyun; Shifren, Kim; Zhang, Guangjian
2011-01-01
Dynamic factor analysis models with time-varying parameters offer a valuable tool for evaluating multivariate time series data with time-varying dynamics and/or measurement properties. We use the Dynamic Model of Activation proposed by Zautra and colleagues (Zautra, Potter, & Reich, 1997) as a motivating example to construct a dynamic factor…
Li, Shi; Mukherjee, Bhramar; Taylor, Jeremy M G; Rice, Kenneth M; Wen, Xiaoquan; Rice, John D; Stringham, Heather M; Boehnke, Michael
2014-07-01
With challenges in data harmonization and environmental heterogeneity across various data sources, meta-analysis of gene-environment interaction studies can often involve subtle statistical issues. In this paper, we study the effect of environmental covariate heterogeneity (within and between cohorts) on two approaches for fixed-effect meta-analysis: the standard inverse-variance weighted meta-analysis and a meta-regression approach. Akin to the results in Simmonds and Higgins (), we obtain analytic efficiency results for both methods under certain assumptions. The relative efficiency of the two methods depends on the ratio of within versus between cohort variability of the environmental covariate. We propose to use an adaptively weighted estimator (AWE), between meta-analysis and meta-regression, for the interaction parameter. The AWE retains full efficiency of the joint analysis using individual level data under certain natural assumptions. Lin and Zeng (2010a, b) showed that a multivariate inverse-variance weighted estimator retains full efficiency as joint analysis using individual level data, if the estimates with full covariance matrices for all the common parameters are pooled across all studies. We show consistency of our work with Lin and Zeng (2010a, b). Without sacrificing much efficiency, the AWE uses only univariate summary statistics from each study, and bypasses issues with sharing individual level data or full covariance matrices across studies. We compare the performance of the methods both analytically and numerically. The methods are illustrated through meta-analysis of interaction between Single Nucleotide Polymorphisms in FTO gene and body mass index on high-density lipoprotein cholesterol data from a set of eight studies of type 2 diabetes. © 2014 WILEY PERIODICALS, INC.
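As a rough illustration of the standard inverse-variance weighted fixed-effect meta-analysis named in the abstract above, the sketch below pools per-study estimates with weights equal to the reciprocal of each squared standard error. The function name and the numbers are hypothetical; the adaptively weighted estimator (AWE) proposed by the authors is not reproduced here.

```python
import math


def inverse_variance_pool(estimates, std_errors):
    """Fixed-effect pooled estimate and its standard error.

    Each study is weighted by 1 / SE^2, so more precise studies
    contribute more to the pooled value.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical interaction estimates and standard errors from two cohorts.
study_betas = [0.10, 0.30]
study_ses = [0.1, 0.2]
pooled_beta, pooled_beta_se = inverse_variance_pool(study_betas, study_ses)
```

With these toy inputs the first cohort carries four times the weight of the second, so the pooled value sits closer to 0.10 than to 0.30.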
Deformation data modeling through numerical models: an efficient method for tracking magma transport
NASA Astrophysics Data System (ADS)
Charco, M.; Gonzalez, P. J.; Galán del Sastre, P.
2017-12-01
Nowadays, multivariate collected data and robust physical models at volcano observatories are becoming crucial for providing effective volcano monitoring. Nevertheless, the forecast of volcanic eruptions is notoriously difficult. Within this frame, one of the most promising methods to evaluate volcanic hazard is the use of surface ground deformation, and in the last decades many developments in the field of deformation modeling have been achieved. In particular, numerical modeling allows realistic media features such as topography and crustal heterogeneities to be included, although it is still very time consuming to solve the inverse problem for near-real time interpretations. Here, we present a method that can be efficiently used to estimate the location and evolution of magmatic sources based on real-time surface deformation data and Finite Element (FE) models. Generally, the search for the best-fitting magmatic (point) source(s) is conducted for an array of 3-D locations extending below a predefined volume region, and the Green functions for all the array components have to be precomputed. We propose a FE model for the pre-computation of Green functions in a mechanically heterogeneous domain, which eventually will lead to a better description of the status of the volcanic area. The number of Green functions is reduced here to the number of observational points by using their reciprocity relationship. We present and test this methodology with an optimization method based on a Genetic Algorithm. Following synthetic and sensitivity tests to estimate the uncertainty of the model parameters, we apply the tool for magma tracking during the 2007 Kilauea volcano intrusion and eruption. We show how data inversion with numerical models can speed up the source parameter estimations for a given volcano showing signs of unrest.
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis.
Nespeca, Maurilio Gustavo; Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000–650 cm⁻¹. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time. PMID:29629209
Hevesi, Joseph A.; Flint, Alan L.; Istok, Jonathan D.
1992-01-01
Values of average annual precipitation (AAP) may be important for hydrologic characterization of a potential high-level nuclear-waste repository site at Yucca Mountain, Nevada. Reliable measurements of AAP are sparse in the vicinity of Yucca Mountain, and estimates of AAP were needed for an isohyetal mapping over a 2600-square-mile watershed containing Yucca Mountain. Estimates were obtained with a multivariate geostatistical model developed using AAP and elevation data from a network of 42 precipitation stations in southern Nevada and southeastern California. An additional 1531 elevations were obtained to improve estimation accuracy. Isohyets representing estimates obtained using univariate geostatistics (kriging) defined a smooth and continuous surface. Isohyets representing estimates obtained using multivariate geostatistics (cokriging) defined an irregular surface that more accurately represented expected local orographic influences on AAP. Cokriging results included a maximum estimate within the study area of 335 mm at an elevation of 7400 ft, an average estimate of 157 mm for the study area, and an average estimate of 172 mm at eight locations in the vicinity of the potential repository site. Kriging estimates tended to be lower in comparison because the increased AAP expected for remote mountainous topography was not adequately represented by the available sample. Regression results between cokriging estimates and elevation were similar to regression results between measured AAP and elevation. The position of the cokriging 250-mm isohyet relative to the boundaries of pinyon pine and juniper woodlands provided indirect evidence of improved estimation accuracy because the cokriging result agreed well with investigations by others concerning the relationship between elevation, vegetation, and climate in the Great Basin. Calculated estimation variances were also mapped and compared to evaluate improvements in estimation accuracy. 
Cokriging estimation variances were reduced by an average of 54% relative to kriging variances within the study area. Cokriging reduced estimation variances at the potential repository site by 55% relative to kriging. The usefulness of an existing network of stations for measuring AAP within the study area was evaluated using cokriging variances, and twenty additional stations were located for the purpose of improving the accuracy of future isohyetal mappings. Using the expanded network of stations, the maximum cokriging estimation variance within the study area was reduced by 78% relative to the existing network, and the average estimation variance was reduced by 52%.
Large signal-to-noise ratio quantification in MLE for ARARMAX models
NASA Astrophysics Data System (ADS)
Zou, Yiqun; Tang, Xiafei
2014-06-01
It has been shown that closed-loop linear system identification by indirect method can be generally transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, the gradient-related optimisation with large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we build the amplitude coefficient which is an equivalence to the SNR and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of threshold is achieved by the minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm based on two sets of physically realisable system input-output data details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when local minimum is present as the SNR is small. Then, the algorithm is tested on a theoretical AutoRegressive Moving Average with eXogenous input model for the derivation of the threshold and a gas turbine engine real system for model identification, respectively. Finally, the graphical validation of threshold on a two-dimensional plot is discussed.
Joint resonant CMB power spectrum and bispectrum estimation
NASA Astrophysics Data System (ADS)
Meerburg, P. Daniel; Münchmeyer, Moritz; Wandelt, Benjamin
2016-02-01
We develop the tools necessary to assess the statistical significance of resonant features in the CMB correlation functions, combining power spectrum and bispectrum measurements. This significance is typically addressed by running a large number of simulations to derive the probability density function (PDF) of the feature-amplitude in the Gaussian case. Although these simulations are tractable for the power spectrum, for the bispectrum they require significant computational resources. We show that, by assuming that the PDF is given by a multivariate Gaussian where the covariance is determined by the Fisher matrix of the sine and cosine terms, we can efficiently produce spectra that are statistically close to those derived from full simulations. By drawing a large number of spectra from this PDF, both for the power spectrum and the bispectrum, we can quickly determine the statistical significance of candidate signatures in the CMB, considering both single frequency and multifrequency estimators. We show that for resonance models, cosmology and foreground parameters have little influence on the estimated amplitude, which allows us to simplify the analysis considerably. A more precise likelihood treatment can then be applied to candidate signatures only. We also discuss a modal expansion approach for the power spectrum, aimed at quickly scanning through large families of oscillating models.
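The shortcut described in the abstract above, drawing feature amplitudes from a multivariate Gaussian whose covariance is set by the Fisher matrix, can be sketched for the two-parameter (sine and cosine) case as follows. This is only an illustration under stated assumptions: the covariance is taken to be the inverse of a hypothetical 2x2 Fisher matrix, the mean is zero (the Gaussian, no-feature case), and every function name and number is invented for the example.

```python
import math
import random


def invert_2x2(m):
    """Inverse of a symmetric positive-definite 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det <= 0:
        raise ValueError("matrix must be positive definite")
    return [[d / det, -b / det], [-c / det, a / det]]


def cholesky_2x2(cov):
    """Lower-triangular factor L with L * L^T equal to cov."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    return [[l11, 0.0], [l21, l22]]


def draw_amplitudes(fisher, n_draws, seed=0):
    """Draw zero-mean (sine, cosine) amplitude pairs with covariance
    equal to the inverse of the supplied Fisher matrix."""
    rng = random.Random(seed)
    chol = cholesky_2x2(invert_2x2(fisher))
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((chol[0][0] * z1, chol[1][0] * z1 + chol[1][1] * z2))
    return draws


# Hypothetical Fisher matrix for the sine and cosine amplitudes.
toy_fisher = [[4.0, 1.0], [1.0, 1.0]]
toy_cov = invert_2x2(toy_fisher)
toy_chol = cholesky_2x2(toy_cov)
toy_draws = draw_amplitudes(toy_fisher, 1000)
```

The empirical distribution of a statistic computed on such draws (for example the maximum amplitude over frequencies) can then be compared with the value measured in the data.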
Anyfanti, Panagiota; Triantafyllou, Areti; Gkaliagkousi, Eugenia; Triantafyllou, Georgios; Koletsos, Nikolaos; Chatzimichailidou, Sophia; Panagopoulos, Panagiotis; Botis, Ioannis; Aslanidis, Spyros; Douma, Stella
2017-06-01
Cardiac involvement is common in rheumatoid arthritis. Subendocardial viability ratio (SEVR) is a non-invasive measure of microvascular coronary perfusion, yet it remains unclear whether it is affected in rheumatoid arthritis patients. We additionally sought predictors of SEVR in rheumatoid arthritis among a wide range of disease-related parameters, cardiac and hemodynamic factors, and markers of atherosclerosis, arteriosclerosis, and endothelial dysfunction. SEVR was estimated in rheumatoid arthritis patients and healthy controls by applanation tonometry, which was also used to evaluate arterial stiffness (pulse wave velocity and augmentation index). In the rheumatoid arthritis group, carotid intima-media thickness (cIMT) was additionally estimated by ultrasound, cardiac and hemodynamic parameters by impedance cardiography, and endothelial dysfunction by measurement of asymmetric dimethylarginine (ADMA). In a total of 122 participants, SEVR was lower among 91 patients with rheumatoid arthritis compared to 31 controls (141.4 ± 21.9 vs 153.1 ± 18.7%, p = 0.009) and remained so among 29 rheumatoid arthritis patients without hypertension, diabetes, or cardiovascular diseases, compared to the control group (139.7 ± 21.7 vs 153.1 ± 18.7%, p = 0.013). SEVR did not significantly correlate with arterial stiffness, cIMT, ADMA, or disease-related parameters. Multivariate analysis revealed gender (p = 0.007), blood pressure (p = 0.028), heart rate (p = 0.025), cholesterol levels (p = 0.008), cardiac index (p < 0.001) and left ventricular ejection time (p = 0.004) as independent predictors of SEVR among patients with rheumatoid arthritis. Patients with rheumatoid arthritis exhibit lower values of SEVR compared to healthy individuals. Cardiac and hemodynamic parameters, rather than functional indices of endothelial and macrovascular dysfunction, may be useful as predictors of myocardial perfusion in rheumatoid arthritis.
Estimating correlation between multivariate longitudinal data in the presence of heterogeneity.
Gao, Feng; Philip Miller, J; Xiong, Chengjie; Luo, Jingqin; Beiser, Julia A; Chen, Ling; Gordon, Mae O
2017-08-17
Estimating correlation coefficients among outcomes is one of the most important analytical tasks in epidemiological and clinical research. Availability of multivariate longitudinal data presents a unique opportunity to assess joint evolution of outcomes over time. Bivariate linear mixed model (BLMM) provides a versatile tool with regard to assessing correlation. However, BLMMs often assume that all individuals are drawn from a single homogenous population where the individual trajectories are distributed smoothly around population average. Using longitudinal mean deviation (MD) and visual acuity (VA) from the Ocular Hypertension Treatment Study (OHTS), we demonstrated strategies to better understand the correlation between multivariate longitudinal data in the presence of potential heterogeneity. Conditional correlation (i.e., marginal correlation given random effects) was calculated to describe how the association between longitudinal outcomes evolved over time within specific subpopulation. The impact of heterogeneity on correlation was also assessed by simulated data. There was a significant positive correlation in both random intercepts (ρ = 0.278, 95% CI: 0.121-0.420) and random slopes (ρ = 0.579, 95% CI: 0.349-0.810) between longitudinal MD and VA, and the strength of correlation constantly increased over time. However, conditional correlation and simulation studies revealed that the correlation was induced primarily by participants with rapid deteriorating MD who only accounted for a small fraction of total samples. Conditional correlation given random effects provides a robust estimate to describe the correlation between multivariate longitudinal data in the presence of unobserved heterogeneity (NCT00000125).
Marques, Pedro; Leite, Valeriano; Bugalho, Maria João
2014-01-01
Background: Papillary thyroid carcinoma (PTC) is the most common thyroid cancer. The widespread use of neck ultrasound (US) and US-guided fine-needle aspiration cytology is triggering an overdiagnosis of PTC. Objective: To evaluate clinical behavior and outcomes of patients with PTCs ≤2 cm, seeking possible prognostic factors. Methods: Clinical records of cases with histological diagnosis of PTC ≤2 cm followed at the Endocrine Department of Instituto Português de Oncologia, Lisbon between 2002 and 2006 were analyzed retrospectively. Results: We identified 255 PTCs, of which 111 were microcarcinomas. Most patients underwent near-total thyroidectomy, with lymph node dissections in 55 cases (21.6%). Radioiodine therapy was administered in 184 patients. At the last evaluation, 38 (14.9%) had evidence of disease. Two deaths were attributed to PTC. Median (±SD) follow-up was 74 (±23) months. Multivariate analysis identified vascular invasion, lymph node and systemic metastases significantly associated with recurrence/persistence of disease. In addition, lymph node involvement was significantly associated with extrathyroidal extension and angioinvasion. Median (±SD) disease-free survival (DFS) was estimated as 106 (±3) months and the 5-year DFS rate was 87.5%. Univariate Cox analysis identified some relevant parameters for DFS, but multivariate regression only identified lymph node and systemic metastases as significant independent factors. The median DFS estimated for lymph node and systemic metastases was 75 and 0 months, respectively. Conclusions: In the setting of small PTCs, vascular invasion, extrathyroidal extension and lymph node and/or systemic metastases may confer worse prognosis, perhaps justifying more aggressive therapeutic and follow-up approaches in such cases. PMID:25759803
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fietkau, Rainer; Roedel, Claus; Hohenberger, Werner
2007-03-15
Purpose: The impact of the delivery of radiotherapy (RT) on treatment results in rectal cancer patients is unknown. Methods and Materials: The data from 788 patients with rectal cancer treated within the German CAO/AIO/ARO-94 phase III trial were analyzed concerning the impact of the delivery of RT (adequate RT: minimal radiation RT dose delivered, 4300 cGy for neoadjuvant RT or 4700 cGy for adjuvant RT; completion of RT in <44 days for neoadjuvant RT or <49 days for adjuvant RT) in different centers on the locoregional recurrence rate (LRR) and disease-free survival (DFS) at 5 years. The LRR, DFS, and delivery of RT were analyzed as endpoints in multivariate analysis. Results: A significant difference was found between the centers and the delivery of RT. The overall delivery of RT was a prognostic factor for the LRR (no RT, 29.6% ± 7.8%; inadequate RT, 21.2% ± 5.6%; adequate RT, 6.8% ± 1.4%; p = 0.0001) and DFS (no RT, 55.1% ± 9.1%; inadequate RT, 57.4% ± 6.3%; adequate RT, 69.1% ± 2.3%; p = 0.02). Postoperatively, delivery of RT was a prognostic factor for LRR on multivariate analysis (together with pathologic stage) but not for DFS (independent parameters, pathologic stage and age). Preoperatively, on multivariate analysis, pathologic stage, but not delivery of RT, was an independent prognostic parameter for LRR and DFS (together with adequate chemotherapy). On multivariate analysis, the treatment center, treatment schedule (neoadjuvant vs. adjuvant RT), and gender were prognostic parameters for adequate RT. Conclusion: Delivery of RT should be regarded as a prognostic factor for LRR in rectal cancer and is influenced by the treatment center, treatment schedule, and patient gender.
A Hybrid Index for Characterizing Drought Based on a Nonparametric Kernel Estimator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Shengzhi; Huang, Qiang; Leng, Guoyong
This study develops a nonparametric multivariate drought index, namely, the Nonparametric Multivariate Standardized Drought Index (NMSDI), by considering the variations of both precipitation and streamflow. Building upon previous efforts in constructing a Nonparametric Multivariate Drought Index, we use the nonparametric kernel estimator to derive the joint distribution of precipitation and streamflow, thus providing additional insights in drought index development. The proposed NMSDI is applied in the Wei River Basin (WRB), based on which the drought evolution characteristics are investigated. Results indicate: (1) generally, NMSDI captures the drought onset similarly to the Standardized Precipitation Index (SPI) and drought termination and persistence similarly to the Standardized Streamflow Index (SSFI). The drought events identified by NMSDI match well with historical drought records in the WRB. The performance is also consistent with that of an existing Multivariate Standardized Drought Index (MSDI) at various timescales, confirming the validity of the newly constructed NMSDI in drought detection; (2) an increasing risk of drought has been detected for the past decades, and will persist to a certain extent in the future in most areas of the WRB; (3) the identified change points of annual NMSDI are mainly concentrated in the early 1970s and middle 1990s, coincident with extensive water use and soil reservation practices. This study highlights the nonparametric multivariable drought index, which can be used for drought detection and prediction efficiently and comprehensively.
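The idea of deriving a joint distribution of precipitation and streamflow with a kernel estimator and turning it into a standardized index can be illustrated with a minimal sketch. The abstract does not give the NMSDI formula, so everything here is an assumption for illustration: the function name `kernel_joint_index`, the Gaussian product kernel, the Scott-type bandwidth, and the final mapping of the joint probability through the inverse standard normal CDF.

```python
import numpy as np
from scipy.stats import norm


def kernel_joint_index(precip, flow):
    """Illustrative joint drought index (hypothetical helper, not the paper's code).

    Estimates the joint CDF P(P <= p, Q <= q) at every observation with a
    Gaussian product kernel, then maps it to a standard normal quantile.
    """
    precip = np.asarray(precip, dtype=float)
    flow = np.asarray(flow, dtype=float)
    n = precip.size

    # Assumed bandwidths: Scott-type rule for two dimensions, h = sigma * n^(-1/6).
    h_p = precip.std(ddof=1) * n ** (-1.0 / 6.0)
    h_q = flow.std(ddof=1) * n ** (-1.0 / 6.0)

    # Pairwise standardized differences; row i is the evaluation point,
    # column j is the kernel centre.
    z_p = (precip[:, None] - precip[None, :]) / h_p
    z_q = (flow[:, None] - flow[None, :]) / h_q

    # Kernel estimate of the joint CDF: average of products of kernel CDFs.
    joint_cdf = np.mean(norm.cdf(z_p) * norm.cdf(z_q), axis=1)

    # Keep probabilities away from 0 and 1 so the quantile stays finite.
    joint_cdf = np.clip(joint_cdf, 1e-6, 1.0 - 1e-6)
    return norm.ppf(joint_cdf)


# Made-up monthly values, for demonstration only.
demo_precip = [10.0, 50.0, 80.0, 120.0, 60.0]
demo_flow = [5.0, 30.0, 45.0, 70.0, 20.0]
demo_index = kernel_joint_index(demo_precip, demo_flow)
```

In this sketch, lower index values correspond to months that are dry in both variables at once; the month with the lowest precipitation and the lowest streamflow always receives the lowest index because the estimated joint CDF is increasing in both arguments.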
NASA Astrophysics Data System (ADS)
Tong, M.; Xue, M.
2006-12-01
An important source of model error for convective-scale data assimilation and prediction is microphysical parameterization. This study investigates the possibility of estimating up to five fundamental microphysical parameters, which are closely involved in the definition of the drop size distribution of microphysical species in a commonly used single-moment ice microphysics scheme, using radar observations and the ensemble Kalman filter method. The five parameters include the intercept parameters for rain, snow and hail/graupel, and the bulk densities of hail/graupel and snow. Parameter sensitivity and identifiability are first examined. The ensemble square-root Kalman filter (EnSRF) is employed for simultaneous state and parameter estimation. Observing system simulation (OSS) experiments are performed for a model-simulated supercell storm, in which the five microphysical parameters are estimated individually or in different combinations starting from different initial guesses. When error exists in only one of the microphysical parameters, the parameter can be successfully estimated without exception. The estimation of multiple parameters is found to be less robust, with the end results of estimation being sensitive to the realization of the initial parameter perturbation. This is believed to be because of reduced parameter identifiability and the existence of non-unique solutions. The results of state estimation are, however, always improved when simultaneous parameter estimation is performed, even when the estimated parameter values are not accurate.
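Simultaneous state and parameter estimation with an EnSRF is commonly done by appending the uncertain parameter to the state vector, so that the ensemble covariance between the observed state and the parameter drives the parameter update. The sketch below shows this on a toy scalar model and is not the storm-scale system of the study: the model `x_next = a * x + 1`, the helper names `ensrf_update` and `run_experiment`, and all numerical settings are assumptions chosen for illustration. The update uses the standard serial square-root form for a single scalar observation, with a reduced gain applied to the perturbations.

```python
import numpy as np


def ensrf_update(ens, obs, obs_var):
    """Serial EnSRF analysis for one scalar observation of row 0 of `ens`.

    ens has shape (n_vars, n_members); row 0 is the observed state and the
    remaining rows are augmented variables (here: the unknown parameter).
    """
    n_members = ens.shape[1]
    mean = ens.mean(axis=1, keepdims=True)
    pert = ens - mean
    hx_pert = pert[0]  # perturbations in observation space

    hph = hx_pert @ hx_pert / (n_members - 1)  # ensemble variance of H x
    pht = pert @ hx_pert / (n_members - 1)     # covariance of each row with H x
    gain = pht / (hph + obs_var)               # Kalman gain for the mean
    # Reduced-gain factor so the perturbations match the analysis covariance.
    alpha = 1.0 / (1.0 + np.sqrt(obs_var / (hph + obs_var)))

    new_mean = mean + gain[:, None] * (obs - mean[0, 0])
    new_pert = pert - alpha * gain[:, None] * hx_pert[None, :]
    return new_mean + new_pert


def run_experiment(true_a=0.9, n_members=40, n_steps=30, obs_std=0.5, seed=0):
    """Toy twin experiment (hypothetical): estimate `a` in x_next = a * x + 1."""
    rng = np.random.default_rng(seed)
    # Row 0: state ensemble; row 1: parameter ensemble from a biased first guess.
    ens = np.vstack([
        rng.normal(0.0, 1.0, n_members),
        rng.normal(0.7, 0.1, n_members),
    ])
    prior_spread = ens[1].std(ddof=1)

    x_true = 0.0
    for _ in range(n_steps):
        # Forecast: truth uses the true parameter, each member uses its own.
        x_true = true_a * x_true + 1.0
        ens[0] = ens[1] * ens[0] + 1.0  # the parameter itself is persisted
        # Simulated observation of the true state.
        obs = x_true + rng.normal(0.0, obs_std)
        ens = ensrf_update(ens, obs, obs_std ** 2)

    return ens[1].mean(), prior_spread, ens[1].std(ddof=1)


a_estimate, spread_before, spread_after = run_experiment()
```

Because the parameter is never perturbed during the forecast step, its ensemble spread can only shrink in this sketch; practical systems often add some form of spread inflation to avoid filter collapse, which is omitted here for brevity.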