Sample records for Bayesian linear models

  1. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
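
The Bayesian ridge regression named above places a common Gaussian prior on marker effects; with fixed variance hyperparameters its posterior mean has a closed form. A minimal NumPy sketch on simulated marker data (the dimensions, hyperparameters, and effect sizes are illustrative, not the study's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                            # lines, markers (simulated)
X = rng.choice([0.0, 1.0], size=(n, p))   # DArT-like binary markers
beta_true = rng.normal(0, 0.1, p)
y = X @ beta_true + rng.normal(0, 1.0, n)

sigma2, tau2 = 1.0, 0.01                  # noise and prior variances (assumed known)
lam = sigma2 / tau2
# Posterior mean of marker effects: (X'X + lam I)^-1 X'y
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
pred = X @ beta_hat
corr = np.corrcoef(pred, y)[0, 1]         # in-sample predictive correlation
```

Bayes A, Bayes B, and the Bayesian LASSO differ only in the prior placed on the marker effects; they replace the single shared shrinkage parameter `lam` with marker-specific shrinkage.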

  2. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  3. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  4. Incorporation of diet information derived from Bayesian stable isotope mixing models into mass-balanced marine ecosystem models: A case study from the Marennes-Oleron Estuary, France

    EPA Science Inventory

    We investigated the use of output from Bayesian stable isotope mixing models as constraints for a linear inverse food web model of a temperate intertidal seagrass system in the Marennes-Oléron Bay, France. Linear inverse modeling (LIM) is a technique that estimates a complete net...

  5. Mixed linear-non-linear inversion of crustal deformation data: Bayesian inference of model, weighting and regularization parameters

    NASA Astrophysics Data System (ADS)

    Fukuda, Jun'ichi; Johnson, Kaj M.

    2010-06-01

    We present a unified theoretical framework and solution method for probabilistic, Bayesian inversions of crustal deformation data. The inversions involve multiple data sets with unknown relative weights, model parameters that are related linearly or non-linearly through theoretical models to observations, prior information on model parameters and regularization priors to stabilize underdetermined problems. To efficiently handle non-linear inversions in which some of the model parameters are linearly related to the observations, this method combines both analytical least-squares solutions and a Monte Carlo sampling technique. In this method, model parameters that are linearly and non-linearly related to observations, relative weights of multiple data sets and relative weights of prior information and regularization priors are determined in a unified Bayesian framework. In this paper, we define the mixed linear-non-linear inverse problem, outline the theoretical basis for the method, provide a step-by-step algorithm for the inversion, validate the inversion method using synthetic data and apply the method to two real data sets. We apply the method to inversions of multiple geodetic data sets with unknown relative data weights for interseismic fault slip and locking depth. We also apply the method to the problem of estimating the spatial distribution of coseismic slip on faults with unknown fault geometry, relative data weights and smoothing regularization weight.
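
The core idea of the mixed linear-non-linear approach, solving for the linearly related parameters analytically inside a sampling or grid search over the non-linear ones, can be sketched on a toy model y = a·exp(-b·t), where a is the linear and b the non-linear parameter (flat priors assumed; this is not the authors' geodetic implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 5, 40)
a_true, b_true, sd = 2.0, 0.7, 0.05
y = a_true * np.exp(-b_true * t) + rng.normal(0, sd, t.size)

# Grid over the non-linear parameter b; solve the linear parameter a analytically
b_grid = np.linspace(0.1, 2.0, 400)
log_post = np.empty(b_grid.size)
a_hat = np.empty(b_grid.size)
for i, b in enumerate(b_grid):
    g = np.exp(-b * t)                    # design column for the linear part
    a = (g @ y) / (g @ g)                 # analytic least-squares solution for a
    r = y - a * g
    log_post[i] = -0.5 * (r @ r) / sd**2  # log-likelihood profile (flat priors)
    a_hat[i] = a

w = np.exp(log_post - log_post.max())
w /= w.sum()
b_mean = w @ b_grid                        # posterior mean of b
```

In the paper's setting the grid over b is replaced by Monte Carlo sampling, but the analytic inner solve for the linear variables is the same trick.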

  6. Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

    PubMed Central

    Zhao, Xin; Cheung, Leo Wang-Kit

    2007-01-01

    Background Designing appropriate machine learning methods for identifying genes that have significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at the genomic level. Although many machine learning methods have been developed and applied to microarray gene expression data analysis, the majority of them are based on linear models, which are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear-model-based methods also tend to bring in false-positive significant features more easily. Furthermore, linear-model-based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large, which leads to numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have two critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potential for achieving this goal. Results A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. 
Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound, not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability for microarray data analysis problems, especially those for which linear methods work awkwardly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than, or at least as well as, any of the referenced state-of-the-art methods in all of these cases. Conclusion Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently. PMID:17328811

  7. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.

    PubMed

    Duarte, Belmiro P M; Wong, Weng Kee

    2015-08-01

    This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
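
The Bayesian design criterion described here, an expectation of the log-determinant of the information matrix computed by Gaussian quadrature, can be illustrated for a one-parameter logistic model with a symmetric two-point design. The sketch below uses simple grid enumeration of the design space rather than SDP, the intercept is fixed at zero, and the normal prior on the slope is hypothetical:

```python
import numpy as np

# Prior on the logistic slope: b ~ N(1.0, 0.3^2); intercept fixed at 0 (assumption)
mu, sd = 1.0, 0.3
gh_x, gh_w = np.polynomial.hermite.hermgauss(20)
b_nodes = mu + np.sqrt(2) * sd * gh_x
b_w = gh_w / np.sqrt(np.pi)          # quadrature weights for the normal prior

def bayes_d(x):
    """Expected log-det information for a two-point design at +/-x, weight 1/2 each."""
    p = 1.0 / (1.0 + np.exp(-b_nodes * x))
    logdet = 2 * np.log(p * (1 - p)) + 2 * np.log(x)   # det M = (p(1-p))^2 x^2
    return np.sum(b_w * logdet)

# Discretise the design space and pick the best candidate (enumeration, not SDP)
grid = np.linspace(0.05, 5.0, 200)
crit = np.array([bayes_d(x) for x in grid])
x_star = grid[np.argmax(crit)]
```

For a known slope b = 1 the locally D-optimal two-point design sits near x = 1.54; averaging over the prior moves the optimum only slightly, which the enumeration recovers.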

  8. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach

    PubMed Central

    Duarte, Belmiro P. M.; Wong, Weng Kee

    2014-01-01

    Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159

  9. An introduction to using Bayesian linear regression with clinical data.

    PubMed

    Baldwin, Scott A; Larson, Michael J

    2017-11-01

    Statistical training in psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods, as well as how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
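
For a single-predictor model like the ERN-anxiety regression, the Bayesian machinery reduces to a conjugate update when the residual variance is treated as known. A NumPy sketch on simulated data (the prior, effect size, and noise level are illustrative, not the article's):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
anxiety = rng.normal(0, 1, n)                 # simulated trait-anxiety scores
ern = -0.5 * anxiety + rng.normal(0, 1, n)    # simulated ERN amplitudes

X = np.column_stack([np.ones(n), anxiety])    # intercept + slope
sigma2 = 1.0                                  # residual variance (assumed known)
prior_cov = np.eye(2) * 10.0                  # weakly informative N(0, 10 I) prior

# Conjugate posterior: N(m, V) with V = (X'X/s2 + S0^-1)^-1, m = V X'y / s2
V = np.linalg.inv(X.T @ X / sigma2 + np.linalg.inv(prior_cov))
m = V @ (X.T @ ern) / sigma2
slope, slope_sd = m[1], np.sqrt(V[1, 1])
ci = (slope - 1.96 * slope_sd, slope + 1.96 * slope_sd)  # approx. 95% credible interval
```

The article's workflow (MCMC, convergence checks, posterior predictive checks) generalises this to models without closed-form posteriors.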

  10. Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks

    NASA Astrophysics Data System (ADS)

    Sun, Wei; Chang, K. C.

    2005-05-01

    Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only approximate methods such as stochastic sampling can provide a solution under a given time constraint. Several simulation methods are currently available: logic sampling (the first proposed stochastic method for Bayesian networks), the likelihood weighting algorithm (the most commonly used simulation method because of its simplicity and efficiency), the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these simulation methods, then propose an improved importance sampling algorithm, called the linear Gaussian importance sampling algorithm for general hybrid models (LGIS). LGIS is aimed at hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses a linear function and Gaussian additive noise to approximate the true conditional probability distribution of a continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal importance function from previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. A performance comparison with other well-known methods such as junction tree (JT) and likelihood weighting (LW) shows that LGIS is very promising.
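
Likelihood weighting, the baseline that LGIS improves upon, is easy to state for a minimal hybrid network: one discrete root with a linear-Gaussian child. Sampling the root from its prior and weighting each sample by the likelihood of the continuous evidence recovers the exact posterior (all numbers below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(x, m, sd=1.0):
    """Gaussian density N(x; m, sd^2)."""
    return np.exp(-0.5 * ((x - m) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

p_a1, mu = 0.3, {0: 0.0, 1: 2.0}   # discrete root A, linear-Gaussian child X | A
x_obs = 1.5                        # evidence on the continuous node

# Likelihood weighting: sample unobserved nodes, weight by evidence likelihood
a = (rng.random(100_000) < p_a1).astype(int)
w = phi(x_obs, np.where(a == 1, mu[1], mu[0]))
p_post = (w * a).sum() / w.sum()   # estimate of P(A=1 | X=1.5)

# Exact answer by Bayes' rule, for comparison
exact = p_a1 * phi(x_obs, mu[1]) / (
    p_a1 * phi(x_obs, mu[1]) + (1 - p_a1) * phi(x_obs, mu[0]))
```

LGIS replaces the fixed prior proposal with an adaptively learned linear-Gaussian importance function, which matters when the evidence is far in the tails.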

  11. Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

    PubMed

    Kärkkäinen, Hanni P; Sillanpää, Mikko J

    2013-09-04

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly applied Gaussian model with censored Gaussian data, while with binary or ordinal data the superiority of the threshold model could not be confirmed.

  12. Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

    PubMed Central

    Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

    2013-01-01

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly applied Gaussian model with censored Gaussian data, while with binary or ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618

  13. A Bayes linear Bayes method for estimation of correlated event rates.

    PubMed

    Quigley, John; Wilson, Kevin J; Walls, Lesley; Bedford, Tim

    2013-12-01

    Typically, full Bayesian estimation of correlated event rates can be computationally challenging since estimators are intractable. When estimation of event rates represents one activity within a larger modeling process, there is an incentive to develop more efficient inference than provided by a full Bayesian model. We develop a new subjective inference method for correlated event rates based on a Bayes linear Bayes model under the assumption that events are generated from a homogeneous Poisson process. To reduce the elicitation burden we introduce homogenization factors to the model and, as an alternative to a subjective prior, an empirical method using the method of moments is developed. Inference under the new method is compared against estimates obtained under a full Bayesian model, which takes a multivariate gamma prior, where the predictive and posterior distributions are derived in terms of well-known functions. The mathematical properties of both models are presented. A simulation study shows that the Bayes linear Bayes inference method and the full Bayesian model provide equally reliable estimates. An illustrative example, motivated by a problem of estimating correlated event rates across different users in a simple supply chain, shows how ignoring the correlation leads to biased estimation of event rates. © 2013 Society for Risk Analysis.
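
The Bayes linear adjustment at the heart of such a model needs only first- and second-order prior moments: the adjusted expectation is E(X) + Cov(X,D) Var(D)^-1 (D - E(D)), with a matching reduction in variance. A sketch for two correlated Poisson event rates, with invented prior moments rather than elicited ones:

```python
import numpy as np

# Prior moments for two correlated event rates (illustrative numbers)
ex = np.array([0.10, 0.20])            # E[X]: prior expected rates
vx = np.array([[0.004, 0.002],
               [0.002, 0.006]])        # Var[X]: prior covariance

# Observed counts D over exposure time T, with D | X ~ Poisson(X*T)
T = 100.0
d = np.array([14.0, 18.0])
ed = ex * T                            # E[D]
cov_xd = vx * T                        # Cov[X, D] = Var[X] * T
var_d = vx * T**2 + np.diag(ex * T)    # Var[D]: rate uncertainty + Poisson noise

# Bayes linear adjusted expectation and variance
gain = cov_xd @ np.linalg.inv(var_d)
ex_adj = ex + gain @ (d - ed)
vx_adj = vx - gain @ cov_xd.T
```

Note how the second rate is adjusted downwards even though only moments (no full likelihood) are used: the correlation propagates the below-expectation second count into the update, exactly the effect the abstract warns is lost when correlation is ignored.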

  14. Bayesian model reduction and empirical Bayes for group (DCM) studies

    PubMed Central

    Friston, Karl J.; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E.; van Wijk, Bernadette C.M.; Ziegler, Gabriel; Zeidman, Peter

    2016-01-01

    This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level – e.g., dynamic causal models – and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
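
For a single Gaussian parameter, Bayesian model reduction with a point-mass reduced prior collapses to the Savage-Dickey density ratio: the reduced model's log evidence differs from the full model's by log p(0|y) - log p(0), so no refitting is needed. A sketch with illustrative prior and posterior moments (not an actual DCM inversion):

```python
import numpy as np

def norm_logpdf(x, m, sd):
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((x - m) / sd) ** 2

# Full-model prior and fitted posterior for one group-level parameter
prior_mu, prior_sd = 0.0, 1.0
post_mu, post_sd = 0.8, 0.2      # illustrative posterior after model inversion

# Reduced model pins the parameter to 0 (Savage-Dickey ratio, a special
# case of Bayesian model reduction with a point-mass reduced prior)
log_bf_reduced = norm_logpdf(0.0, post_mu, post_sd) - norm_logpdf(0.0, prior_mu, prior_sd)
```

Here the strongly negative log Bayes factor counts against the reduced model, i.e. the parameter should be retained; scoring many reduced models this way, from one full inversion, is what makes the group analyses in the note fast.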

  15. An efficient method for model refinement in diffuse optical tomography

    NASA Astrophysics Data System (ADS)

    Zirak, A. R.; Khademi, M.

    2007-11-01

    Diffuse optical tomography (DOT) is a non-linear, ill-posed boundary-value and optimization problem that necessitates regularization. Bayesian methods are also suitable because the measurement data are sparse and correlated. In such problems, which are solved with iterative methods, the solution space must be kept small for stabilization and better convergence. These constraints lead to an extensive, overdetermined system of equations, for which model-retrieval criteria, especially total least squares (TLS), must refine the model error. TLS, however, is limited to linear systems, which is not achievable when applying traditional Bayesian methods. This paper presents an efficient method for model refinement using regularized total least squares (RTLS) to treat the linearized DOT problem, with a maximum a posteriori (MAP) estimator and a Tikhonov regularizer. This is done by combining Bayesian and regularization tools as preconditioner matrices, applying them to the equations, and then applying RTLS to the resulting linear equations. The preconditioning matrices are guided by patient-specific information as well as a priori knowledge gained from the training set. Simulation results illustrate that the proposed method improves image reconstruction performance and localizes the abnormality well.

  16. Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data.

    PubMed

    Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger

    2017-01-01

    Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.

  17. Bayesian model reduction and empirical Bayes for group (DCM) studies.

    PubMed

    Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter

    2016-03-01

    This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Bayesian Models for Astrophysical Data Using R, JAGS, Python, and Stan

    NASA Astrophysics Data System (ADS)

    Hilbe, Joseph M.; de Souza, Rafael S.; Ishida, Emille E. O.

    2017-05-01

    This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.

  19. Bayesian Approach to the Joint Inversion of Gravity and Magnetic Data, with Application to the Ismenius Area of Mars

    NASA Technical Reports Server (NTRS)

    Jewell, Jeffrey B.; Raymond, C.; Smrekar, S.; Millbury, C.

    2004-01-01

    This viewgraph presentation reviews a Bayesian approach to the inversion of gravity and magnetic data, with specific application to the Ismenius area of Mars. Many inverse problems encountered in geophysics and planetary science are well known to be non-unique (e.g., inversion of gravity data for the density structure of a body). In hopes of reducing the non-uniqueness of solutions, there has been interest in the joint analysis of data. An example is the joint inversion of gravity and magnetic data, with the assumption that the same physical anomalies generate both the observed magnetic and gravitational anomalies. In this talk, we formulate the joint analysis of different types of data in a Bayesian framework and apply the formalism to the inference of the density and remanent magnetization structure for a local region in the Ismenius area of Mars. The Bayesian approach allows prior information or constraints on the solutions to be incorporated in the inversion, with the "best" solutions those whose forward predictions most closely match the data while remaining consistent with the assumed constraints. The application of this framework to the inversion of gravity and magnetic data on Mars reveals two typical challenges: the forward predictions of the data have a linear dependence on some of the quantities of interest and a non-linear dependence on others (termed the "linear" and "non-linear" variables, respectively). For observations with Gaussian noise, a Bayesian approach to inversion for the "linear" variables reduces to a linear filtering problem, with an explicitly computable "error" matrix. However, for models whose forward predictions have non-linear dependencies, inference is no longer given by such a simple linear problem, and moreover, the uncertainty in the solution is no longer completely specified by a computable "error" matrix. 
It is therefore important to develop methods for sampling from the full Bayesian posterior to provide a complete and statistically consistent picture of model uncertainty, and what has been learned from observations. We will discuss advanced numerical techniques, including Monte Carlo Markov
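
For the "linear" variables under Gaussian noise, the linear filtering problem mentioned above has a standard closed form: the posterior ("error") covariance is (G' S^-1 G + P^-1)^-1. A NumPy sketch with a random forward operator standing in for the gravity/magnetics kernels (all sizes and variances invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n_obs, n_par = 30, 5
G = rng.normal(size=(n_obs, n_par))        # linear forward operator (illustrative)
m_true = rng.normal(size=n_par)
noise_sd = 0.1
d = G @ m_true + rng.normal(0, noise_sd, n_obs)

Sigma_inv = np.eye(n_obs) / noise_sd**2    # Gaussian data precision
P_inv = np.eye(n_par) / 4.0                # prior precision on model parameters

# Posterior ("error") covariance and mean: a linear filtering problem
err_cov = np.linalg.inv(G.T @ Sigma_inv @ G + P_inv)
m_post = err_cov @ (G.T @ Sigma_inv @ d)
```

When the forward model also depends non-linearly on other variables, this solve becomes the inner step of a sampler over those variables, as in the Monte Carlo methods the presentation turns to.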

  20. Spectral likelihood expansions for Bayesian inference

    NASA Astrophysics Data System (ADS)

    Nagel, Joseph B.; Sudret, Bruno

    2016-03-01

    A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
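
The quantities-from-coefficients idea can be shown in one dimension: with a uniform prior on [-1, 1], projecting the likelihood onto Legendre polynomials gives the evidence as c_0 and the posterior mean as c_1/(3 c_0), by orthogonality. The observation and noise level below are invented:

```python
import numpy as np
from numpy.polynomial import legendre as L

y_obs, sd = 0.3, 0.5
lik = lambda th: np.exp(-0.5 * ((y_obs - th) / sd) ** 2)  # unnormalised likelihood

# Project the likelihood onto Legendre polynomials over the prior support [-1, 1]
nodes, weights = L.leggauss(40)
K = 8
c = np.array([(2 * k + 1) / 2 *
              np.sum(weights * lik(nodes) * L.legval(nodes, np.eye(K)[k]))
              for k in range(K)])

# Uniform prior on [-1, 1]: evidence and posterior mean follow from coefficients
evidence = c[0]                       # Z = integral of L(th) p(th) dth = c_0
post_mean = c[1] / (3 * c[0])         # E[th | y] = c_1 / (3 c_0)

# Reference values by direct quadrature, for comparison
Z_ref = np.sum(weights * lik(nodes)) / 2
m_ref = np.sum(weights * nodes * lik(nodes)) / 2 / Z_ref
```

The coefficients c are obtained here by linear projection (quadrature); the paper uses linear least squares on likelihood evaluations, but the semi-analytic extraction of evidence and moments is the same.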

  21. Learning oncogenetic networks by reducing to mixed integer linear programming.

    PubMed

    Shahrabi Farahani, Hossein; Lagergren, Jens

    2013-01-01

    Cancer can be a result of an accumulation of different types of genetic mutations, such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and the progression pathways is of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1], a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exist. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.

  22. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.

    PubMed

    Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. The Bayesian approach is becoming popular in data analysis, but most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted under the classical approach and under the Bayesian approach with both non-informative and informative priors, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to construct priors for the 2014 model.
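    The mechanics of an informative prior can be sketched with a conjugate beta-binomial update, where pooled earlier-year counts set the prior for the later year. All counts below are invented for illustration; the study's actual models are generalized linear mixed models, not this simple proportion:

```python
# Illustration of an informative prior (numbers are invented): estimate a
# prevalence for 2014, with the prior built from pooled 2011-2013 counts.
cases_prior, n_prior = 180, 9_000        # pooled 2011-2013 (hypothetical)
cases_2014, n_2014 = 70, 3_000           # 2014 survey (hypothetical)

# Beta-binomial conjugate update: Beta(a, b) prior -> Beta(a + k, b + n - k)
a0, b0 = 1 + cases_prior, 1 + (n_prior - cases_prior)      # informative prior
a1, b1 = a0 + cases_2014, b0 + (n_2014 - cases_2014)
informative_mean = a1 / (a1 + b1)

# Same update under a flat Beta(1, 1) prior, for comparison
a2, b2 = 1 + cases_2014, 1 + (n_2014 - cases_2014)
flat_mean = a2 / (a2 + b2)

print(informative_mean, flat_mean)   # informative prior pulls toward 2011-2013
```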

  3. Boosting Bayesian parameter inference of stochastic differential equation models with methods from statistical physics

    NASA Astrophysics Data System (ADS)

    Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

    2016-04-01

    Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail for models of this kind because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem into the problem of simulating the dynamics of a statistical mechanics system, giving us access to the most sophisticated methods developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automatic differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automated manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
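    The Hamiltonian Monte Carlo step can be illustrated in miniature. The sketch below implements leapfrog HMC for a one-dimensional standard-normal target; it shows the mechanism only and is not the authors' SDE inference code:

```python
import numpy as np

# Minimal Hamiltonian Monte Carlo on a 1D standard-normal target.
# U is the negative log-density; its gradient drives the leapfrog dynamics.
def U(q):      return 0.5 * q * q
def grad_U(q): return q

def hmc(n_samples, eps=0.2, n_leap=20, seed=1):
    rng = np.random.default_rng(seed)
    q, out = 0.0, []
    for _ in range(n_samples):
        p = rng.standard_normal()              # resample momentum
        q_new, p_new = q, p
        p_new -= 0.5 * eps * grad_U(q_new)     # leapfrog integration
        for _ in range(n_leap - 1):
            q_new += eps * p_new
            p_new -= eps * grad_U(q_new)
        q_new += eps * p_new
        p_new -= 0.5 * eps * grad_U(q_new)
        # Metropolis accept/reject on the total energy H = U + kinetic
        dH = (U(q_new) + 0.5 * p_new**2) - (U(q) + 0.5 * p**2)
        if np.log(rng.random()) < -dH:
            q = q_new
        out.append(q)
    return np.array(out)

samples = hmc(5_000)
print(samples.mean(), samples.var())   # should approach 0 and 1
```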

  4. Inference on the Genetic Basis of Eye and Skin Color in an Admixed Population via Bayesian Linear Mixed Models.

    PubMed

    Lloyd-Jones, Luke R; Robinson, Matthew R; Moser, Gerhard; Zeng, Jian; Beleza, Sandra; Barsh, Gregory S; Tang, Hua; Visscher, Peter M

    2017-06-01

    Genetic association studies in admixed populations are underrepresented in the genomics literature, with a key concern for researchers being the adequate control of spurious associations due to population structure. Linear mixed models (LMMs) are well suited for genome-wide association studies (GWAS) because they account for both population stratification and cryptic relatedness and achieve increased statistical power by jointly modeling all genotyped markers. Additionally, Bayesian LMMs allow for more flexible assumptions about the underlying distribution of genetic effects, and can concurrently estimate the proportion of phenotypic variance explained by genetic markers. Using three recently published Bayesian LMMs, Bayes R, BSLMM, and BOLT-LMM, we investigate an existing data set on eye (n = 625) and skin (n = 684) color from Cape Verde, an island nation off West Africa that is home to individuals with a broad range of phenotypic values for eye and skin color due to the mix of West African and European ancestry. We use simulations to demonstrate the utility of Bayesian LMMs for mapping loci and studying the genetic architecture of quantitative traits in admixed populations. The Bayesian LMMs provide evidence for two new pigmentation loci: one for eye color (AHRR) and one for skin color (DDB1). Copyright © 2017 by the Genetics Society of America.
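    The ridge-regression/BLUP machinery underlying such Bayesian LMMs can be sketched in closed form: marker effects share a normal prior, so their posterior mean solves a penalized least-squares system. The example below uses simulated genotypes and assumed variance components; it is not an implementation of Bayes R, BSLMM, or BOLT-LMM:

```python
import numpy as np

# Sketch of the ridge/BLUP idea behind Bayesian LMMs for marker data:
# all marker effects get a common normal prior, giving a closed-form
# posterior mean.  Genotypes and variance components are simulated/assumed.
rng = np.random.default_rng(7)
n, m = 400, 50                            # individuals, markers (simulated)
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)
X -= X.mean(axis=0)                       # centre genotype codes
beta_true = np.zeros(m); beta_true[:5] = 0.5
y = X @ beta_true + rng.standard_normal(n)

sigma2_e, sigma2_b = 1.0, 0.05            # assumed variance components
lam = sigma2_e / sigma2_b
# Posterior mean of marker effects: (X'X + lambda I)^-1 X'y
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# The 5 causal markers should carry most of the estimated signal
print(np.abs(beta_hat[:5]).mean(), np.abs(beta_hat[5:]).mean())
```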

  5. Bayesian Correction for Misclassification in Multilevel Count Data Models.

    PubMed

    Nelson, Tyler; Song, Joon Jin; Chin, Yoo-Mi; Stamey, James D

    2018-01-01

    Covariate misclassification is well known to yield biased estimates in single level regression models. The impact on hierarchical count models has been less studied. A fully Bayesian approach to modeling both the misclassified covariate and the hierarchical response is proposed. Models with a single diagnostic test and with multiple diagnostic tests are considered. Simulation studies show the ability of the proposed model to appropriately account for the misclassification by reducing bias and improving performance of interval estimators. A real data example further demonstrated the consequences of ignoring the misclassification. Ignoring misclassification yielded a model that indicated there was a significant, positive impact on the number of children of females who observed spousal abuse between their parents. When the misclassification was accounted for, the relationship switched to negative, but not significant. Ignoring misclassification in standard linear and generalized linear models is well known to lead to biased results. We provide an approach to extend misclassification modeling to the important area of hierarchical generalized linear models.

  6. An introduction to Bayesian statistics in health psychology.

    PubMed

    Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

    2017-09-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
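    The impact of a prior can be illustrated with the conjugate normal-normal update, where the posterior mean is a precision-weighted average of prior and data. The numbers below are invented and are not from the article's blood-pressure example:

```python
# Conjugate normal-normal update used to illustrate prior impact
# (all numbers are invented, hypothetical effect sizes).
def posterior(prior_mean, prior_var, data_mean, data_var_of_mean):
    w = prior_var / (prior_var + data_var_of_mean)   # weight on the data
    post_mean = (1 - w) * prior_mean + w * data_mean
    post_var = 1 / (1 / prior_var + 1 / data_var_of_mean)
    return post_mean, post_var

data_mean, sem2 = 8.0, 4.0          # observed mean change, squared std. error
informative = posterior(5.0, 2.0, data_mean, sem2)   # tight prior near 5
diffuse     = posterior(0.0, 1e4, data_mean, sem2)   # near-flat prior
print(informative, diffuse)   # informative prior shrinks the estimate toward 5
```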

  7. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors

    PubMed Central

    Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world’s deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. The Bayesian approach is becoming popular in data analysis, but most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted under the classical approach and under the Bayesian approach with both non-informative and informative priors, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to construct priors for the 2014 model. PMID:28257437

  8. Bayesian estimation inherent in a Mexican-hat-type neural network

    NASA Astrophysics Data System (ADS)

    Takiyama, Ken

    2016-05-01

    Brain functions, such as perception, motor control and learning, and decision making, have been explained based on a Bayesian framework, i.e., to decrease the effects of noise inherent in the human nervous system or external environment, our brain integrates sensory and a priori information in a Bayesian optimal manner. However, it remains unclear how Bayesian computations are implemented in the brain. Herein, I address this issue by analyzing a Mexican-hat-type neural network, which was used as a model of the visual cortex, motor cortex, and prefrontal cortex. I analytically demonstrate that the dynamics of an order parameter in the model corresponds exactly to a variational inference of a linear Gaussian state-space model, a Bayesian estimation, when the strength of recurrent synaptic connectivity is appropriately stronger than that of an external stimulus, a plausible condition in the brain. This exact correspondence can reveal the relationship between the parameters in the Bayesian estimation and those in the neural network, providing insight for understanding brain functions.

  9. Bayesian analysis of non-linear differential equation models with application to a gut microbial ecosystem.

    PubMed

    Lawson, Daniel J; Holtrop, Grietje; Flint, Harry

    2011-07-01

    Process models specified by non-linear dynamic differential equations contain many parameters, which often must be inferred from a limited amount of data. We discuss a hierarchical Bayesian approach combining data from multiple related experiments in a meaningful way, which permits more powerful inference than treating each experiment as independent. The approach is illustrated with a simulation study and example data from experiments replicating aspects of the human gut microbial ecosystem. A predictive model is obtained that contains prediction uncertainty caused by uncertainty in the parameters, and we extend the model to capture situations of interest that cannot easily be studied experimentally. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

    ERIC Educational Resources Information Center

    Zwick, Rebecca; Lenaburg, Lubella

    2009-01-01

    In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…

  11. Update on Bayesian Blocks: Segmented Models for Sequential Data

    NASA Technical Reports Server (NTRS)

    Scargle, Jeff

    2017-01-01

    The Bayesian Block algorithm, in wide use in astronomy and other areas, has been improved in several ways. The model for block shape has been generalized to include other than constant signal rate - e.g., linear, exponential, or other parametric models. In addition the computational efficiency has been improved, so that instead of O(N**2) the basic algorithm is O(N) in most cases. Other improvements in the theory and application of segmented representations will be described.
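    As a sketch of the underlying method, the classic dynamic program for Bayesian Blocks on event data (the O(N**2) baseline with the piecewise-constant Poisson fitness, not the improved O(N) variant described above) fits in a few lines:

```python
import numpy as np

# Classic O(N^2) Bayesian Blocks dynamic program for event (point) data,
# with the piecewise-constant Poisson block fitness.  ncp_prior is the
# per-block penalty (an assumed constant here, not the calibrated prior).
def bayesian_blocks(t, ncp_prior=4.0):
    t = np.sort(np.asarray(t, float))
    N = len(t)
    edges = np.concatenate([t[:1], 0.5 * (t[1:] + t[:-1]), t[-1:]])
    block_length = t[-1] - edges
    best = np.zeros(N)
    last = np.zeros(N, dtype=int)
    for K in range(N):
        # fitness of every candidate block ending at datum K
        width = block_length[: K + 1] - block_length[K + 1]
        count = np.arange(K + 1, 0, -1.0)   # events in each candidate block
        fit = count * (np.log(count) - np.log(width)) - ncp_prior
        fit[1:] += best[:K]
        last[K] = np.argmax(fit)
        best[K] = fit[last[K]]
    # walk back through the optimal partition to recover the change points
    change_points, ind = [], N
    while ind > 0:
        change_points.append(ind)
        ind = last[ind - 1]
    change_points.append(0)
    return edges[np.array(change_points[::-1])]

# Dense events on [0,1], sparse on [1,2]: expect a block edge near t = 1
rng = np.random.default_rng(3)
t = np.concatenate([rng.uniform(0, 1, 300), rng.uniform(1, 2, 30)])
blocks = bayesian_blocks(t)
print(blocks)
```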

  12. Identification of transmissivity fields using a Bayesian strategy and perturbative approach

    NASA Astrophysics Data System (ADS)

    Zanini, Andrea; Tanda, Maria Giovanna; Woodbury, Allan D.

    2017-10-01

    The paper deals with the crucial problem of groundwater parameter estimation, which is the basis for efficient modeling and reclamation activities. A hierarchical Bayesian approach is developed: it uses Akaike's Bayesian Information Criterion in order to estimate the hyperparameters (related to the covariance model chosen) and to quantify the unknown noise variance. The transmissivity identification proceeds in two steps: the first, called empirical Bayesian interpolation, uses Y* (Y = lnT) observations to interpolate Y values on a specified grid; the second, called empirical Bayesian update, improves the previous Y estimate through the addition of hydraulic head observations. The relationship between the head and lnT has been linearized through a perturbative solution of the flow equation. In order to test the proposed approach, synthetic aquifers from the literature have been considered. The aquifers in question contain a variety of boundary conditions (both Dirichlet and Neumann type) and scales of heterogeneity (σY² = 1.0 and σY² = 5.3). The estimated transmissivity fields were compared to the true ones. The joint use of Y* and head measurements improves the estimation of Y for both degrees of heterogeneity. Even if the variance of the strong transmissivity field can be considered high for the application of the perturbative approach, the results show the same order of approximation as the non-linear methods proposed in the literature. The procedure allows computing the posterior probability distribution of the target quantities and quantifying the uncertainty in the model prediction. Bayesian updating has advantages relative to both Monte-Carlo (MC) and non-MC approaches: like MC methods, it directly yields the posterior probability distribution of the target quantities, and, like non-MC methods, it has computational times on the order of seconds.

  13. BFLCRM: A BAYESIAN FUNCTIONAL LINEAR COX REGRESSION MODEL FOR PREDICTING TIME TO CONVERSION TO ALZHEIMER’S DISEASE*

    PubMed Central

    Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G

    2015-01-01

    The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer’s disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM. PMID:26900412

  14. A Bayesian approach for estimating under-reported dengue incidence with a focus on non-linear associations between climate and dengue in Dhaka, Bangladesh.

    PubMed

    Sharmin, Sifat; Glass, Kathryn; Viennet, Elvina; Harley, David

    2018-04-01

    Determining the relation between climate and dengue incidence is challenging due to under-reporting of disease and consequent biased incidence estimates. Non-linear associations between climate and incidence compound this. Here, we introduce a modelling framework to estimate dengue incidence from passive surveillance data while incorporating non-linear climate effects. We estimated the true number of cases per month using a Bayesian generalised linear model, developed in stages to adjust for under-reporting. A semi-parametric thin-plate spline approach was used to quantify non-linear climate effects. The approach was applied to data collected from the national dengue surveillance system of Bangladesh. The model estimated that only 2.8% (95% credible interval 2.7-2.8) of all cases in the capital Dhaka were reported through passive case reporting. The optimal mean monthly temperature for dengue transmission is 29℃ and average monthly rainfall above 15 mm decreases transmission. Our approach provides an estimate of true incidence and an understanding of the effects of temperature and rainfall on dengue transmission in Dhaka, Bangladesh.

  15. Bayesian design criteria: computation, comparison, and application to a pharmacokinetic and a pharmacodynamic model.

    PubMed

    Merlé, Y; Mentré, F

    1995-02-01

    In this paper, 3 criteria to design experiments for Bayesian estimation of the parameters of models that are nonlinear in their parameters, when a prior distribution is available, are presented: the determinant of the Bayesian information matrix, the determinant of the pre-posterior covariance matrix, and the expected information provided by an experiment. A procedure to simplify the computation of these criteria is proposed in the case of continuous prior distributions and is compared with the criterion obtained from a linearization of the model about the mean of the prior distribution for the parameters. This procedure is applied to two models commonly encountered in the area of pharmacokinetics and pharmacodynamics: the one-compartment open model with bolus intravenous single-dose injection and the Emax model. They both involve two parameters. Additive as well as multiplicative Gaussian measurement errors are considered, with normal prior distributions. Various combinations of the variances of the prior distribution and of the measurement error are studied. Our attention is restricted to designs with limited numbers of measurements (1 or 2 measurements). This situation often occurs in practice when Bayesian estimation is performed. The optimal Bayesian designs that result vary with the variances of the parameter distribution and with the measurement error. The two-point optimal designs sometimes differ from the D-optimal designs for the mean of the prior distribution and may consist of replicated measurements. For the studied cases, the determinant of the Bayesian information matrix and its linearized form lead to the same optimal designs. In some cases, the pre-posterior covariance matrix can be far from its lower bound, namely, the inverse of the Bayesian information matrix, especially for the Emax model and a multiplicative measurement error. The expected information provided by the experiment and the determinant of the pre-posterior covariance matrix generally lead to the same designs, except for the Emax model with multiplicative measurement error. Results show that these criteria can be easily computed and could be incorporated in modules for designing experiments.

  16. Assessing Local Model Adequacy in Bayesian Hierarchical Models Using the Partitioned Deviance Information Criterion

    PubMed Central

    Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.

    2010-01-01

    Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121
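    The DIC itself is simple to compute from posterior draws. The sketch below does so for a toy normal-mean model (constants in the deviance are dropped, which does not affect pD); it illustrates the global criterion, not the paper's spatial partitioning:

```python
import numpy as np

# DIC from posterior draws for a toy normal-mean model:
# DIC = Dbar + pD, where pD = Dbar - D(posterior mean).
rng = np.random.default_rng(5)
y = rng.normal(2.0, 1.0, size=50)           # data with known sigma = 1

# Posterior for the mean under a flat prior is N(ybar, 1/n); draw from it
draws = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=10_000)

def deviance(mu):
    # -2 log-likelihood up to an additive constant (the constant cancels in pD)
    return np.sum((y - mu) ** 2)

D_bar = np.mean([deviance(mu) for mu in draws])
D_hat = deviance(draws.mean())
pD = D_bar - D_hat                          # effective number of parameters
DIC = D_bar + pD
print(pD, DIC)   # pD should be close to 1: one effective parameter
```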

  17. Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys

    ERIC Educational Resources Information Center

    Si, Yajuan; Reiter, Jerome P.

    2013-01-01

    In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian,…

  18. On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang

    2015-02-01

    The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of the linear model of coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and a multiphase flow computer experiment in a full-scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.

  19. Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data

    PubMed Central

    Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.

    2013-01-01

    Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
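    The algebraic equivalence invoked above can be checked numerically: for grouped data, a Poisson log-linear model with a log(time) offset reproduces the exponential-survival rate estimates. The two-group example below is simulated and uses a hand-rolled IRLS fit to stay dependency-free:

```python
import numpy as np

rng = np.random.default_rng(11)
# Simulated sojourn times for two groups with different transition rates
t0 = rng.exponential(1 / 0.5, size=400)       # group 0, true rate 0.5
t1 = rng.exponential(1 / 2.0, size=400)       # group 1, true rate 2.0

# Exponential-survival MLEs: rate = events / person-time
log_rate0 = np.log(len(t0) / t0.sum())
log_rate1 = np.log(len(t1) / t1.sum())

# Poisson log-linear model on aggregated counts with a log(time) offset,
# fitted by iteratively reweighted least squares (IRLS)
y = np.array([len(t0), len(t1)], float)       # transition counts
offset = np.log([t0.sum(), t1.sum()])         # log exposure
X = np.array([[1.0, 0.0], [1.0, 1.0]])        # intercept + group indicator
beta = np.zeros(2)
for _ in range(50):
    mu = np.exp(X @ beta + offset)
    z = X @ beta + (y - mu) / mu              # working response, offset removed
    W = mu
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

# The Poisson coefficients reproduce the survival log-rates
print(beta[0], log_rate0)                     # baseline log-rate
print(beta[0] + beta[1], log_rate1)           # group-1 log-rate
```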

  20. Development of a Bayesian response-adaptive trial design for the Dexamethasone for Excessive Menstruation study.

    PubMed

    Holm Hansen, Christian; Warner, Pamela; Parker, Richard A; Walker, Brian R; Critchley, Hilary Od; Weir, Christopher J

    2017-12-01

    It is often unclear what specific adaptive trial design features lead to an efficient design which is also feasible to implement. This article describes the preparatory simulation study for a Bayesian response-adaptive dose-finding trial design. Dexamethasone for Excessive Menstruation aims to assess the efficacy of Dexamethasone in reducing excessive menstrual bleeding and to determine the best dose for further study. To maximise learning about the dose response, patients receive placebo or an active dose with randomisation probabilities adapting based on evidence from patients already recruited. The dose-response relationship is estimated using a flexible Bayesian Normal Dynamic Linear Model. Several competing design options were considered including: number of doses, proportion assigned to placebo, adaptation criterion, and number and timing of adaptations. We performed a fractional factorial study using SAS software to simulate virtual trial data for candidate adaptive designs under a variety of scenarios and to invoke WinBUGS for Bayesian model estimation. We analysed the simulated trial results using Normal linear models to estimate the effects of each design feature on empirical type I error and statistical power. Our readily-implemented approach using widely available statistical software identified a final design which performed robustly across a range of potential trial scenarios.

  1. Quantifying temporal trends in fisheries abundance using Bayesian dynamic linear models: A case study of riverine Smallmouth Bass populations

    USGS Publications Warehouse

    Schall, Megan K.; Blazer, Vicki S.; Lorantas, Robert M.; Smith, Geoffrey; Mullican, John E.; Keplinger, Brandon J.; Wagner, Tyler

    2018-01-01

    Detecting temporal changes in fish abundance is an essential component of fisheries management. Because of the need to understand short‐term and nonlinear changes in fish abundance, traditional linear models may not provide adequate information for management decisions. This study highlights the utility of Bayesian dynamic linear models (DLMs) as a tool for quantifying temporal dynamics in fish abundance. To achieve this goal, we quantified temporal trends of Smallmouth Bass Micropterus dolomieu catch per effort (CPE) from rivers in the mid‐Atlantic states, and we calculated annual probabilities of decline from the posterior distributions of annual rates of change in CPE. We were interested in annual declines because of recent concerns about fish health in portions of the study area. In general, periods of decline were greatest within the Susquehanna River basin, Pennsylvania. The declines in CPE began in the late 1990s—prior to observations of fish health problems—and began to stabilize toward the end of the time series (2011). In contrast, many of the other rivers investigated did not have the same magnitude or duration of decline in CPE. Bayesian DLMs provide information about annual changes in abundance that can inform management and are easily communicated with managers and stakeholders.
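    A minimal DLM of the kind described is the local linear trend model, fitted by the Kalman filter; an annual probability of decline then comes from the filtered distribution of the slope. Everything below (variances, simulated CPE series) is invented for illustration:

```python
import numpy as np
from math import erf, sqrt

# Local linear trend DLM via the Kalman filter, with P(decline) read off the
# filtered Gaussian for the slope.  Variances and data are invented.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # level_t = level + slope; slope walks
H = np.array([[1.0, 0.0]])               # we observe the level plus noise
Q = np.diag([0.05, 0.01])                # state noise (level, slope)
R = np.array([[0.5]])                    # observation noise

rng = np.random.default_rng(2)
true = 20 - 0.6 * np.arange(20)          # declining abundance index
y = true + rng.normal(0, 0.7, size=20)

x = np.array([y[0], 0.0])                # state: [level, slope]
P = np.eye(2) * 10.0
p_decline = []
for obs in y:
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    S = H @ P @ H.T + R
    K = P @ H.T / S
    x = x + (K * (obs - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    # P(slope < 0) under the filtered Gaussian N(x[1], P[1,1])
    p_decline.append(0.5 * (1 + erf((0 - x[1]) / sqrt(2 * P[1, 1]))))
print(p_decline[-1])   # should be near 1 for a declining series
```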

  2. Protein construct storage: Bayesian variable selection and prediction with mixtures.

    PubMed

    Clyde, M A; Parmigiani, G

    1998-07-01

    Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of factors affecting protein storage and to establish optimal storage conditions. Different model-selection strategies to identify important factors may lead to very different answers about optimal conditions. Uncertainty about which factors are important, or model uncertainty, can be a critical issue in decision-making. We use Bayesian variable selection methods for linear models to identify important variables in the protein storage data, while accounting for model uncertainty. We also use the Bayesian framework to build predictions based on a large family of models, rather than an individual model, and to evaluate the probability that certain candidate storage conditions are optimal.
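    Model uncertainty and averaging over a family of models can be sketched with an all-subsets Bayesian model average under Zellner's g-prior, which has a closed-form marginal likelihood. The data and factor interpretation below are simulated, not the protein storage experiment:

```python
import numpy as np
from itertools import combinations

# All-subsets Bayesian model averaging with Zellner's g-prior marginal
# likelihood (a generic sketch; data and factor names are simulated).
rng = np.random.default_rng(4)
n = 60
X = rng.standard_normal((n, 3))               # three candidate factors
y = 1.5 * X[:, 0] + rng.standard_normal(n)    # only factor 0 matters
g = float(n)                                  # unit-information g-prior

def log_marginal(cols):
    # log Bayes factor of the model vs the intercept-only null
    if not cols:
        return 0.0
    k = len(cols)
    Xc = X[:, cols] - X[:, cols].mean(axis=0)
    yc = y - y.mean()
    resid = yc - Xc @ np.linalg.lstsq(Xc, yc, rcond=None)[0]
    R2 = 1 - resid @ resid / (yc @ yc)
    return 0.5 * (n - 1 - k) * np.log(1 + g) \
        - 0.5 * (n - 1) * np.log(1 + g * (1 - R2))

models = [c for r in range(4) for c in combinations(range(3), r)]
logm = np.array([log_marginal(list(c)) for c in models])
probs = np.exp(logm - logm.max()); probs /= probs.sum()  # uniform model prior

# Posterior inclusion probability of each factor, averaged over all models
incl = [sum(p for mdl, p in zip(models, probs) if j in mdl) for j in range(3)]
print(incl)   # factor 0 should be near 1, the others much smaller
```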

  3. Bayesian assessment of the expected data impact on prediction confidence in optimal sampling design

    NASA Astrophysics Data System (ADS)

    Leube, P. C.; Geiges, A.; Nowak, W.

    2012-02-01

    Incorporating hydro(geo)logical data, such as head and tracer data, into stochastic models of (subsurface) flow and transport helps to reduce prediction uncertainty. Because of financial limitations for investigation campaigns, information needs toward modeling or prediction goals should be satisfied efficiently and rationally. Optimal design techniques find the best one among a set of investigation strategies. They optimize the expected impact of data on prediction confidence or related objectives prior to data collection. We introduce a new optimal design method, called PreDIA(gnosis) (Preposterior Data Impact Assessor). PreDIA derives the relevant probability distributions and measures of data utility within a fully Bayesian, generalized, flexible, and accurate framework. It extends the bootstrap filter (BF) and related frameworks to optimal design by marginalizing utility measures over the yet unknown data values. PreDIA is a strictly formal information-processing scheme free of linearizations. It works with arbitrary simulation tools, provides full flexibility concerning measurement types (linear, nonlinear, direct, indirect), allows for any desired task-driven formulations, and can account for various sources of uncertainty (e.g., heterogeneity, geostatistical assumptions, boundary conditions, measurement values, model structure uncertainty, a large class of model errors) via Bayesian geostatistics and model averaging. Existing methods fail to simultaneously provide these crucial advantages, which our method buys at relatively higher computational cost. We demonstrate the applicability and advantages of PreDIA over conventional linearized methods in a synthetic example of subsurface transport. In the example, we show that informative data is often invisible to linearized methods that confuse zero correlation with statistical independence. Hence, PreDIA will often lead to substantially better sampling designs.
Finally, we extend our example to specifically highlight the consideration of conceptual model uncertainty.
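    The core idea of marginalizing a data-utility measure over yet-unknown data values can be illustrated with a small Monte Carlo sketch. This is not the PreDIA implementation: the parameter, prediction, and measurement operators below are hypothetical stand-ins, and the bootstrap-filter weighting is reduced to its simplest Gaussian-likelihood form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Prior ensemble of an uncertain parameter (hypothetical toy problem).
theta = rng.normal(0.0, 1.0, size=2000)

def predict(th):       # prediction goal driven by the parameter
    return th ** 2

def measure(th):       # candidate measurement operator
    return 2.0 * th

def expected_conditional_variance(sigma, n_data=200):
    """Average the posterior variance of the prediction over simulated,
    not-yet-collected data sets (the 'preposterior' marginalization)."""
    z = predict(theta)
    h = measure(theta)
    variances = []
    for th_true in rng.choice(theta, size=n_data):
        y = measure(th_true) + rng.normal(0.0, sigma)   # one possible future datum
        w = np.exp(-0.5 * ((y - h) / sigma) ** 2)       # bootstrap-filter weights
        w /= w.sum()
        mean = np.sum(w * z)
        variances.append(np.sum(w * (z - mean) ** 2))
    return float(np.mean(variances))

# A precise sensor buys more expected uncertainty reduction than a noisy one.
ecv_precise = expected_conditional_variance(sigma=0.2)
ecv_noisy = expected_conditional_variance(sigma=2.0)
```

    Comparing candidate sensor noise levels this way mimics an optimal design decision: the design with the lower expected conditional variance would be preferred, before any data are actually collected.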

  4. Bayesian evidence computation for model selection in non-linear geoacoustic inference problems.

    PubMed

    Dettmer, Jan; Dosso, Stan E; Osler, John C

    2010-12-01

    This paper applies a general Bayesian inference approach, based on Bayesian evidence computation, to geoacoustic inversion of interface-wave dispersion data. Quantitative model selection is carried out by computing the evidence (normalizing constants) for several model parameterizations using annealed importance sampling. The resulting posterior probability density estimate is compared to estimates obtained from Metropolis-Hastings sampling to ensure consistent results. The approach is applied to invert interface-wave dispersion data collected on the Scotian Shelf, off the east coast of Canada for the sediment shear-wave velocity profile. Results are consistent with previous work on these data but extend the analysis to a rigorous approach including model selection and uncertainty analysis. The results are also consistent with core samples and seismic reflection measurements carried out in the area.
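    Evidence computation by annealed importance sampling can be sketched in a few lines on a toy 1-D problem with a Gaussian prior and an unnormalized Gaussian "likelihood", where the evidence is available in closed form (log Z is about -1.53 for the numbers below). This is only a schematic of the technique, not the geoacoustic inversion itself.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy problem: prior N(0, 2^2), unnormalized Gaussian likelihood centred at 1.
def log_lik(x):
    return -0.5 * ((x - 1.0) / 0.5) ** 2

betas = np.linspace(0.0, 1.0, 80)     # annealing (inverse temperature) schedule
n = 5000
x = rng.normal(0.0, 2.0, size=n)      # particles drawn from the prior
logw = np.zeros(n)

for b_prev, b in zip(betas[:-1], betas[1:]):
    logw += (b - b_prev) * log_lik(x)         # importance-weight increment

    # One Metropolis move per particle, targeting the tempered distribution.
    def log_tempered(z):
        return -0.5 * (z / 2.0) ** 2 + b * log_lik(z)

    prop = x + rng.normal(0.0, 0.5, size=n)
    accept = np.log(rng.random(n)) < log_tempered(prop) - log_tempered(x)
    x = np.where(accept, prop, x)

log_z = np.log(np.mean(np.exp(logw)))   # log-evidence (normalizing constant) estimate
```

    Repeating this for competing model parameterizations and comparing the resulting evidences is the model-selection step the abstract describes.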

  5. Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction

    PubMed Central

    Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.

    2010-01-01

    We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer’s disease (AD) classification to make accurate predictions and, at the same time, identify critical imaging markers relevant to AD. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method that provides sparse models that are easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. A comparative study with the support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. An additional comparison with surface-based general linear model (GLM) analysis shows that the regions with the strongest signals are identified by both GLM and ARD/PARD. While the GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures. PMID:20879451
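    A minimal sketch of classic ARD for Bayesian linear regression (the MacKay/Tipping fixed-point updates) shows how per-feature precision hyperparameters prune irrelevant predictors. The data and hyperparameter settings are hypothetical and far simpler than the imaging application.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: only the first of three features is truly relevant.
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([2.0, 0.0, 0.0])
y = X @ w_true + rng.normal(scale=0.1, size=n)

alpha = np.ones(d)   # per-feature precision ("relevance") hyperparameters
beta = 100.0         # noise precision, assumed known for simplicity

for _ in range(50):
    # Posterior over weights given current relevances (Bayesian linear regression).
    S = np.linalg.inv(np.diag(alpha) + beta * X.T @ X)
    m = beta * S @ X.T @ y
    # MacKay/Tipping fixed-point update; capped for numerical stability.
    gamma = 1.0 - alpha * np.diag(S)
    alpha = np.minimum(gamma / m ** 2, 1e6)

# Relevant features keep a small alpha; irrelevant ones are driven to much
# larger alpha (heavy shrinkage), i.e. effectively pruned from the model.
```

    The surviving small-alpha features play the role of the "critical imaging markers" in the abstract: sparsity and interpretability come from the same mechanism.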

  6. Bayesian Inference for Generalized Linear Models for Spiking Neurons

    PubMed Central

    Gerwinn, Sebastian; Macke, Jakob H.; Bethge, Matthias

    2010-01-01

    Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate. PMID:20577627
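    The Gaussian posterior approximation described above can be illustrated with a simpler cousin of Expectation Propagation, the Laplace approximation: find the MAP of a logistic GLM by Newton's method and take the negative inverse Hessian as the posterior covariance. The toy data and Gaussian prior below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical binary responses from a logistic GLM.
n, d = 300, 2
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ w_true)))).astype(float)

tau = 1.0            # precision of the Gaussian prior on the weights

# MAP estimate by Newton's method on the concave log posterior.
w = np.zeros(d)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-(X @ w)))
    grad = X.T @ (y - mu) - tau * w
    H = -(X.T * (mu * (1.0 - mu))) @ X - tau * np.eye(d)   # Hessian
    w = w - np.linalg.solve(H, grad)

# Laplace approximation: Gaussian centred at the MAP, covariance -H^{-1}.
cov = np.linalg.inv(-H)
std = np.sqrt(np.diag(cov))   # Bayesian error bars on each weight
```

    For a Gaussian approximation the mean and the mode coincide, so the MAP/posterior-mean distinction studied in the paper only becomes visible with the skewed exact posterior or a more faithful approximation such as EP.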

  7. Revisiting Isotherm Analyses Using R: Comparison of Linear, Non-linear, and Bayesian Techniques

    EPA Science Inventory

    Extensive adsorption isotherm data exist for an array of chemicals of concern on a variety of engineered and natural sorbents. Several isotherm models exist that can accurately describe these data from which the resultant fitting parameters may subsequently be used in numerical ...

  8. Spatio-temporal Bayesian model selection for disease mapping

    PubMed Central

    Carroll, R; Lawson, AB; Faes, C; Kirby, RS; Aregay, M; Watjou, K

    2016-01-01

    Spatio-temporal analysis of small area health data often involves choosing a fixed set of predictors prior to the final model fit. In this paper, we propose a spatio-temporal approach of Bayesian model selection to implement model selection for certain areas of the study region as well as certain years in the study time line. Here, we examine the usefulness of this approach by way of a large-scale simulation study accompanied by a case study. Our results suggest that a special case of the model selection methods, a mixture model allowing a weight parameter to indicate if the appropriate linear predictor is spatial, spatio-temporal, or a mixture of the two, offers the best option for fitting these spatio-temporal models. In addition, the case study illustrates the effectiveness of this mixture model within the model selection setting by easily accommodating lifestyle, socio-economic, and physical environmental variables to select a predominantly spatio-temporal linear predictor. PMID:28070156

  9. Bayesian Travel Time Inversion adopting Gaussian Process Regression

    NASA Astrophysics Data System (ADS)

    Mauerberger, S.; Holschneider, M.

    2017-12-01

    A major application in seismology is the determination of seismic velocity models. Travel time measurements place an integral constraint on the velocity between source and receiver. We provide insight into travel time inversion from a correlation-based Bayesian point of view. To this end, the concept of Gaussian process regression is adopted to estimate a velocity model. The non-linear travel time integral is approximated by a 1st order Taylor expansion. A heuristic covariance describes correlations amongst observations and the a priori model. That approach enables us to assess a proxy of the Bayesian posterior distribution at ordinary computational costs. Neither multi-dimensional numerical integration nor excessive sampling is necessary. Instead of stacking the data, we suggest progressively building the posterior distribution. Incorporating only a single evidence at a time accounts for the deficit of linearization. As a result, the most probable model is given by the posterior mean, whereas uncertainties are described by the posterior covariance. As a proof of concept, a synthetic, purely 1-d model is addressed. A single source accompanied by multiple receivers is considered on top of a model comprising a discontinuity. We consider travel times of both phases - direct and reflected wave - corrupted by noise. The regions left and right of the interface are assumed independent, with the squared exponential kernel serving as covariance.
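    The sequential Gaussian updating described above amounts to conditioning a GP prior on one linear travel-time functional at a time. A minimal 1-D sketch follows (hypothetical grid, kernel, and ray geometry; the travel-time integral of slowness is already linear here, so no Taylor step is needed).

```python
import numpy as np

# 1-D slowness model on a grid with a squared-exponential GP prior
# (all numbers hypothetical).
x = np.linspace(0.0, 10.0, 50)
dx = x[1] - x[0]
mean = np.full_like(x, 0.5)   # prior mean slowness (s/km)
K = 0.01 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 2.0 ** 2)

def assimilate(mean, K, g, t_obs, sigma=1e-3):
    """Condition the Gaussian on one linear travel-time datum g @ m = t_obs."""
    s = g @ K @ g + sigma ** 2
    gain = K @ g / s
    mean = mean + gain * (t_obs - g @ mean)
    K = K - np.outer(gain, g @ K)
    return mean, K

# Travel time to a receiver at x = 5 is the integral of slowness along the
# ray, i.e. a linear functional of the model.
g = (x <= 5.0).astype(float) * dx
true_slowness = np.where(x <= 5.0, 0.6, 0.4)
mean, K = assimilate(mean, K, g, g @ true_slowness)
# The posterior mean rises toward 0.6 where the ray samples the medium and
# the posterior variance shrinks there; further data would be folded in one
# datum at a time in the same way.
```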

  10. Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

    PubMed

    Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara

    2017-01-01

    In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish a Bayesian joint model that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
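    The asymmetric "check" loss that underlies quantile regression is easy to state. A crude sketch fits a median regression line by grid search on a toy data set with heavy-tailed noise; the paper's Bayesian asymmetric-Laplace machinery is replaced here by direct loss minimization.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy data with heavy-tailed noise, where mean regression struggles.
x = rng.uniform(0.0, 10.0, size=800)
y = 2.0 + 0.5 * x + rng.standard_t(df=3, size=800)

def check_loss(u, tau):
    """The asymmetric 'check' loss underlying quantile regression."""
    return np.where(u >= 0.0, tau * u, (tau - 1.0) * u)

def fit_quantile_line(tau):
    """Grid search over the slope; for a fixed slope the optimal intercept
    is the tau-quantile of the residuals."""
    best, best_loss = None, np.inf
    for b1 in np.linspace(0.0, 1.0, 201):
        b0 = np.quantile(y - b1 * x, tau)
        loss = check_loss(y - b0 - b1 * x, tau).sum()
        if loss < best_loss:
            best, best_loss = (b0, b1), loss
    return best

b0_med, b1_med = fit_quantile_line(tau=0.5)   # median regression
# Other tau values trace out how the covariate effect varies across
# quantiles of the response.
```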

  11. Hierarchical Bayesian sparse image reconstruction with application to MRFM.

    PubMed

    Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves

    2009-09-01

    This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.

  12. Simultaneous Optimization of Decisions Using a Linear Utility Function.

    ERIC Educational Resources Information Center

    Vos, Hans J.

    1990-01-01

    An approach is presented to simultaneously optimize decision rules for combinations of elementary decisions through a framework derived from Bayesian decision theory. The developed linear utility model for selection-mastery decisions was applied to a sample of 43 first year medical students to illustrate the procedure. (SLD)

  13. Comparing Families of Dynamic Causal Models

    PubMed Central

    Penny, Will D.; Stephan, Klaas E.; Daunizeau, Jean; Rosa, Maria J.; Friston, Karl J.; Schofield, Thomas M.; Leff, Alex P.

    2010-01-01

    Mathematical models of scientific data can be formally compared using Bayesian model evidence. Previous applications in the biological sciences have mainly focussed on model selection in which one first selects the model with the highest evidence and then makes inferences based on the parameters of that model. This “best model” approach is very useful but can become brittle if there are a large number of models to compare, and if different subjects use different models. To overcome this shortcoming we propose the combination of two further approaches: (i) family level inference and (ii) Bayesian model averaging within families. Family level inference removes uncertainty about aspects of model structure other than the characteristic of interest. For example: What are the inputs to the system? Is processing serial or parallel? Is it linear or nonlinear? Is it mediated by a single, crucial connection? We apply Bayesian model averaging within families to provide inferences about parameters that are independent of further assumptions about model structure. We illustrate the methods using Dynamic Causal Models of brain imaging data. PMID:20300649
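    Family-level inference and within-family model averaging reduce to simple operations on model evidences. A sketch with made-up log-evidences and parameter estimates for four models in two families:

```python
import numpy as np

# Hypothetical log-evidences for four models grouped into two families
# (say, linear vs. nonlinear), plus each model's posterior mean of some
# shared parameter of interest.
log_evidence = np.array([-10.2, -11.0, -14.5, -15.1])
family = np.array([0, 0, 1, 1])             # model -> family assignment
param_mean = np.array([0.8, 1.1, 2.0, 2.4])

# Posterior model probabilities under a flat prior over models.
p_model = np.exp(log_evidence - log_evidence.max())
p_model /= p_model.sum()

# Family-level inference: sum model probabilities within each family.
p_family = np.array([p_model[family == f].sum() for f in (0, 1)])

# Bayesian model averaging within the winning family: parameter inference
# no longer hinges on one particular model structure.
f_best = int(np.argmax(p_family))
w = p_model[family == f_best] / p_family[f_best]
param_bma = float(w @ param_mean[family == f_best])
```

    Summing within families marginalizes out structural details that are irrelevant to the question at hand, which is what makes the approach robust when many similar models compete.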

  14. Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

    NASA Astrophysics Data System (ADS)

    Yang, J.; Astitha, M.; Schwartz, C. S.

    2017-12-01

    Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over the northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for the GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performance of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of the GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real time.
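    The core of BLR post-processing is a conjugate Bayesian linear regression fitted to observation-forecast pairs and then applied to new raw forecasts. A minimal, non-gridded sketch with synthetic pairs follows; the bias and noise figures are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Training pairs: raw model forecasts vs. observations (hypothetical storms).
raw = rng.uniform(5.0, 25.0, size=120)                     # e.g. wind speed (m/s)
obs = 1.8 + 0.85 * raw + rng.normal(scale=1.0, size=120)   # raw model is biased

# Conjugate Bayesian linear regression obs = b0 + b1*raw with a weak prior.
X = np.column_stack([np.ones_like(raw), raw])
prior_prec = 1e-4 * np.eye(2)   # weak prior: coefficients ~ N(0, 100^2 I)
noise_prec = 1.0                # assumed known error precision
S = np.linalg.inv(prior_prec + noise_prec * X.T @ X)
coef = noise_prec * S @ X.T @ obs   # posterior mean of [intercept, slope]

def correct(raw_forecast):
    """Post-process a raw forecast with the posterior-mean coefficients."""
    return coef[0] + coef[1] * raw_forecast

# Corrected forecasts remove the systematic bias of the raw model.
```

    In the gridded version, coefficients like these are estimated at station locations and interpolated back to the model grid.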

  15. Bayesian modelling of the emission spectrum of the Joint European Torus Lithium Beam Emission Spectroscopy system.

    PubMed

    Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C

    2016-02-01

    A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of the Li line without doing a separate background subtraction through modulation of the Li beam.

  16. Bayesian Group Bridge for Bi-level Variable Selection.

    PubMed

    Mallick, Himel; Yi, Nengjun

    2017-06-01

    A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.

  17. Bayesian reconstruction of projection reconstruction NMR (PR-NMR).

    PubMed

    Yoon, Ji Won

    2014-11-01

    Projection reconstruction nuclear magnetic resonance (PR-NMR) is a technique for generating multidimensional NMR spectra. A small number of projections from lower-dimensional NMR spectra are used to reconstruct the multidimensional NMR spectra. In our previous work, it was shown that multidimensional NMR spectra are efficiently reconstructed using a peak-by-peak reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. We propose an extended and generalized RJMCMC algorithm that replaces the simple linear model with a linear mixed model to reconstruct closely spaced NMR spectra into true spectra. This statistical method generates samples in a Bayesian scheme. Our proposed algorithm is tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of the protein HasA.

  18. Model-based Bayesian inference for ROC data analysis

    NASA Astrophysics Data System (ADS)

    Lei, Tianhu; Bae, K. Ty

    2013-03-01

    This paper presents a study of model-based Bayesian inference applied to Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a two-valued (zero-one) covariate to express binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by the Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach considers model parameters as random variables characterized by prior distributions. With a substantial number of simulated samples generated by the sampling algorithm, the posterior distributions of the parameters, as well as the parameters themselves, can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied, and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed with the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.

  19. Pathway analysis of high-throughput biological data within a Bayesian network framework.

    PubMed

    Isci, Senol; Ozturk, Cengizhan; Jones, Jon; Otu, Hasan H

    2011-06-15

    Most current approaches to high-throughput biological data (HTBD) analysis perform either individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring the fitness of each network. The proposed method takes into account the connectivity and relatedness between nodes of the pathway by factoring pathway topology into its model. Our simulations using synthetic data demonstrated the robustness of our approach. We tested the proposed method, Bayesian Pathway Analysis (BPA), on human microarray data regarding renal cell carcinoma (RCC) and compared our results with gene set enrichment analysis. BPA was able to find broader and more specific pathways related to RCC. The accompanying BPA software (BPAS) package is freely available for academic use at http://bumil.boun.edu.tr/bpa.

  20. A Bayesian Semiparametric Latent Variable Model for Mixed Responses

    ERIC Educational Resources Information Center

    Fahrmeir, Ludwig; Raach, Alexander

    2007-01-01

    In this paper we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric Gaussian regression model. We extend existing LVMs with the usual linear covariate effects by including nonparametric components for nonlinear…

  1. Rigorous Approach in Investigation of Seismic Structure and Source Characteristics in Northeast Asia: Hierarchical and Trans-dimensional Bayesian Inversion

    NASA Astrophysics Data System (ADS)

    Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.

    2015-12-01

    Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.

  2. A flexible Bayesian assessment for the expected impact of data on prediction confidence for optimal sampling designs

    NASA Astrophysics Data System (ADS)

    Leube, Philipp; Geiges, Andreas; Nowak, Wolfgang

    2010-05-01

    Incorporating hydrogeological data, such as head and tracer data, into stochastic models of subsurface flow and transport helps to reduce prediction uncertainty. Considering the limited financial resources available for a data acquisition campaign, information needs towards the prediction goal should be satisfied in an efficient and task-specific manner. To find the best among a set of design candidates, an objective function is commonly evaluated that measures the expected impact of data on prediction confidence, prior to their collection. An appropriate approach to this task should be stochastically rigorous, master non-linear dependencies between data, parameters and model predictions, and allow for a wide variety of different data types. Existing methods fail to fulfill all these requirements simultaneously. For this reason, we introduce a new method, denoted as CLUE (Cross-bred Likelihood Uncertainty Estimator), that derives the essential distributions and measures of data utility within a generalized, flexible and accurate framework. The method makes use of Bayesian GLUE (Generalized Likelihood Uncertainty Estimator) and extends it to an optimal design method by marginalizing over the yet unknown data values. Operating in a purely Bayesian Monte-Carlo framework, CLUE is a strictly formal information processing scheme free of linearizations. It provides full flexibility with respect to the type of measurements (linear, non-linear, direct, indirect) and accounts for almost arbitrary sources of uncertainty (e.g. heterogeneity, geostatistical assumptions, boundary conditions, model concepts) via stochastic simulation and Bayesian model averaging. This helps to minimize the strength and impact of subjective prior assumptions that would be hard to defend prior to data collection. Our study focuses on evaluating two different uncertainty measures: (i) the expected conditional variance and (ii) the expected relative entropy of a given prediction goal.
The applicability and advantages are shown in a synthetic example. To this end, we consider a contaminant source posing a threat to a drinking water well in an aquifer. Furthermore, we assume uncertainty in geostatistical parameters, boundary conditions and the hydraulic gradient. The two measures evaluate the sensitivity of (1) the general prediction confidence and (2) the exceedance probability of a legal regulatory threshold value to the sampling locations.

  3. Commensurate Priors for Incorporating Historical Information in Clinical Trials Using General and Generalized Linear Models

    PubMed Central

    Hobbs, Brian P.; Sargent, Daniel J.; Carlin, Bradley P.

    2014-01-01

    Assessing between-study variability in the context of conventional random-effects meta-analysis is notoriously difficult when incorporating data from only a small number of historical studies. In order to borrow strength, historical and current data are often assumed to be fully homogeneous, but this can have drastic consequences for power and Type I error if the historical information is biased. In this paper, we propose empirical and fully Bayesian modifications of the commensurate prior model (Hobbs et al., 2011) extending Pocock (1976), and evaluate their frequentist and Bayesian properties for incorporating patient-level historical data using general and generalized linear mixed regression models. Our proposed commensurate prior models lead to preposterior admissible estimators that facilitate alternative bias-variance trade-offs than those offered by pre-existing methodologies for incorporating historical data from a small number of historical studies. We also provide a sample analysis of a colon cancer trial comparing time-to-disease progression using a Weibull regression model. PMID:24795786

  4. Bayesian Model Comparison for the Order Restricted RC Association Model

    ERIC Educational Resources Information Center

    Iliopoulos, G.; Kateri, M.; Ntzoufras, I.

    2009-01-01

    Association models constitute an attractive alternative to the usual log-linear models for modeling the dependence between classification variables. They impose special structure on the underlying association by assigning scores on the levels of each classification variable, which can be fixed or parametric. Under the general row-column (RC)…

  5. A Bayesian approach to earthquake source studies

    NASA Astrophysics Data System (ADS)

    Minson, Sarah

    Bayesian sampling has several advantages over conventional optimization approaches to solving inverse problems. It produces the distribution of all possible models sampled proportionally to how much each model is consistent with the data and the specified prior information, and thus images the entire solution space, revealing the uncertainties and trade-offs in the model. Bayesian sampling is applicable to both linear and non-linear modeling, and the values of the model parameters being sampled can be constrained based on the physics of the process being studied and do not have to be regularized. However, these methods are computationally challenging for high-dimensional problems. Until now the computational expense of Bayesian sampling has been too great for it to be practicable for most geophysical problems. I present a new parallel sampling algorithm called CATMIP for Cascading Adaptive Tempered Metropolis In Parallel. This technique, based on Transitional Markov chain Monte Carlo, makes it possible to sample distributions in many hundreds of dimensions, if the forward model is fast, or to sample computationally expensive forward models in smaller numbers of dimensions. The design of the algorithm is independent of the model being sampled, so CATMIP can be applied to many areas of research. I use CATMIP to produce a finite fault source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. Surface displacements from the earthquake were recorded by six interferograms and twelve local high-rate GPS stations. Because of the wealth of near-fault data, the source process is well-constrained. I find that the near-field high-rate GPS data have significant resolving power above and beyond the slip distribution determined from static displacements. The location and magnitude of the maximum displacement are resolved. The rupture almost certainly propagated at sub-shear velocities. 
The full posterior distribution can be used not only to calculate source parameters but also to determine their uncertainties. So while kinematic source modeling and the estimation of source parameters is not new, with CATMIP I am able to use Bayesian sampling to determine which parts of the source process are well-constrained and which are not.

  6. Data-driven Modelling for decision making under uncertainty

    NASA Astrophysics Data System (ADS)

    Angria S, Layla; Dwi Sari, Yunita; Zarlis, Muhammad; Tulus

    2018-01-01

    Decision making under uncertainty has become a much-discussed topic in operations research. Many models have been presented, one of which is data-driven modelling (DDM). The purpose of this paper is to extract and recognize patterns in data, and to find the best model for decision-making problems under uncertainty using a data-driven modelling approach with linear programming, linear and non-linear differential equations, and Bayesian approaches. Model criteria are tested to determine the smallest error, and the model with the smallest error is taken as the best model to use.

  7. Inferring the most probable maps of underground utilities using Bayesian mapping model

    NASA Astrophysics Data System (ADS)

    Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony

    2018-03-01

    Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing the social, environmental and economic consequences of the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of the MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. In this paper, an approach for the automated creation of revised maps is developed as a Bayesian mapping model that integrates the knowledge extracted from the sensors' raw data with the available statutory records. Statutory records are combined with the hypotheses from the sensors to form an initial estimate of what might be found underground and roughly where. The maps are (re)constructed using automated image segmentation techniques for hypothesis extraction and Bayesian classification techniques for segment-manhole connections. The model, consisting of an image segmentation algorithm and various Bayesian classification techniques (segment recognition and the expectation maximization (EM) algorithm), provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.

  8. Application of Bayesian Maximum Entropy Filter in parameter calibration of groundwater flow model in PingTung Plain

    NASA Astrophysics Data System (ADS)

    Cheung, Shao-Yong; Lee, Chieh-Han; Yu, Hwa-Lung

    2017-04-01

    Due to the limited hydrogeological observation data and the high level of uncertainty within them, parameter estimation for groundwater models has been an important issue. There are many methods of parameter estimation; for example, the Kalman filter provides real-time calibration of parameters through measurements at groundwater monitoring wells, and related methods such as the Extended Kalman Filter and the Ensemble Kalman Filter are widely applied in groundwater research. However, Kalman filter methods are limited to linear systems. This study proposes a novel method, Bayesian Maximum Entropy Filtering, which accounts for the uncertainty of the data in parameter estimation. With this method, parameters can be estimated from hard data (certain) and soft data (uncertain) at the same time. We couple Python and QGIS with the groundwater model (MODFLOW), and implement both the Extended Kalman Filter and Bayesian Maximum Entropy Filtering in Python for parameter estimation. The proposed method retains a conventional filtering framework while also accounting for the uncertainty of the data. The study was conducted as a numerical experiment combining the Bayesian Maximum Entropy filter with a hypothetical MODFLOW groundwater model, in which virtual observation wells observed the simulated groundwater system periodically. The results showed that, by accounting for the uncertainty of the data, the Bayesian Maximum Entropy filter provides good real-time parameter estimates.
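
    As an aside on the linearity restriction mentioned above, the standard linear Kalman measurement update can be sketched in a few lines. The matrices below are purely illustrative (a two-parameter state observed through one hypothetical monitoring well), not the PingTung model:

```python
import numpy as np

def kalman_update(x, P, y, H, R):
    """One linear Kalman measurement update.

    x, P : prior state mean and covariance; y : observation;
    H : observation matrix; R : observation-noise covariance.
    The update is exact only because the observation model is linear,
    which is the limitation noted in the abstract.
    """
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_post = x + K @ (y - H @ x)            # updated mean
    P_post = (np.eye(len(x)) - K @ H) @ P   # updated covariance
    return x_post, P_post

# Toy setup: two unknown parameters observed through one hypothetical
# monitoring well that measures their sum.
x0, P0 = np.zeros(2), np.eye(2)
H = np.array([[1.0, 1.0]])
R = np.array([[0.1]])
x1, P1 = kalman_update(x0, P0, np.array([2.0]), H, R)
```

Because the well observes only the sum, the update spreads the innovation equally over both parameters while reducing the total posterior variance.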

  9. A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.

    PubMed

    Houseman, E Andres; Virji, M Abbas

    2017-08-01

    Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulation studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. 
Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
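
    The left-tail integration described above can be illustrated with a minimal Gaussian likelihood sketch. This is a simplified stand-in with hypothetical values (no autocorrelation or hierarchy), not the authors' rjags model:

```python
import numpy as np
from scipy.stats import norm

def censored_loglik(y, censored, mu, sigma, lod):
    """Gaussian log-likelihood with left-censoring at the LOD.

    Detected values contribute a density term; censored values contribute
    the probability of falling below the LOD, i.e. the integral over the
    left tail: P(Y < lod) = Phi((lod - mu) / sigma).
    """
    y = np.asarray(y, float)
    censored = np.asarray(censored, bool)
    ll_obs = norm.logpdf(y[~censored], mu, sigma).sum()
    ll_cens = norm.logcdf(lod, mu, sigma) * censored.sum()
    return ll_obs + ll_cens

# Toy data: three detects and two non-detects below an LOD of 1.0.
ll = censored_loglik([1.4, 2.0, 1.7, np.nan, np.nan],
                     [False, False, False, True, True],
                     mu=1.5, sigma=0.5, lod=1.0)
```

A Bayesian sampler would evaluate this likelihood at each proposed (mu, sigma) rather than imputing a fixed value such as LOD/2 for non-detects.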

  10. Discussion of “Bayesian design of experiments for industrial and scientific applications via Gaussian processes”

    DOE PAGES

    Anderson-Cook, Christine M.; Burke, Sarah E.

    2016-10-18

    First, we would like to commend Dr. Woods on his thought-provoking paper and insightful presentation at the 4th Annual Stu Hunter conference. We think that the material presented highlights some important needs in the area of design of experiments for generalized linear models (GLMs). In addition, we agree with Dr. Woods that design of experiments for GLMs does implicitly require expert judgement about model parameters, and hence using a Bayesian approach to capture this knowledge is a natural strategy to summarize what is known, with the opportunity to incorporate the associated uncertainty about that information.

  11. Discussion of “Bayesian design of experiments for industrial and scientific applications via Gaussian processes”

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson-Cook, Christine M.; Burke, Sarah E.

    First, we would like to commend Dr. Woods on his thought-provoking paper and insightful presentation at the 4th Annual Stu Hunter conference. We think that the material presented highlights some important needs in the area of design of experiments for generalized linear models (GLMs). In addition, we agree with Dr. Woods that design of experiments for GLMs does implicitly require expert judgement about model parameters, and hence using a Bayesian approach to capture this knowledge is a natural strategy to summarize what is known, with the opportunity to incorporate the associated uncertainty about that information.

  12. Bayesian Analysis of the Association between Family-Level Factors and Siblings' Dental Caries.

    PubMed

    Wen, A; Weyant, R J; McNeil, D W; Crout, R J; Neiswanger, K; Marazita, M L; Foxman, B

    2017-07-01

    We conducted a Bayesian analysis of the association between family-level socioeconomic status and smoking and the prevalence of dental caries among siblings (children from infancy to 14 years) living in rural and urban Northern Appalachia, using data from the Center for Oral Health Research in Appalachia (COHRA). The observed proportion of siblings sharing caries was significantly different from that predicted assuming siblings' caries status was independent. Using a Bayesian hierarchical model, we found the inclusion of a household factor significantly improved the goodness of fit. Other findings showed an inverse association between parental education and siblings' caries and a positive association between households with smokers and siblings' caries. Our study strengthens existing evidence suggesting that increased parental education and decreased parental cigarette smoking are associated with reduced childhood caries in the household. Our results also demonstrate the value of a Bayesian approach, which allows us to include household as a random effect, thereby providing more accurate estimates than obtained using generalized linear mixed models.

  13. Almost but not quite 2D, Non-linear Bayesian Inversion of CSEM Data

    NASA Astrophysics Data System (ADS)

    Ray, A.; Key, K.; Bodin, T.

    2013-12-01

    The geophysical inverse problem can be elegantly stated in a Bayesian framework where a probability distribution can be viewed as a statement of information regarding a random variable. After all, the goal of geophysical inversion is to provide information on the random variables of interest - physical properties of the earth's subsurface. However, though it may be simple to postulate, a practical difficulty of fully non-linear Bayesian inversion is the computer time required to adequately sample the model space and extract the information we seek. As a consequence, in geophysical problems where evaluation of a full 2D/3D forward model is computationally expensive, such as marine controlled source electromagnetic (CSEM) mapping of the resistivity of seafloor oil and gas reservoirs, Bayesian studies have largely been conducted with 1D forward models. While the 1D approximation is indeed appropriate for exploration targets with planar geometry and geological stratification, it only provides a limited, site-specific idea of uncertainty in resistivity with depth. In this work, we extend our fully non-linear 1D Bayesian inversion to a 2D model framework, without requiring the usual regularization of model resistivities in the horizontal or vertical directions used to stabilize quasi-2D inversions. In our approach, we use the reversible jump Markov-chain Monte-Carlo (RJ-MCMC) or trans-dimensional method and parameterize the subsurface in a 2D plane with Voronoi cells. The method is trans-dimensional in that the number of cells required to parameterize the subsurface is variable, and the cells dynamically move around and multiply or combine as demanded by the data being inverted. This approach allows us to expand our uncertainty analysis of resistivity at depth to more than a single site location, allowing for interactions between model resistivities at different horizontal locations along a traverse over an exploration target. 
While the model is parameterized in 2D, we efficiently evaluate the forward response using 1D profiles extracted from the model at the common-midpoints of the EM source-receiver pairs. Since the 1D approximation is locally valid at different midpoint locations, the computation time is far lower than is required by a full 2D or 3D simulation. We have applied this method to both synthetic and real CSEM survey data from the Scarborough gas field on the Northwest shelf of Australia, resulting in a spatially variable quantification of resistivity and its uncertainty in 2D. This Bayesian approach results in a large database of 2D models that comprise a posterior probability distribution, which we can subset to test various hypotheses about the range of model structures compatible with the data. For example, we can subset the model distributions to examine the hypothesis that a resistive reservoir extends over a certain spatial extent. Depending on how this conditions other parts of the model space, light can be shed on the geological viability of the hypothesis. Since tackling spatially variable uncertainty and trade-offs in 2D and 3D is a challenging research problem, the insights gained from this work may prove valuable for subsequent full 2D and 3D Bayesian inversions.

  14. Skew-t partially linear mixed-effects models for AIDS clinical studies.

    PubMed

    Lu, Tao

    2016-01-01

    We propose partially linear mixed-effects models with asymmetry and missingness to investigate the relationship between two biomarkers in clinical studies. The proposed models take into account irregular time effects commonly observed in clinical studies under a semiparametric model framework. In addition, the commonly assumed symmetric distributions for model errors are replaced by asymmetric distributions to account for skewness. Further, an informative missing-data mechanism is accounted for. A Bayesian approach is developed to perform parameter estimation simultaneously. The proposed model and method are applied to an AIDS dataset, and comparisons with alternative models are performed.

  15. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. 
This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  16. Non-Linear Modeling of Growth Prerequisites in a Finnish Polytechnic Institution of Higher Education

    ERIC Educational Resources Information Center

    Nokelainen, Petri; Ruohotie, Pekka

    2009-01-01

    Purpose: This study aims to examine the factors of growth-oriented atmosphere in a Finnish polytechnic institution of higher education with categorical exploratory factor analysis, multidimensional scaling and Bayesian unsupervised model-based visualization. Design/methodology/approach: This study was designed to examine employee perceptions of…

  17. Heuristics as Bayesian inference under extreme priors.

    PubMed

    Parpart, Paula; Jones, Matt; Love, Bradley C

    2018-05-01

    Simple heuristics are often regarded as tractable decision strategies because they ignore a great deal of information in the input data. One puzzle is why heuristics can outperform full-information models, such as linear regression, which make full use of the available information. These "less-is-more" effects, in which a relatively simpler model outperforms a more complex model, are prevalent throughout cognitive science, and are frequently argued to demonstrate an inherent advantage of simplifying computation or ignoring information. In contrast, we show at the computational level (where algorithmic restrictions are set aside) that it is never optimal to discard information. Through a formal Bayesian analysis, we prove that popular heuristics, such as tallying and take-the-best, are formally equivalent to Bayesian inference under the limit of infinitely strong priors. Varying the strength of the prior yields a continuum of Bayesian models with the heuristics at one end and ordinary regression at the other. Critically, intermediate models perform better across all our simulations, suggesting that down-weighting information with the appropriate prior is preferable to entirely ignoring it. Rather than because of their simplicity, our analyses suggest heuristics perform well because they implement strong priors that approximate the actual structure of the environment. We end by considering how new heuristics could be derived by infinitely strengthening the priors of other Bayesian models. These formal results have implications for work in psychology, machine learning and economics. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
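
    The prior-strength continuum described above can be illustrated with ridge regression, whose closed-form solution is the posterior mean under a zero-mean Gaussian prior. The data and penalty values below are illustrative, not taken from the paper:

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge solution: the posterior mean under a zero-mean
    Gaussian prior whose strength grows with lam."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def unit(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))           # standardized cues
w_true = np.array([1.0, 0.5, 0.25, 0.1])
y = X @ w_true + 0.1 * rng.standard_normal(100)

w_ols = ridge(X, y, 0.0)        # ordinary regression: weak (flat) prior
w_strong = ridge(X, y, 1e8)     # near-infinite prior strength
corr_weights = X.T @ y          # raw cue-criterion covariances

# In the strong-prior limit, the coefficient direction collapses onto
# corr_weights; a tallying heuristic then keeps only their signs.
```

Intermediate values of the penalty trace out the continuum between ordinary regression and the heuristic limit that the authors analyze.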

  18. TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS

    PubMed Central

    Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.

    2017-01-01

    Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971

  19. Trending in Probability of Collision Measurements via a Bayesian Zero-Inflated Beta Mixed Model

    NASA Technical Reports Server (NTRS)

    Vallejo, Jonathon; Hejduk, Matt; Stamey, James

    2015-01-01

    We investigate the performance of a generalized linear mixed model in predicting the Probabilities of Collision (Pc) for conjunction events. Specifically, we apply this model to the log10 transformation of these probabilities and argue that this transformation yields values that can be considered bounded in practice. Additionally, this bounded random variable, after scaling, is zero-inflated. Consequently, we model these values using the zero-inflated Beta distribution, and utilize the Bayesian paradigm and the mixed model framework to borrow information from past and current events. This provides a natural way to model the data and provides a basis for answering questions of interest, such as what is the likelihood of observing a probability of collision equal to the effective value of zero on a subsequent observation.
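
    A minimal sketch of the zero-inflated Beta likelihood described above, with hypothetical parameter values; the mixed-model and Bayesian machinery of the paper is omitted:

```python
import numpy as np
from scipy.stats import beta as beta_dist

def zib_loglik(x, pi0, a, b):
    """Log-likelihood of a zero-inflated Beta model: with probability pi0
    the value is exactly zero, otherwise it is drawn from Beta(a, b) on
    the open interval (0, 1)."""
    x = np.asarray(x, float)
    zeros = x == 0.0
    ll = np.log(pi0) * zeros.sum()               # point mass at zero
    ll += np.log1p(-pi0) * (~zeros).sum()        # continuous component
    ll += beta_dist.logpdf(x[~zeros], a, b).sum()
    return ll

# Toy data: scaled values in [0, 1) with a point mass at zero.
sample = np.array([0.0, 0.0, 0.12, 0.35, 0.07])
ll = zib_loglik(sample, pi0=0.4, a=1.2, b=5.0)
```

In the paper's hierarchical setting, pi0 and the Beta parameters would themselves be modeled as functions of past conjunction events.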

  20. Detection and recognition of simple spatial forms

    NASA Technical Reports Server (NTRS)

    Watson, A. B.

    1983-01-01

    A model of human visual sensitivity to spatial patterns is constructed. The model predicts the visibility and discriminability of arbitrary two-dimensional monochrome images. The image is analyzed by a large array of linear feature sensors, which differ in spatial frequency, phase, orientation, and position in the visual field. All sensors have one octave frequency bandwidths, and increase in size linearly with eccentricity. Sensor responses are processed by an ideal Bayesian classifier, subject to uncertainty. The performance of the model is compared to that of the human observer in detecting and discriminating some simple images.

  1. Iterative updating of model error for Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew

    2018-02-01

    In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.

  2. Bayesian parameter estimation for nonlinear modelling of biological pathways.

    PubMed

    Ghasemi, Omid; Lindsey, Merry L; Yang, Tianyi; Nguyen, Nguyen; Huang, Yufei; Jin, Yu-Fang

    2011-01-01

    The availability of temporal measurements on biological experiments has significantly promoted research areas in systems biology. To gain insight into the interaction and regulation of biological systems, mathematical frameworks such as ordinary differential equations have been widely applied to model biological pathways and interpret the temporal data. Hill equations are the preferred formats to represent the reaction rate in differential equation frameworks, due to their simple structures and their capabilities for easy fitting to saturated experimental measurements. However, Hill equations are highly nonlinearly parameterized functions, and parameters in these functions cannot be measured easily. Additionally, because of its high nonlinearity, adaptive parameter estimation algorithms developed for linear parameterized differential equations cannot be applied. Therefore, parameter estimation in nonlinearly parameterized differential equation models for biological pathways is both challenging and rewarding. In this study, we propose a Bayesian parameter estimation algorithm to estimate parameters in nonlinear mathematical models for biological pathways using time series data. We used the Runge-Kutta method to transform differential equations to difference equations assuming a known structure of the differential equations. This transformation allowed us to generate predictions dependent on previous states and to apply a Bayesian approach, namely, the Markov chain Monte Carlo (MCMC) method. We applied this approach to the biological pathways involved in the left ventricle (LV) response to myocardial infarction (MI) and verified our algorithm by estimating two parameters in a Hill equation embedded in the nonlinear model. We further evaluated our estimation performance with different parameter settings and signal to noise ratios. Our results demonstrated the effectiveness of the algorithm for both linearly and nonlinearly parameterized dynamic systems. 
Our proposed Bayesian algorithm successfully estimated parameters in nonlinear mathematical models for biological pathways. This method can be further extended to high order systems and thus provides a useful tool to analyze biological dynamics and extract information using temporal data.
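
    The discretization step described above can be sketched for a hypothetical one-state Hill-type pathway. A coarse grid search over the error surface stands in for the MCMC sampler, and the Hill exponent and decay rate are fixed at assumed values:

```python
import numpy as np

def hill_rate(x, u, Vmax, K, n=2.0, d=0.5):
    """Hypothetical one-state pathway: Hill-type production driven by an
    input u, with first-order decay of the state x."""
    return Vmax * u**n / (K**n + u**n) - d * x

def rk4_step(x, u, dt, Vmax, K):
    """Classical Runge-Kutta step: turns the ODE into a difference
    equation x[k+1] = F(x[k], u[k]), so each state is predicted from the
    previous one, as required to form the likelihood."""
    k1 = hill_rate(x, u, Vmax, K)
    k2 = hill_rate(x + 0.5 * dt * k1, u, Vmax, K)
    k3 = hill_rate(x + 0.5 * dt * k2, u, Vmax, K)
    k4 = hill_rate(x + dt * k3, u, Vmax, K)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(x0, us, dt, Vmax, K):
    xs = [x0]
    for u in us:
        xs.append(rk4_step(xs[-1], u, dt, Vmax, K))
    return np.array(xs)

# Synthetic time series with known parameters (Vmax = 2.0, K = 0.8).
rng = np.random.default_rng(1)
us = np.linspace(0.1, 2.0, 50)              # time-varying input
data = simulate(0.0, us, 0.1, 2.0, 0.8) + 0.01 * rng.standard_normal(51)

def sse(Vmax, K):
    """Sum of squared prediction errors; proportional to the Gaussian
    negative log-likelihood that an MCMC sampler would explore."""
    return np.sum((data - simulate(0.0, us, 0.1, Vmax, K)) ** 2)

grid = [(v, k) for v in np.linspace(1.0, 3.0, 21)
        for k in np.linspace(0.4, 1.2, 17)]
Vhat, Khat = min(grid, key=lambda p: sse(*p))
```

Replacing the grid search with a Metropolis-type sampler over (Vmax, K) would yield posterior distributions rather than point estimates.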

  3. Parameter and Structure Inference for Nonlinear Dynamical Systems

    NASA Technical Reports Server (NTRS)

    Morris, Robin D.; Smelyanskiy, Vadim N.; Millonas, Mark

    2006-01-01

    A great many systems can be modeled in the non-linear dynamical systems framework as dx/dt = f(x) + ξ(t), where f(·) is the potential function for the system and ξ is the excitation noise. Modeling the potential using a set of basis functions, we derive the posterior for the basis coefficients. A more challenging problem is to determine the set of basis functions required to model a particular system. We show that, by using the Bayesian Information Criterion (BIC) to rank models together with a beam search technique, we can accurately determine the structure of simple non-linear dynamical system models, and the structure of the coupling between non-linear dynamical systems where the individual systems are known. This last case has important ecological applications.
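
    The BIC-based structure selection described above can be sketched over candidate basis sets for a hypothetical one-dimensional drift; the double-well example and noise level below are assumptions, not the systems studied in the paper:

```python
import numpy as np

def fit_and_bic(Phi, y):
    """Least-squares fit of basis matrix Phi to observed drift values y,
    scored by BIC = n*log(RSS/n) + k*log(n) (Gaussian errors, up to a
    constant)."""
    n, k = Phi.shape
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    rss = np.sum((y - Phi @ coef) ** 2)
    return coef, n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(2)
x = rng.uniform(-2.0, 2.0, 200)
# Assumed true drift: double-well f(x) = x - x**3, observed with noise.
f_obs = x - x**3 + 0.05 * rng.standard_normal(x.size)

candidates = {
    "linear":  np.column_stack([x]),
    "cubic":   np.column_stack([x, x**3]),
    "quintic": np.column_stack([x, x**3, x**5]),
}
results = {name: fit_and_bic(Phi, f_obs) for name, Phi in candidates.items()}
best = min(results, key=lambda name: results[name][1])
```

The BIC penalty discourages the superfluous quintic term, while the underfit linear model is penalized through its large residual sum of squares; a beam search generalizes this ranking to much larger basis libraries.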

  4. Flood quantile estimation at ungauged sites by Bayesian networks

    NASA Astrophysics Data System (ADS)

    Mediero, L.; Santillán, D.; Garrote, L.

    2012-04-01

    Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations of flood magnitudes, but some site and basin characteristics are known. The most common technique used is multiple regression analysis, which relates physical and climatic basin characteristics to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals of the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence; they have been widely and successfully applied in many scientific fields, such as medicine and informatics, but their application to hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as it captures non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows the different sources of estimation uncertainty to be taken into account, as the result is a probability distribution. A homogeneous region in the Tagus Basin was selected as a case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks must be learnt from a sufficiently large data set. 
As observational data are scarce, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of the observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as a basis. The learnt Bayesian network was validated by the reliability diagram, the Brier score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. In summary, flood quantile estimation through Bayesian networks supplies information about prediction uncertainty, as a probability distribution of discharges is given as the result. The Bayesian network model therefore has application as a decision support tool for water resources planning and management.
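
    The probabilistic output described above can be illustrated with a toy discrete network relating basin size and rainfall to a flood-quantile class. All probability tables below are invented for illustration, and inference is by simple enumeration:

```python
import itertools

# Hypothetical discrete network: Area -> Quantile <- Rainfall.
# Each variable is binary (0 = low, 1 = high).
p_rain = {0: 0.7, 1: 0.3}
# P(Quantile = high | Area, Rainfall) -- illustrative numbers only.
p_q_high = {(0, 0): 0.05, (0, 1): 0.40, (1, 0): 0.35, (1, 1): 0.90}

def posterior_quantile(area):
    """P(Quantile | Area = area), marginalizing the unobserved rainfall
    by enumeration over the joint distribution."""
    weights = {0: 0.0, 1: 0.0}
    for rain, q in itertools.product((0, 1), (0, 1)):
        pq = p_q_high[(area, rain)] if q == 1 else 1 - p_q_high[(area, rain)]
        weights[q] += p_rain[rain] * pq
    total = weights[0] + weights[1]
    return {q: w / total for q, w in weights.items()}

dist = posterior_quantile(area=1)   # a distribution, not a point estimate
```

Unlike a regression equation, the query returns the full conditional distribution of the quantile class, which is what supports the uncertainty statements discussed in the abstract.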

  5. On a full Bayesian inference for force reconstruction problems

    NASA Astrophysics Data System (ADS)

    Aucejo, M.; De Smet, O.

    2018-05-01

    In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.

  6. Statistical modelling of networked human-automation performance using working memory capacity.

    PubMed

    Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja

    2014-01-01

    This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.

  7. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models.

    PubMed

    Shah, A A; Xing, W W; Triantafyllidis, V

    2017-04-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.

  8. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models

    PubMed Central

    Xing, W. W.; Triantafyllidis, V.

    2017-01-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327
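    The POD step both versions of this record describe amounts to a truncated SVD of a snapshot matrix whose columns are solutions at different time instants. A minimal sketch with a synthetic two-mode field (not the paper's PDE examples):

```python
import numpy as np

# Hypothetical snapshot matrix: each column is the solution field at one
# time instant, synthesised here from two smooth space-time modes.
x = np.linspace(0, 1, 200)
t = np.linspace(0, 1, 50)
S = np.array([np.sin(np.pi * x) * np.cos(2 * np.pi * tk)
              + 0.3 * np.sin(3 * np.pi * x) * np.sin(4 * np.pi * tk)
              for tk in t]).T                      # shape (200, 50)

# The POD basis is the set of left singular vectors of the snapshot matrix.
U, s, Vt = np.linalg.svd(S, full_matrices=False)

def pod_reconstruct(S, U, r):
    # Project snapshots onto the leading r POD modes and lift back.
    Ur = U[:, :r]
    return Ur @ (Ur.T @ S)

err1 = np.linalg.norm(S - pod_reconstruct(S, U, 1))
err2 = np.linalg.norm(S - pod_reconstruct(S, U, 2))
# This synthetic field has exactly two modes, so r = 2 is essentially exact.
```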

  9. Bayesian uncertainty quantification in linear models for diffusion MRI.

    PubMed

    Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans

    2018-03-29

    Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
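    The key idea of the record, recasting linear least squares as a Bayesian model with a closed-form posterior for any affine quantity, can be sketched with a standard conjugate Gaussian setup (not the paper's specific dMRI signal bases):

```python
import numpy as np

def bayes_linear_posterior(X, y, sigma2, tau2):
    # Closed-form posterior for y = X w + noise, with prior w ~ N(0, tau2*I)
    # and Gaussian noise of variance sigma2 (standard conjugate result).
    d = X.shape[1]
    A = X.T @ X / sigma2 + np.eye(d) / tau2     # posterior precision
    cov = np.linalg.inv(A)
    mean = cov @ X.T @ y / sigma2
    return mean, cov

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

mean, cov = bayes_linear_posterior(X, y, sigma2=0.01, tau2=10.0)

# Any affine quantity q = a @ w then has a Gaussian posterior with
# mean a @ mean and variance a @ cov @ a -- no sampling required.
a = np.array([1.0, 1.0, 0.0])
q_mean = a @ mean
q_var = a @ cov @ a
```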

  10. Evaluation of a neutron spectrum from Bonner spheres measurements using a Bayesian parameter estimation combined with the traditional unfolding methods

    NASA Astrophysics Data System (ADS)

    Mazrou, H.; Bezoubiri, F.

    2018-07-01

    In this work, a new program developed in the MATLAB environment and supported by the Bayesian software WinBUGS was combined with the traditional unfolding codes MAXED and GRAVEL to evaluate a neutron spectrum from Bonner sphere counts measured around a shielded 241AmBe-based neutron irradiator located at the Secondary Standards Dosimetry Laboratory (SSDL) of CRNA. In the first step, the results obtained by the standalone Bayesian program, using a parametric neutron spectrum model based on a linear superposition of three components, namely a thermal Maxwellian distribution, an epithermal (1/E) component, and a Watt fission/evaporation model for the fast component, were compared with those obtained by MAXED and GRAVEL assuming a Monte Carlo default spectrum. By selecting new upper limits for some free parameters of both models, taking into account the physical characteristics of the irradiation source, good agreement was obtained for the investigated integral quantities, i.e., the fluence rate and the ambient dose equivalent rate, compared with the MAXED and GRAVEL results. The difference was generally below 4%, suggesting the reliability of the proposed models. In the second step, the Bayesian results from these calculations were used as initial guess spectra for the traditional unfolding codes MAXED and GRAVEL to derive the solution spectra. Here again the results were in very good agreement, confirming the stability of the Bayesian solution.

  11. Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography.

    PubMed

    Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

    2017-01-01

    Analyzing the timing of muscle activity is a commonly applied method for understanding how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated, while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60-90%. While existing algorithms performed reasonably well, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms performs equally well when the time series has multiple bursts of muscle activity.
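    The changepoint idea behind the winning algorithm class can be illustrated with a single-changepoint posterior for a Gaussian series: enumerate candidate onset locations and weight each by its likelihood. This is a minimal illustration under a flat location prior and known noise level, not the specific algorithms benchmarked in the study:

```python
import numpy as np

def changepoint_posterior(x, sigma=1.0):
    # Posterior over a single mean-shift changepoint location, using the
    # profile likelihood of the two segment means and a flat location prior.
    n = len(x)
    logp = np.full(n, -np.inf)
    for k in range(1, n - 1):               # changepoint after sample k
        m1, m2 = x[:k].mean(), x[k:].mean()
        resid = np.concatenate([x[:k] - m1, x[k:] - m2])
        logp[k] = -0.5 * np.sum(resid**2) / sigma**2
    logp -= logp.max()
    p = np.exp(logp)
    return p / p.sum()

rng = np.random.default_rng(2)
# Simulated EMG-like envelope: quiescent baseline, then activity at sample 60.
x = np.concatenate([rng.normal(0, 1, 60), rng.normal(3, 1, 40)])
p = changepoint_posterior(x)
onset = int(np.argmax(p))                   # MAP onset estimate
```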

  12. Bayesian dynamical systems modelling in the social sciences.

    PubMed

    Ranganathan, Shyam; Spaiser, Viktoria; Mann, Richard P; Sumpter, David J T

    2014-01-01

    Data arising from social systems is often highly complex, involving non-linear relationships between the macro-level variables that characterize these systems. We present a method for analyzing this type of longitudinal or panel data using differential equations. We identify the best non-linear functions that capture interactions between variables, employing Bayes factors to decide how many interaction terms should be included in the model. This method penalizes overly complicated models and identifies models with the most explanatory power. We illustrate our approach on the classic example of relating democracy and economic growth, identifying non-linear relationships between these two variables. We show how multiple variables and variable lags can be accounted for and provide a toolbox in R to implement our approach.
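    The model-choice step, deciding whether an interaction term earns its keep, can be sketched with BIC, whose differences approximate twice the log Bayes factor in large samples. A sketch on synthetic variables (BIC here is a stand-in for the paper's marginal-likelihood computation):

```python
import numpy as np

def bic(X, y):
    # BIC of an ordinary least-squares fit with Gaussian errors;
    # BIC differences approximate 2*log Bayes factors in large samples.
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(3)
x1, x2 = rng.normal(size=(2, 200))
# Synthetic macro-level variables with a genuine interaction term.
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 1.5 * x1 * x2 + 0.1 * rng.normal(size=200)

ones = np.ones_like(x1)
X_lin = np.column_stack([ones, x1, x2])
X_int = np.column_stack([ones, x1, x2, x1 * x2])

# Positive delta favours the richer model: the interaction term improves the
# fit by far more than its complexity penalty.
delta = bic(X_lin, y) - bic(X_int, y)
```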

  13. Bayesian inference of radiation belt loss timescales.

    NASA Astrophysics Data System (ADS)

    Camporeale, E.; Chandorkar, M.

    2017-12-01

    Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian Processes and linear partial differential equations, and performing Markov Chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a Space Weather context.
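    The MCMC machinery can be sketched with a random-walk Metropolis-Hastings sampler for a single decay timescale. The exponential-decay model below is a toy stand-in for the radial diffusion equation, with a flat prior on the timescale; it is not the paper's model:

```python
import numpy as np

def log_post(tau, t, y, sigma=0.05):
    # Log-posterior for y = exp(-t / tau) with Gaussian noise and a
    # flat prior on tau > 0 (toy stand-in for a loss-timescale parameter).
    if tau <= 0:
        return -np.inf
    return -0.5 * np.sum((y - np.exp(-t / tau)) ** 2) / sigma**2

rng = np.random.default_rng(4)
t = np.linspace(0, 5, 40)
y = np.exp(-t / 2.0) + 0.05 * rng.normal(size=t.size)   # true tau = 2

# Random-walk Metropolis-Hastings over the timescale.
tau, lp = 1.0, log_post(1.0, t, y)
samples = []
for _ in range(5000):
    prop = tau + 0.3 * rng.normal()
    lp_prop = log_post(prop, t, y)
    if np.log(rng.uniform()) < lp_prop - lp:            # accept/reject
        tau, lp = prop, lp_prop
    samples.append(tau)
post = np.array(samples[1000:])                         # discard burn-in
```

    The retained draws approximate the posterior of the timescale; their spread is exactly the parameter uncertainty that a point-estimate fit would hide.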

  14. Optimal Sequential Rules for Computer-Based Instruction.

    ERIC Educational Resources Information Center

    Vos, Hans J.

    1998-01-01

    Formulates sequential rules for adapting the appropriate amount of instruction to learning needs in the context of computer-based instruction. Topics include Bayesian decision theory, threshold and linear-utility structure, psychometric model, optimal sequential number of test questions, and an empirical example of sequential instructional…

  15. A computer program for uncertainty analysis integrating regression and Bayesian methods

    USGS Publications Warehouse

    Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary

    2014-01-01

    This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties, and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, all of which can account for prior information: (1) linear confidence intervals, which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization; (2) nonlinear confidence intervals, which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization; and (3) MCMC Bayesian credible intervals, which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.

  16. Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models

    PubMed Central

    Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo

    2016-01-01

    The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have higher prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970

  17. Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models.

    PubMed

    Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A; Burgueño, Juan; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo

    2017-01-05

    The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects u that can be assessed by the Kronecker product of variance-covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have higher prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. Copyright © 2017 Cuevas et al.
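    The two kernels at the heart of these records, the linear GBLUP kernel and the Gaussian kernel on marker distances, can be built and used for prediction in a few lines. This single-environment sketch uses a synthetic marker matrix and an additive trait; it omits the G × E structure and the Bayesian fitting of the papers:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical marker matrix: 150 lines x 100 biallelic markers coded {0, 1, 2}.
X = rng.integers(0, 3, size=(150, 100)).astype(float)
X -= X.mean(axis=0)                        # centre the marker codes

# Linear (GBLUP) kernel and Gaussian kernel on squared marker distances.
G = X @ X.T / X.shape[1]
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / np.median(D2))            # median heuristic for bandwidth

beta = 0.2 * rng.normal(size=100)
y = X @ beta + 0.5 * rng.normal(size=150)  # additive trait + noise

def kernel_blup(Kmat, y, train, test, lam=0.5):
    # Predict genetic values of test lines from a kernel fitted on training lines.
    a = np.linalg.solve(Kmat[np.ix_(train, train)] + lam * np.eye(len(train)),
                        y[train])
    return Kmat[np.ix_(test, train)] @ a

train, test = np.arange(120), np.arange(120, 150)
corr_g = np.corrcoef(kernel_blup(G, y, train, test), y[test])[0, 1]
corr_k = np.corrcoef(kernel_blup(K, y, train, test), y[test])[0, 1]
```

    With a purely additive simulated trait the linear kernel is well matched to the data; the Gaussian kernel can additionally capture non-additive signal when it exists.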

  18. Online Variational Bayesian Filtering-Based Mobile Target Tracking in Wireless Sensor Networks

    PubMed Central

    Zhou, Bingpeng; Chen, Qingchun; Li, Tiffany Jing; Xiao, Pei

    2014-01-01

    The received signal strength (RSS)-based online tracking for a mobile node in wireless sensor networks (WSNs) is investigated in this paper. Firstly, a multi-layer dynamic Bayesian network (MDBN) is introduced to characterize the target mobility with either directional or undirected movement. In particular, it is proposed to employ the Wishart distribution to approximate the randomness of the time-varying RSS measurement precision due to the target movement. It is shown that the proposed MDBN offers a more general analysis model via incorporating the underlying statistical information of both the target movement and observations, which can be utilized to improve the online tracking capability by exploiting the Bayesian statistics. Secondly, based on the MDBN model, a mean-field variational Bayesian filtering (VBF) algorithm is developed to realize the online tracking of a mobile target in the presence of nonlinear observations and time-varying RSS precision, wherein the traditional Bayesian filtering scheme cannot be directly employed. Thirdly, a joint optimization between the real-time velocity and its prior expectation is proposed to enable online velocity tracking in the proposed online tracking scheme. Finally, the associated Bayesian Cramer–Rao Lower Bound (BCRLB) analysis and numerical simulations are conducted. Our analysis unveils that, by exploiting the potential state information via the general MDBN model, the proposed VBF algorithm provides a promising solution to the online tracking of a mobile node in WSNs. In addition, it is shown that the final tracking accuracy linearly scales with its expectation when the RSS measurement precision is time-varying. PMID:25393784

  19. Comparing methods of measuring geographic patterns in temporal trends: an application to county-level heart disease mortality in the United States, 1973 to 2010.

    PubMed

    Vaughan, Adam S; Kramer, Michael R; Waller, Lance A; Schieb, Linda J; Greer, Sophia; Casper, Michele

    2015-05-01

    To demonstrate the implications of choosing analytical methods for quantifying spatiotemporal trends, we compare the assumptions, implementation, and outcomes of popular methods using county-level heart disease mortality in the United States between 1973 and 2010. We applied four regression-based approaches (joinpoint regression, both aspatial and spatial generalized linear mixed models, and Bayesian space-time model) and compared resulting inferences for geographic patterns of local estimates of annual percent change and associated uncertainty. The average local percent change in heart disease mortality from each method was -4.5%, with the Bayesian model having the smallest range of values. The associated uncertainty in percent change differed markedly across the methods, with the Bayesian space-time model producing the narrowest range of variance (0.0-0.8). The geographic pattern of percent change was consistent across methods with smaller declines in the South Central United States and larger declines in the Northeast and Midwest. However, the geographic patterns of uncertainty differed markedly between methods. The similarity of results, including geographic patterns, for magnitude of percent change across these methods validates the underlying spatial pattern of declines in heart disease mortality. However, marked differences in degree of uncertainty indicate that Bayesian modeling offers substantially more precise estimates. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Design considerations and analysis planning of a phase 2a proof of concept study in rheumatoid arthritis in the presence of possible non-monotonicity.

    PubMed

    Liu, Feng; Walters, Stephen J; Julious, Steven A

    2017-10-02

    It is important to quantify the dose response for a drug in phase 2a clinical trials so that the optimal doses can be selected for subsequent late-phase trials. In a phase 2a clinical trial of a new lead drug being developed for the treatment of rheumatoid arthritis (RA), a U-shaped dose response curve was observed. In light of this result, further research was undertaken to design an efficient phase 2a proof of concept (PoC) trial for a follow-on compound using the lessons learnt from the lead compound. The planned analysis for the phase 2a trial of GSK123456 was a Bayesian Emax model, which assumes that the dose-response relationship follows a monotonic sigmoid "S"-shaped curve. This model was found to be suboptimal for modelling the U-shaped dose response observed in the data from this trial, and alternative approaches needed to be considered for the next compound, for which a normal dynamic linear model (NDLM) is proposed. This paper compares the statistical properties of the Bayesian Emax and NDLM models, and both models are evaluated using simulation in the context of an adaptive phase 2a PoC design under a variety of assumed dose response curves: linear, Emax, U-shaped, and flat response. It is shown that the NDLM method is flexible and can handle a wide variety of dose responses, including monotonic and non-monotonic relationships. In comparison to the NDLM model, the Emax model excelled, with a higher probability of selecting the ED90 and a smaller average sample size, when the true dose response followed an Emax-like curve. In addition, the type I error, the probability of incorrectly concluding a drug may work when it does not, is inflated with the Bayesian NDLM model in all scenarios, which would represent a development risk to the pharmaceutical company. The bias, which is the difference between the estimated effect from the Emax and NDLM models and the simulated value, is comparable if the true dose response follows a placebo-like curve, an Emax-like curve, or a log-linear curve under the fixed-allocation, non-adaptive, half-adaptive and adaptive scenarios. The bias, though, is significantly increased for the Emax model if the true dose response follows a U-shaped curve. In most cases the Bayesian Emax model works effectively and efficiently, with low bias and a good probability of success in the case of a monotonic dose response. However, if there is a belief that the dose response could be non-monotonic, then the NDLM is the superior model to assess the dose response.
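    The monotonic Emax curve that the record's Bayesian model assumes has the form E(d) = E0 + Emax·d/(ED50 + d), with ED90 = 9·ED50. A minimal least-squares fit by grid search over ED50 (a frequentist sketch on simulated doses, not the paper's Bayesian fit):

```python
import numpy as np

def emax(d, e0, emax_, ed50):
    # Monotonic Emax dose-response curve.
    return e0 + emax_ * d / (ed50 + d)

rng = np.random.default_rng(6)
doses = np.array([0.0, 5.0, 20.0, 50.0, 100.0])     # illustrative doses
y = emax(doses, 1.0, 4.0, 15.0) + 0.1 * rng.normal(size=doses.size)

# Grid search over ED50; e0 and Emax enter linearly and are profiled out.
best = (np.inf, None, None)
for ed50 in np.linspace(1.0, 100.0, 400):
    A = np.column_stack([np.ones_like(doses), doses / (ed50 + doses)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ coef) ** 2)
    if rss < best[0]:
        best = (rss, ed50, coef)
rss, ed50_hat, (e0_hat, emax_hat) = best

# ED90, the dose giving 90% of the maximal effect, is 9 * ED50 for this model.
ed90_hat = 9.0 * ed50_hat
```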

  1. A Bayesian model averaging method for the derivation of reservoir operating rules

    NASA Astrophysics Data System (ADS)

    Zhang, Jingwen; Liu, Pan; Wang, Hao; Lei, Xiaohui; Zhou, Yanlai

    2015-09-01

    Because the intrinsic dynamics among optimal decision making, inflow processes and reservoir characteristics are complex, functional forms of reservoir operating rules are always determined subjectively. As a result, the uncertainty of selecting the form and/or model involved in reservoir operating rules must be analyzed and evaluated. In this study, we analyze the uncertainty of reservoir operating rules using the Bayesian model averaging (BMA) model. Three popular operating rules, namely piecewise linear regression, surface fitting and a least-squares support vector machine, are established based on the optimal deterministic reservoir operation. These individual models provide three-member decisions for the BMA combination, enabling the 90% release interval to be estimated by Markov Chain Monte Carlo simulation. A case study of China's Baise reservoir shows that: (1) the optimal deterministic reservoir operation, superior to any reservoir operating rules, provides the samples used to derive the rules; (2) the least-squares support vector machine model is more effective than both piecewise linear regression and surface fitting; (3) BMA outperforms any individual model of operating rules based on the optimal trajectories. It is revealed that the proposed model can reduce the uncertainty of operating rules, which is of great potential benefit in evaluating the confidence interval of decisions.
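    The BMA combination step weights each member model's decision by its posterior model probability. A minimal sketch using BIC-based weights (the paper estimates weights differently, via MCMC; the scores and decisions below are illustrative):

```python
import numpy as np

def bma_weights(bics):
    # Approximate posterior model probabilities from BIC scores:
    # weight proportional to exp(-BIC/2), normalised to sum to one.
    b = np.asarray(bics, dtype=float)
    w = np.exp(-0.5 * (b - b.min()))
    return w / w.sum()

# Hypothetical BIC scores for three operating-rule models (piecewise linear
# regression, surface fitting, LS-SVM): lower BIC means higher weight.
w = bma_weights([120.0, 118.0, 110.0])

# Combined release decision = weight-averaged member decisions (m3/s, illustrative).
decisions = np.array([500.0, 520.0, 540.0])
combined = w @ decisions
```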

  2. Bayesian block-diagonal variable selection and model averaging

    PubMed Central

    Papaspiliopoulos, O.; Rossell, D.

    2018-01-01

    Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501

  3. A Bayesian hierarchical model with spatial variable selection: the effect of weather on insurance claims

    PubMed Central

    Scheel, Ida; Ferkingstad, Egil; Frigessi, Arnoldo; Haug, Ola; Hinnerichsen, Mikkel; Meze-Hausken, Elisabeth

    2013-01-01

    Climate change will affect the insurance industry. We develop a Bayesian hierarchical statistical approach to explain and predict insurance losses due to weather events at a local geographic scale. The number of weather-related insurance claims is modelled by combining generalized linear models with spatially smoothed variable selection. Using Gibbs sampling and reversible jump Markov chain Monte Carlo methods, this model is fitted on daily weather and insurance data from each of the 319 municipalities which constitute southern and central Norway for the period 1997–2006. Precise out-of-sample predictions validate the model. Our results show interesting regional patterns in the effect of different weather covariates. In addition to being useful for insurance pricing, our model can be used for short-term predictions based on weather forecasts and for long-term predictions based on downscaled climate models. PMID:23396890

  4. A Sparse Bayesian Learning Algorithm for White Matter Parameter Estimation from Compressed Multi-shell Diffusion MRI.

    PubMed

    Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe

    2017-09-01

    We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in dictionary form using a non-monoexponential decay model of diffusion, based on a continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientation improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
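    The linear un-mixing structure, a dictionary of candidate responses with sparse non-negative weights, can be illustrated with a plain projected-gradient non-negative least squares baseline (a simple stand-in, not the paper's sparse Bayesian learning algorithm, and with a random synthetic dictionary rather than fiber responses):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical dictionary: 30 measurements x 12 candidate fiber responses.
D = rng.normal(size=(30, 12))
D /= np.linalg.norm(D, axis=0)             # unit-norm columns

w_true = np.zeros(12)
w_true[[2, 7]] = [0.7, 0.3]                # two active orientations
y = D @ w_true + 0.01 * rng.normal(size=30)

# Projected-gradient non-negative least squares: gradient step on
# ||D w - y||^2, then clip negative weights to zero.
w = np.zeros(12)
step = 1.0 / np.linalg.norm(D.T @ D, 2)    # 1 / largest eigenvalue
for _ in range(2000):
    w = np.maximum(0.0, w - step * D.T @ (D @ w - y))
```

    The non-negativity constraint alone already drives most inactive weights to zero here; the Bayesian treatment additionally supplies per-weight uncertainty.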

  5. Unsupervised Bayesian linear unmixing of gene expression microarrays.

    PubMed

    Bazot, Cécile; Dobigeon, Nicolas; Tourneret, Jean-Yves; Zaas, Aimee K; Ginsburg, Geoffrey S; Hero, Alfred O

    2013-03-19

    This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high-dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples, which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF).
The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.

  6. Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad

    2016-02-01

    Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state, is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for the prediction task and utilize a Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-the-art approaches. Furthermore, we also study the performance of the link prediction algorithm in terms of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.

  7. A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

    NASA Astrophysics Data System (ADS)

    Khawaja, Taimoor Saleem

    A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. 
The proposed scheme uses only baseline data to construct a one-class LS-SVM which, when presented with online data, is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds, and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian anomaly detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term failure prognosis using Least Squares Support Vector Machines, (c) uncertainty representation and management using Bayesian inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
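    The one-class idea above can be sketched in a few lines. The snippet below is a simplified stand-in for the thesis's Bayesian LS-SVM (a Gaussian-kernel similarity score against baseline-only data, squashed into a pseudo-probability of health); all data and constants are invented for illustration:

```python
import numpy as np

def health_posterior(baseline, x, sigma=1.0):
    """Pseudo-probability of 'healthy' for query points x, given baseline data.

    Simplified stand-in for a one-class Bayesian LS-SVM: mean RBF similarity
    to the baseline cloud, calibrated by the baseline's self-similarity.
    """
    d2 = ((x[:, None, :] - baseline[None, :, :]) ** 2).sum(-1)
    score = np.exp(-d2 / (2 * sigma**2)).mean(axis=1)
    d2b = ((baseline[:, None, :] - baseline[None, :, :]) ** 2).sum(-1)
    ref = np.exp(-d2b / (2 * sigma**2)).mean(axis=1).mean()
    return np.clip(score / ref, 0.0, 1.0)  # ~1 = healthy, ~0 = anomalous

rng = np.random.default_rng(0)
baseline = rng.normal(0, 0.5, size=(200, 2))   # nominal operating conditions
normal_pt = np.array([[0.1, -0.2]])            # near the baseline cloud
fault_pt = np.array([[4.0, 4.0]])              # far from anything seen
p_normal = health_posterior(baseline, normal_pt)[0]
p_fault = health_posterior(baseline, fault_pt)[0]
```

    Online data far from the baseline cloud receives a low posterior probability of health and can be flagged as anomalous.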

  8. Stochastic Model of Seasonal Runoff Forecasts

    NASA Astrophysics Data System (ADS)

    Krzysztofowicz, Roman; Watada, Leslie M.

    1986-03-01

    Each year the National Weather Service and the Soil Conservation Service issue a monthly sequence of five (or six) categorical forecasts of the seasonal snowmelt runoff volume. To describe uncertainties in these forecasts for the purposes of optimal decision making, a stochastic model is formulated. It is a discrete-time, finite, continuous-space, nonstationary Markov process. Posterior densities of the actual runoff conditional upon a forecast, and transition densities of forecasts are obtained from a Bayesian information processor. Parametric densities are derived for the process with a normal prior density of the runoff and a linear model of the forecast error. The structure of the model and the estimation procedure are motivated by analyses of forecast records from five stations in the Snake River basin, from the period 1971-1983. The advantages of supplementing the current forecasting scheme with a Bayesian analysis are discussed.
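    The normal-prior / linear-forecast-error structure gives a closed-form (conjugate) posterior for the actual runoff given a forecast. A sketch with made-up numbers, not taken from the paper:

```python
import numpy as np

# Hypothetical values: prior runoff Q ~ N(mu0, s0^2) (in, say, 10^3 acre-feet),
# linear forecast-error model F = a + b*Q + eps, eps ~ N(0, se^2).
mu0, s0 = 500.0, 120.0       # prior mean / sd of seasonal runoff volume
a, b, se = 20.0, 0.95, 60.0  # forecast-error model coefficients
F = 650.0                    # issued forecast, treated as numeric

# Normal prior + linear-Gaussian likelihood => normal posterior for Q given F
post_prec = 1.0 / s0**2 + b**2 / se**2
post_var = 1.0 / post_prec
post_mean = post_var * (mu0 / s0**2 + b * (F - a) / se**2)
post_sd = np.sqrt(post_var)
```

    The posterior mean is a precision-weighted compromise between the prior mean and the forecast-implied value (F - a)/b, and the posterior variance is always smaller than the prior variance.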

  9. A Development of Nonstationary Regional Frequency Analysis Model with Large-scale Climate Information: Its Application to Korean Watershed

    NASA Astrophysics Data System (ADS)

    Kim, Jin-Young; Kwon, Hyun-Han; Kim, Hung-Soo

    2015-04-01

The existing regional frequency analysis has a disadvantage in that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a hierarchical Bayesian model based nonstationary regional frequency analysis in which spatial patterns of the design rainfall with geographical information (e.g. latitude, longitude and altitude) are explicitly incorporated. This study assumes that the parameters of the Gumbel (or GEV) distribution are a function of geographical characteristics within a general linear regression framework. Posterior distributions of the regression parameters are estimated by a Bayesian Markov chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the distributions by using digital elevation models (DEMs) as inputs. The proposed model is applied to derive design rainfalls over the entire Han River watershed. It was found that the proposed Bayesian regional frequency analysis model showed similar results compared to L-moment based regional frequency analysis. In addition, the model showed an advantage in terms of quantifying uncertainty of the design rainfall and estimating the areal rainfall considering geographical information. Finally, a comprehensive discussion of design rainfall in the context of nonstationarity is presented. Keywords: regional frequency analysis, nonstationarity, spatial information, Bayesian. Acknowledgement: This research was supported by a grant (14AWMP-B082564-01) from the Advanced Water Management Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government.
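    A minimal sketch of the idea: Gumbel location and scale expressed as linear functions of site geography, then converted to a T-year design rainfall. The regression coefficients here are invented placeholders for the MCMC-estimated ones:

```python
import numpy as np

def gumbel_params(lon, lat, alt,
                  beta_mu=(40.0, 0.8, 1.2, 0.01),
                  beta_sig=(12.0, 0.2, 0.3, 0.002)):
    """Gumbel location/scale as linear functions of geography (toy numbers)."""
    x = np.array([1.0, lon - 127.0, lat - 37.0, alt])
    mu = float(np.dot(beta_mu, x))      # location (mm)
    sigma = float(np.dot(beta_sig, x))  # scale (mm)
    return mu, sigma

def design_rainfall(T, mu, sigma):
    """T-year return level of a Gumbel distribution."""
    return mu - sigma * np.log(-np.log(1.0 - 1.0 / T))

# Hypothetical site in the Han River watershed (lon, lat in degrees; alt in m)
mu, sigma = gumbel_params(lon=127.5, lat=37.6, alt=85.0)
r50 = design_rainfall(50, mu, sigma)
r100 = design_rainfall(100, mu, sigma)
```

    With the fitted relationship, design rainfall can be interpolated at any DEM grid cell from its coordinates and elevation alone.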

  10. Mixed linear-nonlinear fault slip inversion: Bayesian inference of model, weighting, and smoothing parameters

    NASA Astrophysics Data System (ADS)

    Fukuda, J.; Johnson, K. M.

    2009-12-01

Studies utilizing inversions of geodetic data for the spatial distribution of coseismic slip on faults typically present the result as a single fault plane and slip distribution. Commonly, the geometry of the fault plane is assumed to be known a priori, and the data are inverted for slip. However, sometimes there is not strong a priori information on the geometry of the fault that produced the earthquake, and the data are not always strong enough to completely resolve the fault geometry. We develop a method to solve for the full posterior probability distribution of fault slip and fault geometry parameters in a Bayesian framework using Monte Carlo methods. The slip inversion problem is particularly challenging because it often involves multiple data sets with unknown relative weights (e.g. InSAR, GPS), model parameters that are related linearly (slip) and nonlinearly (fault geometry) through the theoretical model to surface observations, prior information on model parameters, and a regularization prior to stabilize the inversion. We present the theoretical framework and solution method for a Bayesian inversion that can handle all of these aspects of the problem. The method handles the mixed linear/nonlinear nature of the problem through a combination of both analytical least-squares solutions and Monte Carlo methods. We first illustrate and validate the inversion scheme using synthetic data sets. We then apply the method to inversion of geodetic data from the 2003 M6.6 San Simeon, California earthquake. We show that the uncertainty in strike and dip of the fault plane is over 20 degrees. We characterize the uncertainty in the slip estimate with a volume around the mean fault solution in which the slip most likely occurred. Slip likely occurred somewhere in a volume that extends 5-10 km in either direction normal to the fault plane. 
We implement slip inversions with both traditional, kinematic smoothing constraints on slip and a simple physical condition of uniform stress drop.
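    The mixed linear/nonlinear structure can be illustrated on a toy problem: search (here a grid, standing in for Monte Carlo sampling) over a nonlinear "geometry" parameter, solving the linear "slip" parameter analytically at each candidate, exactly the split the abstract describes:

```python
import numpy as np

# Toy forward model: d_i = m * exp(-k * t_i) + noise,
# with m entering linearly and k nonlinearly (cf. slip vs. fault geometry).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 50)
k_true, m_true = 1.5, 3.0
d = m_true * np.exp(-k_true * t) + rng.normal(0, 0.02, t.size)

def profile_rss(k):
    """Analytic least-squares solve for the linear parameter at fixed k."""
    g = np.exp(-k * t)                   # design vector for this 'geometry'
    m_hat = float(g @ d) / float(g @ g)  # closed-form linear solution
    return float(((d - m_hat * g) ** 2).sum()), m_hat

ks = np.linspace(0.5, 3.0, 251)          # grid stand-in for MC sampling of k
rss = np.array([profile_rss(k)[0] for k in ks])
k_best = ks[rss.argmin()]
m_best = profile_rss(k_best)[1]
```

    Exponentiating -rss/(2*sigma**2) over the grid would give the (profiled) posterior over the nonlinear parameter, the one-dimensional analogue of the paper's posterior over fault geometry.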

  11. Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography

    PubMed Central

    Tweedell, Andrew J.; Haynes, Courtney A.

    2017-01-01

The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope, and sample entropy were the established methods evaluated, while general time-series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60–90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine whether this class of algorithms performs equally well when the time series has multiple bursts of muscle activity. PMID:28489897
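    A minimal single-changepoint sketch of the Bayesian idea (not the specific algorithm evaluated in the paper): for a Gaussian signal with known noise level whose mean jumps at an unknown time tau, a flat prior over tau makes the posterior proportional to the profiled likelihood:

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau_true = 200, 120
x = rng.normal(0.0, 1.0, n)
x[tau_true:] += 3.0                     # 'muscle onset': a mean shift

log_post = np.full(n, -np.inf)
for tau in range(5, n - 5):             # keep a few samples on each side
    m1, m2 = x[:tau].mean(), x[tau:].mean()
    rss = ((x[:tau] - m1) ** 2).sum() + ((x[tau:] - m2) ** 2).sum()
    log_post[tau] = -0.5 * rss          # known unit noise variance
log_post -= log_post.max()              # stabilize before exponentiating
post = np.exp(log_post)
post /= post.sum()
tau_hat = int(post.argmax())
```

    Unlike a thresholded envelope, the posterior over tau directly supports the paper's "accept onset once posterior probability exceeds 60-90%" style of decision rule.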

  12. A D-vine copula-based model for repeated measurements extending linear mixed models with homogeneous correlation structure.

    PubMed

    Killiches, Matthias; Czado, Claudia

    2018-03-22

We propose a model for unbalanced longitudinal data, where the univariate margins can be selected arbitrarily and the dependence structure is described with the help of a D-vine copula. We show that our approach is an extremely flexible extension of the widely used linear mixed model if the correlation is homogeneous over the considered individuals. As an alternative to joint maximum likelihood, a sequential estimation approach for the D-vine copula is provided and validated in a simulation study. The model can handle missing values without being forced to discard data. Since conditional distributions are known analytically, we easily make predictions for future events. For model selection, we adjust the Bayesian information criterion to our situation. In an application to heart surgery data, our model performs clearly better than competing linear mixed models. © 2018, The International Biometric Society.

  13. Bayesian models for comparative analysis integrating phylogenetic uncertainty.

    PubMed

    de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P

    2012-06-28

    Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.

  14. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    PubMed Central

    2012-01-01

    Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602

  15. Artificial and Bayesian Neural Networks

    PubMed

    Korhani Kangi, Azam; Bahrampour, Abbas

    2018-02-26

Introduction and purpose: In recent years the use of neural networks without any premises for investigation of prognosis in analyzing survival data has increased. Artificial neural networks (ANN) use small processors with a continuous network to solve problems inspired by the human brain. Bayesian neural networks (BNN) constitute a neural-based approach to modeling and non-linearization of complex issues using special algorithms and statistical methods. Gastric cancer incidence ranks first among men and third among women in Iran. The aim of the present study was to assess the value of an artificial neural network and a Bayesian neural network for modeling and predicting the probability of gastric cancer patient death. Materials and Methods: In this study, we used information on 339 patients aged 20 to 90 years with diagnosed gastric cancer, referred to Afzalipoor and Shahid Bahonar Hospitals in Kerman City from 2001 to 2015. A three-layer perceptron neural network (ANN) and a Bayesian neural network (BNN) were used for predicting the probability of mortality using the available data. To investigate differences between the models, sensitivity, specificity, accuracy, and the area under receiver operating characteristic curves (AUROCs) were generated. Results: In this study, the sensitivity and specificity of the artificial neural network and Bayesian neural network models were 0.882, 0.903 and 0.954, 0.909, respectively. Prediction accuracy and the area under the ROC curve for the two models were 0.891, 0.944 and 0.935, 0.961, respectively. The age at diagnosis of gastric cancer was most important for predicting survival, followed by tumor grade, morphology, gender, smoking history, opium consumption, receiving chemotherapy, presence of metastasis, tumor stage, receiving radiotherapy, and being resident in a village. 
Conclusion: The findings of the present study indicated that the Bayesian neural network is preferable to an artificial neural network for predicting survival of gastric cancer patients in Iran.

  16. A Bayesian Framework for Generalized Linear Mixed Modeling Identifies New Candidate Loci for Late-Onset Alzheimer’s Disease

    PubMed Central

    Wang, Xulong; Philip, Vivek M.; Ananda, Guruprasad; White, Charles C.; Malhotra, Ankit; Michalski, Paul J.; Karuturi, Krishna R. Murthy; Chintalapudi, Sumana R.; Acklin, Casey; Sasner, Michael; Bennett, David A.; De Jager, Philip L.; Howell, Gareth R.; Carter, Gregory W.

    2018-01-01

    Recent technical and methodological advances have greatly enhanced genome-wide association studies (GWAS). The advent of low-cost, whole-genome sequencing facilitates high-resolution variant identification, and the development of linear mixed models (LMM) allows improved identification of putatively causal variants. While essential for correcting false positive associations due to sample relatedness and population stratification, LMMs have commonly been restricted to quantitative variables. However, phenotypic traits in association studies are often categorical, coded as binary case-control or ordered variables describing disease stages. To address these issues, we have devised a method for genomic association studies that implements a generalized LMM (GLMM) in a Bayesian framework, called Bayes-GLMM. Bayes-GLMM has four major features: (1) support of categorical, binary, and quantitative variables; (2) cohesive integration of previous GWAS results for related traits; (3) correction for sample relatedness by mixed modeling; and (4) model estimation by both Markov chain Monte Carlo sampling and maximal likelihood estimation. We applied Bayes-GLMM to the whole-genome sequencing cohort of the Alzheimer’s Disease Sequencing Project. This study contains 570 individuals from 111 families, each with Alzheimer’s disease diagnosed at one of four confidence levels. Using Bayes-GLMM we identified four variants in three loci significantly associated with Alzheimer’s disease. Two variants, rs140233081 and rs149372995, lie between PRKAR1B and PDGFA. The coded proteins are localized to the glial-vascular unit, and PDGFA transcript levels are associated with Alzheimer’s disease-related neuropathology. In summary, this work provides implementation of a flexible, generalized mixed-model approach in a Bayesian framework for association studies. PMID:29507048

  17. Bayesian informative dropout model for longitudinal binary data with random effects using conditional and joint modeling approaches.

    PubMed

    Chan, Jennifer S K

    2016-05-01

Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, this type of dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes, as well as the dropout indicator, on each occasion are logit-linear in some covariates and outcomes. This model, adopting a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user-friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Results show that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally, the effect of dropout on parameter estimates is evaluated through simulation studies. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Predictive Ensemble Decoding of Acoustical Features Explains Context-Dependent Receptive Fields.

    PubMed

    Yildiz, Izzet B; Mesgarani, Nima; Deneve, Sophie

    2016-12-07

A primary goal of auditory neuroscience is to identify the sound features extracted and represented by auditory neurons. Linear encoding models, which describe neural responses as a function of the stimulus, have been primarily used for this purpose. Here, we provide theoretical arguments and experimental evidence in support of an alternative approach, based on decoding the stimulus from the neural response. We used a Bayesian normative approach to predict the responses of neurons detecting relevant auditory features, despite ambiguities and noise. We compared the model predictions to recordings from the primary auditory cortex of ferrets and found that: (1) the decoding filters of auditory neurons resemble the filters learned from the statistics of speech sounds; (2) the decoding model captures the dynamics of responses better than a linear encoding model of similar complexity; and (3) the decoding model accounts for the accuracy with which the stimulus is represented in neural activity, whereas the linear encoding model performs very poorly. Most importantly, our model predicts that neuronal responses are fundamentally shaped by "explaining away," a divisive competition between alternative interpretations of the auditory scene. Neural responses in the auditory cortex are dynamic, nonlinear, and hard to predict. Traditionally, encoding models have been used to describe neural responses as a function of the stimulus. However, in addition to external stimulation, neural activity is strongly modulated by the responses of other neurons in the network. We hypothesized that auditory neurons aim to collectively decode their stimulus. In particular, a stimulus feature that is decoded (or explained away) by one neuron is not explained by another. We demonstrated that this novel Bayesian decoding model is better at capturing the dynamic responses of cortical neurons in ferrets. 
Whereas the linear encoding model poorly reflects the selectivity of neurons, the decoding model can account for the strong nonlinearities observed in neural data. Copyright © 2016 Yildiz et al.

  19. Bayesian change point analysis of abundance trends for pelagic fishes in the upper San Francisco Estuary.

    PubMed

    Thomson, James R; Kimmerer, Wim J; Brown, Larry R; Newman, Ken B; Mac Nally, Ralph; Bennett, William A; Feyrer, Frederick; Fleishman, Erica

    2010-07-01

We examined trends in abundance of four pelagic fish species (delta smelt, longfin smelt, striped bass, and threadfin shad) in the upper San Francisco Estuary, California, USA, over 40 years using Bayesian change point models. Change point models identify times of abrupt or unusual changes in absolute abundance (step changes) or in rates of change in abundance (trend changes). We coupled Bayesian model selection with linear regression splines to identify biotic or abiotic covariates with the strongest associations with abundances of each species. We then refitted change point models conditional on the selected covariates to explore whether those covariates could explain statistical trends or change points in species abundances. We also fitted a multispecies change point model that identified change points common to all species. All models included hierarchical structures to model data uncertainties, including observation errors and missing covariate values. There were step declines in abundances of all four species in the early 2000s, with a likely common decline in 2002. Abiotic variables, including water clarity, position of the 2‰ isohaline (X2), and the volume of freshwater exported from the estuary, explained some variation in species' abundances over the time series, but no selected covariates could statistically explain the post-2000 change points for any species.

  20. Hierarchical Commensurate and Power Prior Models for Adaptive Incorporation of Historical Information in Clinical Trials

    PubMed Central

    Hobbs, Brian P.; Carlin, Bradley P.; Mandrekar, Sumithra J.; Sargent, Daniel J.

    2011-01-01

Summary Bayesian clinical trial designs offer the possibility of a substantially reduced sample size, increased statistical power, and reductions in cost and ethical hazard. However, when prior and current information conflict, Bayesian methods can lead to higher than expected Type I error, as well as the possibility of a costlier and lengthier trial. This motivates an investigation of the feasibility of hierarchical Bayesian methods for incorporating historical data that are adaptively robust to prior information that reveals itself to be inconsistent with the accumulating experimental data. In this paper, we present several models that allow for the commensurability of the information in the historical and current data to determine how much historical information is used. A primary tool is an elaboration of the traditional power prior approach, based upon a measure of commensurability for Gaussian data. We compare the frequentist performance of several methods using simulations, and close with an example of a colon cancer trial that illustrates a linear models extension of our adaptive borrowing approach. Our proposed methods produce more precise estimates of the model parameters, in particular conferring statistical significance to the observed reduction in tumor size for the experimental regimen as compared to the control regimen. PMID:21361892

  1. HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python.

    PubMed

    Wiecki, Thomas V; Sofer, Imri; Frank, Michael J

    2013-01-01

The diffusion model is a commonly used tool to infer latent psychological processes underlying decision-making, and to link them to neural mechanisms based on response times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of response time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model), which allows fast and flexible estimation of the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject/condition than non-hierarchical methods, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g., fMRI) influence decision-making parameters. This paper will first describe the theoretical background of the drift diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the χ²-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs/

  2. Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.

    PubMed

    Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J

    2016-03-01

    A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
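    A classic member of the causal-independence family that PICI generalizes is the noisy-OR gate: n parents need only n parameters (plus a leak term) instead of a conditional probability table with 2**n entries. A minimal sketch:

```python
import numpy as np
from itertools import product

def noisy_or(active, p, leak=0.05):
    """P(Y=1) given which parents are active, per-parent strengths p,
    and a leak probability for Y occurring with no active parent."""
    q = 1.0 - leak
    for a, pi in zip(active, p):
        if a:
            q *= 1.0 - pi                 # each active cause fails independently
    return 1.0 - q

p = [0.8, 0.6, 0.3]                       # one parameter per parent
# Full CPT (2**3 rows) reconstructed from just 3 + 1 parameters
cpt = {bits: noisy_or(bits, p) for bits in product([0, 1], repeat=3)}
```

    The same compression is what makes such parametric local distributions learnable from small data sets, the setting the abstract highlights for the menstrual-cycle model.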

  3. Bayesian integration and non-linear feedback control in a full-body motor task.

    PubMed

    Stevenson, Ian H; Fernandes, Hugo L; Vilares, Iris; Wei, Kunlin; Körding, Konrad P

    2009-12-01

    A large number of experiments have asked to what degree human reaching movements can be understood as being close to optimal in a statistical sense. However, little is known about whether these principles are relevant for other classes of movements. Here we analyzed movement in a task that is similar to surfing or snowboarding. Human subjects stand on a force plate that measures their center of pressure. This center of pressure affects the acceleration of a cursor that is displayed in a noisy fashion (as a cloud of dots) on a projection screen while the subject is incentivized to keep the cursor close to a fixed position. We find that salient aspects of observed behavior are well-described by optimal control models where a Bayesian estimation model (Kalman filter) is combined with an optimal controller (either a Linear-Quadratic-Regulator or Bang-bang controller). We find evidence that subjects integrate information over time taking into account uncertainty. However, behavior in this continuous steering task appears to be a highly non-linear function of the visual feedback. While the nervous system appears to implement Bayes-like mechanisms for a full-body, dynamic task, it may additionally take into account the specific costs and constraints of the task.
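    The Kalman-filter-plus-LQR pairing the authors fit can be sketched for a one-dimensional cursor modeled as a noisily observed double integrator. All constants are illustrative, not the study's:

```python
import numpy as np

dt = 0.05
A = np.array([[1.0, dt], [0.0, 1.0]])      # state: [position, velocity]
B = np.array([[0.0], [dt]])
C = np.array([[1.0, 0.0]])                 # only noisy position is observed
Q_proc, R_obs = 1e-4 * np.eye(2), np.array([[0.05]])
Q_cost, R_cost = np.diag([1.0, 0.1]), np.array([[0.01]])

# LQR gain via backward Riccati iteration
P = Q_cost.copy()
for _ in range(500):
    K = np.linalg.solve(R_cost + B.T @ P @ B, B.T @ P @ A)
    P = Q_cost + A.T @ P @ (A - B @ K)

rng = np.random.default_rng(3)
x = np.array([2.0, 0.0])                   # true state: cursor off-target
x_hat, S = np.zeros(2), np.eye(2)          # filter mean and covariance
for _ in range(400):
    u = -(K @ x_hat).item()                # control acts on the *estimate*
    x = A @ x + B[:, 0] * u + rng.multivariate_normal(np.zeros(2), Q_proc)
    y = C @ x + rng.normal(0, np.sqrt(R_obs[0, 0]))
    # Kalman predict / update
    x_hat = A @ x_hat + B[:, 0] * u
    S = A @ S @ A.T + Q_proc
    G = S @ C.T @ np.linalg.inv(C @ S @ C.T + R_obs)
    x_hat = x_hat + (G @ (y - C @ x_hat)).ravel()
    S = (np.eye(2) - G @ C) @ S
```

    Feeding the Kalman estimate (rather than the raw noisy observation) into the controller is the "Bayesian integration" step; swapping the LQR for a bang-bang rule changes only the line computing u.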

  4. Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

    PubMed

    Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

    2017-08-01

Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill the areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R² = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.
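    A bare-bones Gaussian-process regression sketch with a squared-exponential kernel; the study's actual model embeds such a spatial process in a Bayesian hierarchy with AOD and other covariates in the mean, which is omitted here, and the hyperparameters below are fixed rather than inferred:

```python
import numpy as np

def rbf(a, b, ell=0.5, var=1.0):
    """Squared-exponential covariance between 1-D input vectors a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(4)
x_tr = np.linspace(0, 3, 25)
y_tr = np.sin(2 * x_tr) + rng.normal(0, 0.1, x_tr.size)   # toy 'observations'
x_te = np.linspace(0, 3, 101)

noise = 0.1**2
K = rbf(x_tr, x_tr) + noise * np.eye(x_tr.size)
Ks = rbf(x_te, x_tr)
alpha = np.linalg.solve(K, y_tr)
mean = Ks @ alpha                                   # posterior predictive mean
cov = rbf(x_te, x_te) - Ks @ np.linalg.solve(K, Ks.T)
sd = np.sqrt(np.clip(np.diag(cov), 0, None))        # posterior predictive sd
```

    The posterior sd shrinks near observed locations and grows away from them, which is the property that lets the hierarchical model quantify uncertainty where no PM2.5 monitor exists.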

  5. Unsupervised Unmixing of Hyperspectral Images Accounting for Endmember Variability.

    PubMed

    Halimi, Abderrahim; Dobigeon, Nicolas; Tourneret, Jean-Yves

    2015-12-01

    This paper presents an unsupervised Bayesian algorithm for hyperspectral image unmixing, accounting for endmember variability. The pixels are modeled by a linear combination of endmembers weighted by their corresponding abundances. However, the endmembers are assumed random to consider their variability in the image. An additive noise is also considered in the proposed model, generalizing the normal compositional model. The proposed algorithm exploits the whole image to benefit from both spectral and spatial information. It estimates both the mean and the covariance matrix of each endmember in the image. This allows the behavior of each material to be analyzed and its variability to be quantified in the scene. A spatial segmentation is also obtained based on the estimated abundances. In order to estimate the parameters associated with the proposed Bayesian model, we propose to use a Hamiltonian Monte Carlo algorithm. The performance of the resulting unmixing strategy is evaluated through simulations conducted on both synthetic and real data.

  6. Is the Jeffreys' scale a reliable tool for Bayesian model comparison in cosmology?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nesseris, Savvas; García-Bellido, Juan, E-mail: savvas.nesseris@uam.es, E-mail: juan.garciabellido@uam.es

    2013-08-01

We are entering an era where progress in cosmology is driven by data, and alternative models will have to be compared and ruled out according to some consistent criterion. The most conservative and widely used approach is Bayesian model comparison. In this paper we explicitly calculate the Bayes factors for all models that are linear with respect to their parameters. We do this in order to test the so-called Jeffreys' scale and determine analytically how accurate its predictions are in a simple case where we fully understand and can calculate everything analytically. We also discuss the case of nested models, e.g. one with M₁ parameters and another with M₂ ⊃ M₁ parameters, and we derive analytic expressions for both the Bayes factor and the figure of merit, defined as the inverse area of the model parameters' confidence contours. With all this machinery and the use of an explicit example we demonstrate that the threshold nature of Jeffreys' scale is not a "one size fits all" reliable tool for model comparison and that it may lead to biased conclusions. Furthermore, we discuss the importance of choosing the right basis in the context of models that are linear with respect to their parameters and how that basis affects the parameter estimation and the derived constraints.
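For models that are linear in their parameters, the evidence integral is exactly Gaussian and can be written in closed form, which is what makes the fully analytic treatment above possible. The following is a schematic of that computation in my own notation (assuming a Gaussian likelihood and a flat prior of volume V_π over the M parameters), not the paper's exact expressions:

```latex
% For a model linear in its parameters, chi^2 is exactly quadratic:
\chi^2(\theta) = \chi^2_{\min}
  + (\theta - \hat{\theta})^{\mathsf{T}} F \, (\theta - \hat{\theta}),
\qquad F_{ij} = \tfrac{1}{2}\,\partial_i \partial_j \chi^2
\quad \text{(Fisher matrix)}

% so the evidence is a Gaussian integral with an exact closed form:
E = \int d^M\theta \; \pi(\theta)\, e^{-\chi^2(\theta)/2}
  = \frac{e^{-\chi^2_{\min}/2}}{V_\pi} \, \frac{(2\pi)^{M/2}}{\sqrt{\det F}}

% and the Bayes factor between two such models is the evidence ratio:
B_{12} = E_1 / E_2
```

The Jeffreys' scale then interprets ln B₁₂ thresholds as strength of evidence; the paper's point is that these thresholds behave differently depending on priors, parameter bases, and nesting.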

  7. Collection Fusion Using Bayesian Estimation of a Linear Regression Model in Image Databases on the Web.

    ERIC Educational Resources Information Center

    Kim, Deok-Hwan; Chung, Chin-Wan

    2003-01-01

    Discusses the collection fusion problem of image databases, concerned with retrieving relevant images by content based retrieval from image databases distributed on the Web. Focuses on a metaserver which selects image databases supporting similarity measures and proposes a new algorithm which exploits a probabilistic technique using Bayesian…

  8. Bayesian whole-genome prediction and genome-wide association analysis with missing genotypes using variable selection

    USDA-ARS?s Scientific Manuscript database

    Single-step Genomic Best Linear Unbiased Predictor (ssGBLUP) has become increasingly popular for whole-genome prediction (WGP) modeling as it utilizes any available pedigree and phenotypes on both genotyped and non-genotyped individuals. The WGP accuracy of ssGBLUP has been demonstrated to be greate...

  9. Bayesian GGE biplot models applied to maize multi-environments trials.

    PubMed

    de Oliveira, L A; da Silva, C P; Nuvunga, J J; da Silva, A Q; Balestre, M

    2016-06-17

    The additive main effects and multiplicative interaction (AMMI) and the genotype main effects and genotype x environment interaction (GGE) models stand out among the linear-bilinear models used in genotype x environment interaction studies. Despite the advantages of their use to describe genotype x environment (AMMI) or genotype and genotype x environment (GGE) interactions, these methods have known limitations that are inherent to fixed effects models, including difficulty in treating variance heterogeneity and missing data. Traditional biplots include no measure of uncertainty regarding the principal components. The present study aimed to apply the Bayesian approach to GGE biplot models and assess the implications for selecting stable and adapted genotypes. Our results demonstrated that the Bayesian approach applied to GGE models with non-informative priors was consistent with the traditional GGE biplot analysis, although the credible region incorporated into the biplot enabled distinguishing, based on probability, the performance of genotypes, and their relationships with the environments in the biplot. Those regions also enabled the identification of groups of genotypes and environments with similar effects in terms of adaptability and stability. The relative position of genotypes and environments in biplots is highly affected by the experimental accuracy. Thus, incorporation of uncertainty in biplots is a key tool for breeders to make decisions regarding stability selection and adaptability and the definition of mega-environments.

  10. SOMBI: Bayesian identification of parameter relations in unstructured cosmological data

    NASA Astrophysics Data System (ADS)

    Frank, Philipp; Jasche, Jens; Enßlin, Torsten A.

    2016-11-01

This work describes the implementation and application of a correlation determination method based on self-organizing maps and Bayesian inference (SOMBI). SOMBI aims to automatically identify relations between different observed parameters in unstructured cosmological or astrophysical surveys by identifying data clusters in high-dimensional datasets via the self-organizing map neural network algorithm. Parameter relations are then revealed by means of Bayesian inference within the respective identified data clusters. Specifically, such relations are assumed to be parametrized as a polynomial of unknown order. The Bayesian approach results in a posterior probability distribution function for the respective polynomial coefficients. To decide which polynomial order suffices to describe correlation structures in data, we incorporate a method for model selection, the Bayesian information criterion, into the analysis. The performance of the SOMBI algorithm is tested with mock data. As an illustration we also provide applications of our method to cosmological data. In particular, we present results of a correlation analysis between galaxy and active galactic nucleus (AGN) properties provided by the SDSS catalog and the cosmic large-scale structure (LSS). The results indicate that the combined galaxy and LSS dataset is indeed clustered into several sub-samples of data with different average properties (for example, different stellar masses or web-type classifications). The majority of data clusters appear to have a similar correlation structure between galaxy properties and the LSS. In particular, we reveal a positive and linear dependence of the stellar mass, absolute magnitude, and color of a galaxy on the corresponding cosmic density field. A remaining subset of the data shows inverted correlations, which might be an artifact of non-linear redshift distortions.
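The order-selection step can be sketched as fitting polynomials of increasing order and scoring each with the Bayesian information criterion, BIC = n·ln(RSS/n) + k·ln(n). The data and noise level below are invented for illustration; this is not the SOMBI code.

```python
import math
import random

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def polyfit_rss(xs, ys, order):
    # Least-squares polynomial fit via the normal equations (X^T X) c = X^T y.
    cols = order + 1
    X = [[x ** j for j in range(cols)] for x in xs]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(cols)]
           for i in range(cols)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(cols)]
    coef = solve(XtX, Xty)
    return sum((y - sum(c * row[j] for j, c in enumerate(coef))) ** 2
               for row, y in zip(X, ys))

def bic(rss, n, k):
    # Bayesian information criterion for Gaussian residuals; lower is better.
    return n * math.log(rss / n) + k * math.log(n)

rng = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [1.0 + 2.0 * x ** 2 + rng.gauss(0.0, 0.5) for x in xs]  # true order is 2
scores = {k: bic(polyfit_rss(xs, ys, k), len(xs), k + 1) for k in range(1, 5)}
best = min(scores, key=scores.get)
```

The ln(n) complexity penalty is what lets the criterion stop raising the order once extra coefficients only chase noise.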

  11. Bayesian estimation of self-similarity exponent

    NASA Astrophysics Data System (ADS)

    Makarava, Natallia; Benmehdi, Sabah; Holschneider, Matthias

    2011-08-01

In this study we propose a Bayesian approach to the estimation of the Hurst exponent in terms of linear mixed models. Our method is applicable even for unevenly sampled signals and signals with gaps. We test the method using artificial fractional Brownian motion of different lengths and compare it with the detrended fluctuation analysis technique. The estimation of the Hurst exponent of a Rosenblatt process is shown as an example of an H-self-similar process with non-Gaussian dimensional distribution. Additionally, we perform an analysis of real data, the Dow Jones Industrial Average closing values, and analyze the temporal variation of its Hurst exponent.
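A crude, non-Bayesian baseline illustrates the quantity being estimated: for a self-similar process, Var[X(t+τ) − X(t)] ∝ τ^(2H), so H can be read off a log-log slope. Ordinary Brownian motion (H = 0.5) serves as a check; this is a sketch, not the paper's linear-mixed-model estimator.

```python
import math
import random

def hurst_msd(path, lags=(1, 2, 4, 8, 16, 32)):
    # Estimate H from the scaling Var[X(t+tau) - X(t)] ~ tau^(2H).
    logs = []
    for tau in lags:
        msd = sum((path[i + tau] - path[i]) ** 2
                  for i in range(len(path) - tau)) / (len(path) - tau)
        logs.append((math.log(tau), math.log(msd)))
    # Ordinary least-squares slope in log-log space; H is half the slope.
    n = len(logs)
    mx = sum(x for x, _ in logs) / n
    my = sum(y for _, y in logs) / n
    slope = (sum((x - mx) * (y - my) for x, y in logs)
             / sum((x - mx) ** 2 for x, _ in logs))
    return slope / 2.0

rng = random.Random(42)
bm = [0.0]
for _ in range(20000):
    bm.append(bm[-1] + rng.gauss(0.0, 1.0))  # ordinary Brownian motion, H = 0.5
h = hurst_msd(bm)
```

The Bayesian formulation in the paper replaces this point estimate with a posterior over H, which is what makes uneven sampling and gaps tractable.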

  12. Partially linear mixed-effects joint models for skewed and missing longitudinal competing risks outcomes.

    PubMed

    Lu, Tao; Lu, Minggen; Wang, Min; Zhang, Jun; Dong, Guang-Hui; Xu, Yong

    2017-12-18

Longitudinal competing risks data frequently arise in clinical studies. Skewness and missingness are commonly observed for these data in practice. However, most joint models do not account for these data features. In this article, we propose partially linear mixed-effects joint models to analyze skewed longitudinal competing risks data with missingness. In particular, to account for skewness, we replace the commonly assumed symmetric distributions with asymmetric distributions for model errors. To deal with missingness, we employ an informative missing data model. The joint models couple the partially linear mixed-effects model for the longitudinal process, the cause-specific proportional hazards model for the competing risks process, and the missing data process. To estimate the parameters in the joint models, we propose a fully Bayesian approach based on the joint likelihood. To illustrate the proposed model and method, we apply them to an AIDS clinical study. Some interesting findings are reported. We also conduct simulation studies to validate the proposed method.

  13. Sequential Designs Based on Bayesian Uncertainty Quantification in Sparse Representation Surrogate Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.

A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction of sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. Numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.
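The normal-prior-on-linear-coefficients step has a conjugate closed form in the simplest setting, which the sketch below exploits: exact Gaussian posterior sampling stands in for MCMC, with two coefficients and invented data. A 95% credible interval is then read off the sorted draws.

```python
import math
import random

def posterior(xs, ys, noise_var=1.0, prior_var=100.0):
    # Conjugate posterior for y = w0 + w1*x + eps with prior w ~ N(0, prior_var*I):
    #   S^-1 = I/prior_var + X^T X/noise_var,   m = S X^T y / noise_var
    n = len(xs)
    p00 = 1.0 / prior_var + n / noise_var
    p01 = sum(xs) / noise_var
    p11 = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    det = p00 * p11 - p01 * p01
    S = [[p11 / det, -p01 / det], [-p01 / det, p00 / det]]  # 2x2 inverse
    t0 = sum(ys) / noise_var
    t1 = sum(x * y for x, y in zip(xs, ys)) / noise_var
    m = [S[0][0] * t0 + S[0][1] * t1, S[1][0] * t0 + S[1][1] * t1]
    return m, S

def slope_interval(m, S, n_draws=4000, seed=7):
    # Sample w from N(m, S) via a 2x2 Cholesky factor; 95% interval for w1.
    rng = random.Random(seed)
    l00 = math.sqrt(S[0][0])
    l10 = S[1][0] / l00
    l11 = math.sqrt(S[1][1] - l10 * l10)
    draws = sorted(m[1] + l10 * rng.gauss(0, 1) + l11 * rng.gauss(0, 1)
                   for _ in range(n_draws))
    return draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws)]

rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]  # true slope is 2
m, S = posterior(xs, ys)
lo95, hi95 = slope_interval(m, S)
```

In the overcomplete-basis setting of the paper the coefficient space is far larger, so MCMC replaces the closed-form sampling used here, but the credible-interval readout is the same.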

  14. Sequential Designs Based on Bayesian Uncertainty Quantification in Sparse Representation Surrogate Modeling

    DOE PAGES

    Chen, Ray -Bing; Wang, Weichung; Jeff Wu, C. F.

    2017-04-12

A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction of sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. Numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.

  15. Economic policy optimization based on both one stochastic model and the parametric control theory

    NASA Astrophysics Data System (ADS)

    Ashimov, Abdykappar; Borovskiy, Yuriy; Onalbekov, Mukhit

    2016-06-01

A nonlinear dynamic stochastic general equilibrium model with financial frictions is developed to describe two interacting national economies in the environment of the rest of the world. Parameters of the nonlinear model are estimated by the Bayesian approach based on its log-linearization. The nonlinear model is verified by retroprognosis, estimation of stability indicators of mappings specified by the model, and estimation of the degree of coincidence between the effects of internal and external shocks on macroeconomic indicators computed from the estimated nonlinear model and from its log-linearization. On the basis of the nonlinear model, the parametric control problems of economic growth and volatility of macroeconomic indicators of Kazakhstan are formulated and solved for two exchange rate regimes (free floating and managed floating exchange rates).

  16. Discriminative Bayesian Dictionary Learning for Classification.

    PubMed

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta Process. It also computes sets of Bernoulli distributions that associate class labels with the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and object and scene-category classification, using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  17. Context Effects in Multi-Alternative Decision Making: Empirical Data and a Bayesian Model

    ERIC Educational Resources Information Center

    Hawkins, Guy; Brown, Scott D.; Steyvers, Mark; Wagenmakers, Eric-Jan

    2012-01-01

    For decisions between many alternatives, the benchmark result is Hick's Law: that response time increases log-linearly with the number of choice alternatives. Even when Hick's Law is observed for response times, divergent results have been observed for error rates--sometimes error rates increase with the number of choice alternatives, and…

  18. A full-spectral Bayesian reconstruction approach based on the material decomposition model applied in dual-energy computed tomography.

    PubMed

    Cai, C; Rodet, T; Legoupil, S; Mohammad-Djafari, A

    2013-11-01

Dual-energy computed tomography (DECT) makes it possible to obtain two fractions of basis materials without segmentation: one is the soft-tissue-equivalent water fraction and the other is the hard-matter-equivalent bone fraction. Practical DECT measurements are usually obtained with polychromatic x-ray beams. Existing reconstruction approaches based on linear forward models that do not account for the beam polychromaticity fail to estimate the correct decomposition fractions and result in beam-hardening artifacts (BHA). The existing BHA correction approaches either need to refer to calibration measurements or suffer from the noise amplification caused by the negative-log preprocessing and the ill-conditioned water and bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on nonlinear forward models that account for the beam polychromaticity show great potential for giving accurate fraction images. This work proposes a full-spectral Bayesian reconstruction approach which allows the reconstruction of high-quality fraction images from ordinary polychromatic measurements. This approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking the negative-log. Referring to Bayesian inference, the decomposition fractions and observation variance are estimated by using the joint maximum a posteriori (MAP) estimation method. Subject to an adaptive prior model assigned to the variance, the joint estimation problem is simplified into a single estimation problem: it transforms the joint MAP estimation problem into a minimization problem with a nonquadratic cost function. To solve it, the use of a monotone conjugate gradient algorithm with suboptimal descent steps is proposed. The performance of the proposed approach is analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and materials. It does, however, require accurate spectrum information about the source-detector system; when dealing with experimental data, the spectrum can be predicted by a Monte Carlo simulator. For materials between water and bone, separation errors of less than 5% are observed on the estimated decomposition fractions. The proposed approach is a statistical reconstruction approach based on a nonlinear forward model that accounts for the full beam polychromaticity and is applied directly to the projections without taking the negative-log. Compared to approaches based on linear forward models and to the BHA correction approaches, it has advantages in noise robustness and reconstruction accuracy.

  19. Perception of the dynamic visual vertical during sinusoidal linear motion.

    PubMed

    Pomante, A; Selen, L P J; Medendorp, W P

    2017-10-01

The vestibular system provides information for spatial orientation. However, this information is ambiguous: because the otoliths sense the gravitoinertial force, they cannot distinguish gravitational and inertial components. As a consequence, prolonged linear acceleration of the head can be interpreted as tilt, referred to as the somatogravic effect. Previous modeling work suggests that the brain disambiguates the otolith signal according to the rules of Bayesian inference, combining noisy canal cues with the a priori assumption that prolonged linear accelerations are unlikely. Within this modeling framework the noise of the vestibular signals affects the dynamic characteristics of the tilt percept during linear whole-body motion. To test this prediction, we devised a novel paradigm to psychometrically characterize the dynamic visual vertical (as a proxy for the tilt percept) during passive sinusoidal linear motion along the interaural axis (0.33 Hz motion frequency, 1.75 m/s² peak acceleration, 80 cm displacement). While subjects (n = 10) kept fixation on a central body-fixed light, a line was briefly flashed (5 ms) at different phases of the motion, the orientation of which had to be judged relative to gravity. Consistent with the model's prediction, subjects showed a phase-dependent modulation of the dynamic visual vertical, with a subject-specific phase shift with respect to the imposed acceleration signal. The magnitude of this modulation was smaller than predicted, suggesting a contribution of nonvestibular signals to the dynamic visual vertical. Despite their dampening effect, our findings may point to a link between the noise components in the vestibular system and the characteristics of the dynamic visual vertical. NEW & NOTEWORTHY A fundamental question in neuroscience is how the brain processes vestibular signals to infer the orientation of the body and objects in space. We show that, under sinusoidal linear motion, systematic error patterns appear in the disambiguation of linear acceleration and spatial orientation. We discuss the dynamics of these illusory percepts in terms of a dynamic Bayesian model that combines uncertainty in the vestibular signals with priors based on the natural statistics of head motion. Copyright © 2017 the American Physiological Society.

  20. Sampling-free Bayesian inversion with adaptive hierarchical tensor representations

    NASA Astrophysics Data System (ADS)

    Eigel, Martin; Marschall, Manuel; Schneider, Reinhold

    2018-03-01

A sampling-free approach to Bayesian inversion with an explicit polynomial representation of the parameter densities is developed, based on an affine-parametric representation of a linear forward model. This becomes feasible due to the complete treatment in function spaces, which requires an efficient model reduction technique for numerical computations. The advocated perspective yields the crucial benefit that error bounds can be derived for all occurring approximations, leading to provable convergence subject to the discretization parameters. Moreover, it enables a fully adaptive a posteriori control with automatic problem-dependent adjustments of the employed discretizations. The method is discussed in the context of modern hierarchical tensor representations, which are used for the evaluation of a random PDE (the forward model) and the subsequent high-dimensional quadrature of the log-likelihood, alleviating the ‘curse of dimensionality’. Numerical experiments demonstrate the performance and confirm the theoretical results.

  1. High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

    PubMed

    Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

    2018-05-30

NeuroQuant® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
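The transformation idea can be sketched with ordinary least squares on synthetic volumes (all numbers below are invented; the study fitted both traditional and Bayesian regressions to real NQ/FS measurements): map one method's output through a fitted line and compare the standardized mean difference (Cohen's d) before and after.

```python
import math
import random

def fit_line(xs, ys):
    # OLS slope/intercept mapping one method's volumes onto the other's scale.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def cohens_d(a, b):
    # Standardized mean difference between paired measurement sets.
    n = len(a)
    diffs = [x - y for x, y in zip(a, b)]
    md = sum(diffs) / n
    sd = math.sqrt(sum((d - md) ** 2 for d in diffs) / (n - 1))
    return abs(md) / sd

rng = random.Random(3)
fs = [50 + 10 * rng.random() for _ in range(60)]     # hypothetical "FS" volumes
nq = [0.9 * v + 5 + rng.gauss(0.0, 0.5) for v in fs] # hypothetical "NQ" analogue
slope, intercept = fit_line(fs, nq)
fs_mapped = [slope * v + intercept for v in fs]      # FS-to-NQ transformation
d_before = cohens_d(fs, nq)
d_after = cohens_d(fs_mapped, nq)                    # effect size shrinks
```

A Bayesian regression would add shrinkage toward the prior, which the study found made the residual effect sizes trivially small rather than merely small to moderate.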

  2. The Type Ia Supernova Color-Magnitude Relation and Host Galaxy Dust: A Simple Hierarchical Bayesian Model

    NASA Astrophysics Data System (ADS)

    Mandel, Kaisey S.; Scolnic, Daniel M.; Shariff, Hikmatali; Foley, Ryan J.; Kirshner, Robert P.

    2017-06-01

Conventional Type Ia supernova (SN Ia) cosmology analyses currently use a simplistic linear regression of magnitude versus color and light curve shape, which does not model intrinsic SN Ia variations and host galaxy dust as physically distinct effects, resulting in low color-magnitude slopes. We construct a probabilistic generative model for the dusty distribution of extinguished absolute magnitudes and apparent colors as the convolution of an intrinsic SN Ia color-magnitude distribution and a host galaxy dust reddening-extinction distribution. If the intrinsic color-magnitude (M_B versus B − V) slope β_int differs from the host galaxy dust law R_B, this convolution results in a specific curve of mean extinguished absolute magnitude versus apparent color. The derivative of this curve smoothly transitions from β_int in the blue tail to R_B in the red tail of the apparent color distribution. The conventional linear fit approximates this effective curve near the average apparent color, resulting in an apparent slope β_app between β_int and R_B. We incorporate these effects into a hierarchical Bayesian statistical model for SN Ia light curve measurements, and analyze a data set of SALT2 optical light curve fits of 248 nearby SNe Ia at z < 0.10. The conventional linear fit gives β_app ≈ 3. Our model finds β_int = 2.3 ± 0.3 and a distinct dust law of R_B = 3.8 ± 0.3, consistent with the average for Milky Way dust, while correcting a systematic distance bias of ~0.10 mag in the tails of the apparent color distribution. Finally, we extend our model to examine the SN Ia luminosity-host mass dependence in terms of intrinsic and dust components.

  3. A Bayesian approach to identifying structural nonlinearity using free-decay response: Application to damage detection in composites

    USGS Publications Warehouse

    Nichols, J.M.; Link, W.A.; Murphy, K.D.; Olson, C.C.

    2010-01-01

    This work discusses a Bayesian approach to approximating the distribution of parameters governing nonlinear structural systems. Specifically, we use a Markov Chain Monte Carlo method for sampling the posterior parameter distributions thus producing both point and interval estimates for parameters. The method is first used to identify both linear and nonlinear parameters in a multiple degree-of-freedom structural systems using free-decay vibrations. The approach is then applied to the problem of identifying the location, size, and depth of delamination in a model composite beam. The influence of additive Gaussian noise on the response data is explored with respect to the quality of the resulting parameter estimates.
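The MCMC step can be sketched with a scalar random-walk Metropolis sampler on a toy nonlinear model y(t) = exp(−k·t) + noise. The decay model, data, and tuning constants are invented for illustration; the paper's application is to structural vibration parameters and delamination geometry.

```python
import math
import random

def log_post(k, ts, ys, noise_sd=0.05):
    # Gaussian log-likelihood for y(t) = exp(-k*t) + noise; flat prior on k > 0.
    if k <= 0:
        return -math.inf
    return -sum((y - math.exp(-k * t)) ** 2
                for t, y in zip(ts, ys)) / (2 * noise_sd ** 2)

def metropolis(ts, ys, n_iter=4000, step=0.05, seed=11):
    # Random-walk Metropolis: propose, accept with prob min(1, posterior ratio).
    rng = random.Random(seed)
    k = 0.5                                   # arbitrary starting point
    lp = log_post(k, ts, ys)
    samples = []
    for _ in range(n_iter):
        prop = k + rng.gauss(0.0, step)       # symmetric proposal
        lp_prop = log_post(prop, ts, ys)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            k, lp = prop, lp_prop             # accept
        samples.append(k)
    return samples[n_iter // 2:]              # discard first half as burn-in

rng = random.Random(2)
ts = [i / 10 for i in range(50)]
ys = [math.exp(-1.0 * t) + rng.gauss(0.0, 0.05) for t in ts]  # true k = 1
post = metropolis(ts, ys)
k_mean = sum(post) / len(post)
```

The retained samples give both a point estimate (the posterior mean) and interval estimates (posterior quantiles), which is exactly the pairing the abstract describes.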

  4. Clinical Outcome Prediction in Aneurysmal Subarachnoid Hemorrhage Using Bayesian Neural Networks with Fuzzy Logic Inferences

    PubMed Central

    Lo, Benjamin W. Y.; Macdonald, R. Loch; Baker, Andrew; Levine, Mitchell A. H.

    2013-01-01

Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odds ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication. PMID:23690884

  5. Resolution analysis of marine seismic full waveform data by Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Ray, A.; Sekar, A.; Hoversten, G. M.; Albertin, U.

    2015-12-01

The Bayesian posterior density function (PDF) of earth models that fit full waveform seismic data conveys information on the uncertainty with which the elastic model parameters are resolved. In this work, we apply the trans-dimensional reversible jump Markov chain Monte Carlo (RJ-MCMC) method for the 1D inversion of noisy synthetic full-waveform seismic data in the frequency-wavenumber domain. While seismic full waveform inversion (FWI) is a powerful method for characterizing subsurface elastic parameters, the uncertainty in the inverted models has remained poorly known, if known at all, and is highly dependent on the initial model. The Bayesian method we use is trans-dimensional in that the number of model layers is not fixed, and flexible in that the layer boundaries are free to move around. The resulting parameterization does not require regularization to stabilize the inversion. Depth resolution is traded off against the number of layers, providing an estimate of uncertainty in the elastic parameters (compressional and shear velocities Vp and Vs, as well as density) with depth. We find that in the absence of additional constraints, Bayesian inversion can result in a wide range of posterior PDFs on Vp, Vs, and density. Depending on the particular data and target geometry, these PDFs range from being clustered around the true model to containing little resolution of any features other than those in the near surface. We present results for a suite of different frequencies and offset ranges, examining the differences in the posterior model densities thus derived. Though these results are for a 1D earth, they are applicable to areas with simple, layered geology and provide valuable insight into the resolving capabilities of FWI, as well as highlighting the challenges in solving a highly non-linear problem. The RJ-MCMC method also presents a tantalizing possibility for extension to 2D and 3D Bayesian inversion of full waveform seismic data in the future, as it objectively tackles the problem of model selection (i.e., the number of layers or cells for parameterization), which could ease the computational burden of evaluating forward models with many parameters.

  6. Moments and Root-Mean-Square Error of the Bayesian MMSE Estimator of Classification Error in the Gaussian Model.

    PubMed

    Zollanvari, Amin; Dougherty, Edward R

    2014-06-01

The most important aspect of any classifier is its error rate, because this quantifies its predictive capacity. Thus, the accuracy of error estimation is critical. Error estimation is problematic in small-sample classifier design because the error must be estimated using the same data from which the classifier has been designed. Use of prior knowledge, in the form of a prior distribution on an uncertainty class of feature-label distributions to which the true, but unknown, feature-label distribution belongs, can facilitate accurate error estimation (in the mean-square sense) in circumstances where accurate completely model-free error estimation is impossible. This paper provides analytic asymptotically exact finite-sample approximations for various performance metrics of the resulting Bayesian Minimum Mean-Square-Error (MMSE) error estimator in the case of linear discriminant analysis (LDA) in the multivariate Gaussian model. These performance metrics include the first, second, and cross moments of the Bayesian MMSE error estimator with the true error of LDA, and therefore, the Root-Mean-Square (RMS) error of the estimator. We lay down the theoretical groundwork for Kolmogorov double-asymptotics in a Bayesian setting, which enables us to derive asymptotic expressions of the desired performance metrics. From these we produce analytic finite-sample approximations and demonstrate their accuracy via numerical examples. Various examples illustrate the behavior of these approximations and their use in determining the necessary sample size to achieve a desired RMS. The Supplementary Material contains derivations for some equations and added figures.

  7. Estimating mono- and bi-phasic regression parameters using a mixture piecewise linear Bayesian hierarchical model

    PubMed Central

    Zhao, Rui; Catalano, Paul; DeGruttola, Victor G.; Michor, Franziska

    2017-01-01

    The dynamics of tumor burden, secreted proteins, or other biomarkers over time are often used to evaluate the effectiveness of therapy and to predict outcomes for patients. Many methods have been proposed to investigate longitudinal trends to better characterize patients and to understand disease progression. However, most approaches assume a homogeneous patient population and a uniform response trajectory over time and across patients. Here, we present a mixture piecewise linear Bayesian hierarchical model, which takes into account both population heterogeneity and nonlinear relationships between biomarkers and time. Simulation results show that our method was able to classify subjects according to their patterns of treatment response with greater than 80% accuracy in the three scenarios tested. We then applied our model to a large randomized controlled phase III clinical trial of multiple myeloma patients. Analysis results suggest that the longitudinal tumor burden trajectories in multiple myeloma patients are heterogeneous and nonlinear, even among patients assigned to the same treatment cohort. In addition, between cohorts, there are distinct differences in the regression parameters and in the distributions among categories in the mixture. These results imply that longitudinal data from clinical trials may harbor unobserved subgroups and nonlinear relationships; accounting for both may be important for analyzing longitudinal data. PMID:28723910
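
    The piecewise linear mean trajectory described in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name and parameterization are hypothetical, and the hierarchical mixture layer (class membership and priors on the parameters) is omitted.

```python
def piecewise_linear(t, intercept, slope1, slope2, changepoint):
    """Bi-phasic mean trajectory: one slope before the changepoint, another after.

    The two segments are constrained to meet at the changepoint, so the
    trajectory is continuous; a mono-phasic patient is the special case
    slope1 == slope2.
    """
    if t <= changepoint:
        return intercept + slope1 * t
    return intercept + slope1 * changepoint + slope2 * (t - changepoint)
```

    In the mixture model of the paper, each patient would additionally carry a latent class indicator (mono- vs bi-phasic) and patient-level parameters drawn from population-level priors; only the deterministic mean is shown here.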

  8. The Bayesian group lasso for confounded spatial data

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.

    2017-01-01

    Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of the regression coefficients, and in higher and less variable predictive accuracy on out-of-sample data, when compared to the standard SGLMM.

  9. A Bayesian Approach to Model Selection in Hierarchical Mixtures-of-Experts Architectures.

    PubMed

    Tanner, Martin A.; Peng, Fengchun; Jacobs, Robert A.

    1997-03-01

    There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
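
    The "divide-and-conquer" strategy of hierarchical mixtures-of-experts rests on softmax gating networks that softly partition the input space among experts. A minimal sketch of one gating node, under illustrative assumptions (a scalar input and linear gating scores; all names are hypothetical):

```python
import math

def gate_weights(x, gate_params):
    """Softmax gating: expert i gets weight proportional to exp(b_i + w_i * x),
    so experts softly partition the input space rather than splitting it hard."""
    scores = [b + w * x for (b, w) in gate_params]
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Two hypothetical experts; the second has a steeper gating slope,
# so it receives more weight for positive inputs.
weights = gate_weights(1.0, [(0.0, 1.0), (0.0, 2.0)])
```

    Pruning an unused component, as described in the abstract, amounts to removing an expert whose gating weight stays near zero across the training inputs.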

  10. Physics of ultrasonic wave propagation in bone and heart characterized using Bayesian parameter estimation

    NASA Astrophysics Data System (ADS)

    Anderson, Christian Carl

    This dissertation explores the physics underlying the propagation of ultrasonic waves in bone and in heart tissue through the use of Bayesian probability theory. Quantitative ultrasound is a noninvasive modality used for clinical detection, characterization, and evaluation of bone quality and cardiovascular disease. Approaches that extend the state of knowledge of the physics underpinning the interaction of ultrasound with inherently inhomogeneous and anisotropic tissue have the potential to enhance its clinical utility. Simulations of fast and slow compressional wave propagation in cancellous bone were carried out to demonstrate the plausibility of a proposed explanation for the widely reported anomalous negative dispersion in cancellous bone. The results showed that negative dispersion could arise from analysis that proceeded under the assumption that the data consist of only a single ultrasonic wave, when in fact two overlapping and interfering waves are present. The confounding effect of overlapping fast and slow waves was addressed by applying Bayesian parameter estimation to simulated data, to experimental data acquired on bone-mimicking phantoms, and to data acquired in vitro on cancellous bone. The Bayesian approach successfully estimated the properties of the individual fast and slow waves even when they strongly overlapped in the acquired data. The Bayesian parameter estimation technique was further applied to an investigation of the anisotropy of ultrasonic properties in cancellous bone. The degree to which fast and slow waves overlap is partially determined by the angle of insonation of ultrasound relative to the predominant direction of trabecular orientation. In the past, studies of anisotropy have been limited by interference between fast and slow waves over a portion of the range of insonation angles. 
Bayesian analysis estimated attenuation, velocity, and amplitude parameters over the entire range of insonation angles, allowing a more complete characterization of anisotropy. A novel piecewise linear model for the cyclic variation of ultrasonic backscatter from myocardium was proposed. Models of cyclic variation for 100 type 2 diabetes patients and 43 normal control subjects were constructed using Bayesian parameter estimation. Parameters determined from the model, specifically rise time and slew rate, were found to be more reliable in differentiating between subject groups than the previously employed magnitude parameter.

  11. Bayesian inference in an item response theory model with a generalized student t link function

    NASA Astrophysics Data System (ADS)

    Azevedo, Caio L. N.; Migon, Helio S.

    2012-10-01

    In this paper we introduce a new item response theory (IRT) model with a generalized Student t link function with unknown degrees of freedom (df), named the generalized t-link (GtL) IRT model. In this model we consider only the difficulty parameter in the item response function. The GtL is an alternative to the two-parameter logit and probit models, since the degrees of freedom play a role similar to that of the discrimination parameter. However, the behavior of the GtL curves differs from that of the two-parameter models and the usual Student t link, since in the GtL the curves obtained for different df can cross the probit curve at more than one latent trait level. The GtL model has properties similar to those of generalized linear mixed models, such as the existence of sufficient statistics and easy parameter interpretation, and many techniques for parameter estimation, model fit assessment, and residual analysis developed for those models can also be used for the GtL model. We develop fully Bayesian estimation and model fit assessment tools through a Metropolis-Hastings step within a Gibbs sampling algorithm, and we consider a sensitivity analysis for the prior on the degrees of freedom. The simulation study indicates that the algorithm recovers all parameters properly. In addition, some Bayesian model fit assessment tools are considered. Finally, a real data set is analyzed using our approach and other usual models. The results indicate that our model fits the data better than the two-parameter models.

  12. Spatial variability of the effect of air pollution on term birth weight: evaluating influential factors using Bayesian hierarchical models.

    PubMed

    Li, Lianfa; Laurent, Olivier; Wu, Jun

    2016-02-05

    Epidemiological studies suggest that air pollution is adversely associated with pregnancy outcomes. Such associations may be modified by spatially-varying factors including socio-demographic characteristics, land-use patterns, and unaccounted exposures. Yet, few studies have systematically investigated the impact of these factors on the spatial variability of air pollution's effects. This study aimed to examine the spatial variability of the effects of air pollution on term birth weight across Census tracts and the influence of tract-level factors on such variability. We obtained over 900,000 birth records from 2001 to 2008 in Los Angeles County, California, USA. Air pollution exposure was modeled at the individual level for nitrogen dioxide (NO2) and nitrogen oxides (NOx) using spatiotemporal models. Two-stage Bayesian hierarchical non-linear models were developed to (1) quantify the associations between air pollution exposure and term birth weight within each tract; and (2) examine the socio-demographic, land-use, and exposure-related factors contributing to the between-tract variability of the associations between air pollution and term birth weight. Higher air pollution exposure was associated with lower term birth weight (average posterior effects: -14.7 (95 % CI: -19.8, -9.7) g per 10 ppb increment in NO2 and -6.9 (95 % CI: -12.9, -0.9) g per 10 ppb increment in NOx). The variation of the association across Census tracts was significantly influenced by the tract-level socio-demographic, exposure-related, and land-use factors. Our models captured the complex non-linear relationship between these factors and the associations between air pollution and term birth weight: we observed thresholds beyond which the influence of the tract-level factors was markedly exacerbated or attenuated. 
Exacerbating factors might reflect additional exposure to environmental insults or lower socio-economic status with higher vulnerability, whereas attenuating factors might indicate reduced exposure or higher socioeconomic status with lower vulnerability. Our Bayesian models effectively combined a priori knowledge with training data to infer the posterior association of air pollution with term birth weight and to evaluate the influence of the tract-level factors on spatial variability of such association. This study contributes new findings about non-linear influences of socio-demographic factors, land-use patterns, and unaccounted exposures on spatial variability of the effects of air pollution.

  13. Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models

    NASA Astrophysics Data System (ADS)

    Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael

    2016-06-01

    We address the sparse approximation problem in the case where the data are approximated by a linear combination of a small number of elementary signals, each of which depends non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov chain Monte Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build a hybrid Hastings-within-Gibbs algorithm in order to account for the non-linear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase in the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
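
    The Bernoulli-Gaussian prior that expresses sparsity in this model can be sketched directly. A minimal illustration (the function name and parameterization are hypothetical): each coefficient is active with probability lam and, when active, Gaussian; otherwise it is exactly zero.

```python
import random

def sample_bernoulli_gaussian(n, lam, sigma, rng):
    """Draw n coefficients from a Bernoulli-Gaussian prior.

    Each coefficient is nonzero with probability lam and then drawn from
    N(0, sigma^2); otherwise it is exactly zero, which is how the model
    expresses sparsity explicitly.
    """
    return [rng.gauss(0.0, sigma) if rng.random() < lam else 0.0
            for _ in range(n)]

rng = random.Random(42)
x = sample_bernoulli_gaussian(10_000, 0.1, 1.0, rng)
sparsity = sum(1 for v in x if v != 0.0) / len(x)   # should be near lam
```

    In the paper's sampler, the indicator/amplitude pairs are updated (partially marginalized) within a Gibbs scheme, with an extra Hastings step for the non-linear parameters; only the prior draw is shown here.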

  14. Efficient Bayesian mixed model analysis increases association power in large cohorts

    PubMed Central

    Loh, Po-Ru; Tucker, George; Bulik-Sullivan, Brendan K; Vilhjálmsson, Bjarni J; Finucane, Hilary K; Salem, Rany M; Chasman, Daniel I; Ridker, Paul M; Neale, Benjamin M; Berger, Bonnie; Patterson, Nick; Price, Alkes L

    2014-01-01

    Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require O(MN²) time (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts. PMID:25642633

  15. Smile detectors correlation

    NASA Astrophysics Data System (ADS)

    Yuksel, Kivanc; Chang, Xin; Skarbek, Władysław

    2017-08-01

    A novel smile recognition algorithm is presented, based on the extraction of 68 facial salient points (fp68) using an ensemble of regression trees. The smile detector exploits a linear Support Vector Machine (SVM) model, trained with a few hundred exemplar images by the SVM algorithm working in a 136-dimensional space. It is shown by strict statistical data analysis that such a geometric detector depends strongly on the geometry of the mouth opening area, measured by triangulation of the outer lip contour. To this end, two Bayesian detectors were developed and compared with the SVM detector. The first uses the mouth area in the 2D image, while the second refers to the mouth area in a 3D animated face model. The 3D modeling is based on the Candide-3 model and is performed in real time along with the three smile detectors and statistics estimators. The mouth-area Bayesian detectors exhibit high correlation with the fp68/SVM detector, in the range [0.8, 1.0], depending mainly on lighting conditions and individual features, with an advantage for the 3D technique, especially in hard lighting conditions.

  16. Bayesian multivariate hierarchical transformation models for ROC analysis.

    PubMed

    O'Malley, A James; Zou, Kelly H

    2006-02-15

    A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.

  17. Bayesian multivariate hierarchical transformation models for ROC analysis

    PubMed Central

    O'Malley, A. James; Zou, Kelly H.

    2006-01-01

    A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836

  18. Factors affecting GEBV accuracy with single-step Bayesian models.

    PubMed

    Zhou, Lei; Mrode, Raphael; Zhang, Shengli; Zhang, Qin; Li, Bugao; Liu, Jian-Feng

    2018-01-01

    A single-step approach to obtain genomic prediction was first proposed in 2009. Many studies have investigated the components of GEBV accuracy in genomic selection. However, it is still unclear how the population structure and the relationships between training and validation populations influence GEBV accuracy in single-step analysis. Here, we explored the components of GEBV accuracy in single-step Bayesian analysis with a simulation study. Three scenarios with various numbers of QTL (5, 50, and 500) were simulated. Three models were implemented to analyze the simulated data: single-step genomic best linear unbiased prediction (SSGBLUP), single-step BayesA (SS-BayesA), and single-step BayesB (SS-BayesB). According to our results, GEBV accuracy was influenced by the relationships between the training and validation populations more strongly for ungenotyped animals than for genotyped animals. SS-BayesA/BayesB showed an obvious advantage over SSGBLUP in the 5- and 50-QTL scenarios, whereas SS-BayesB obtained the lowest accuracy in the 500-QTL scenario; SS-BayesA was the most efficient and robust across all QTL scenarios. Generally, both the relationships between training and validation populations and the LD between markers and QTL contributed to GEBV accuracy in the single-step analysis, and the advantages of the single-step Bayesian models were more apparent when the trait was controlled by fewer QTL.

  19. Priors in Whole-Genome Regression: The Bayesian Alphabet Returns

    PubMed Central

    Gianola, Daniel

    2013-01-01

    Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data, where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because the parameters are not likelihood-identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a place in whole-genome prediction of phenotypes, but somewhat doubtful inferential value, at least when sample size is such that n ≪ p. PMID:23636739
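
    The influence of the prior when p ≫ n can be seen even in the smallest case: under a Gaussian prior (ridge, one member of the alphabet in spirit), the posterior mean of a single coefficient is a shrunken version of the least-squares estimate. A minimal single-marker sketch (names and data hypothetical):

```python
def ridge_coefficient(x, y, lam):
    """Posterior mean of a single regression coefficient under a Gaussian
    prior (ridge).  lam = noise_variance / prior_variance; lam = 0 recovers
    ordinary least squares, and larger lam shrinks the estimate toward 0.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]                # y = 2x exactly
ols = ridge_coefficient(x, y, 0.0)      # unregularized estimate
shrunk = ridge_coefficient(x, y, 10.0)  # the prior pulls the estimate toward zero
```

    With thousands of markers and few lines, sxx is effectively small per marker and the prior term lam dominates, which is the abstract's point: the prior is always influential unless n ≫ p.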

  20. Validation of Bayesian analysis of compartmental kinetic models in medical imaging.

    PubMed

    Sitek, Arkadiusz; Li, Quanzheng; El Fakhri, Georges; Alpert, Nathaniel M

    2016-10-01

    Kinetic compartmental analysis is frequently used to compute physiologically relevant quantitative values from time series of images. In this paper, a new approach based on Bayesian analysis to obtain information about these parameters is presented and validated. The closed form of the posterior distribution of the kinetic parameters is derived with a hierarchical prior to model the standard deviation of the normally distributed noise. Markov chain Monte Carlo methods are used for numerical estimation of the posterior distribution. Computer simulations of the kinetics of F18-fluorodeoxyglucose (FDG) are used to demonstrate drawing statistical inferences about kinetic parameters and to validate the theory and implementation. Additionally, point estimates of kinetic parameters and the covariance of those estimates are determined using the classical non-linear least squares approach. Posteriors obtained using the methods proposed in this work are accurate, as no significant deviation from the expected shape of the posterior was found (one-sided P>0.08). It is demonstrated that the results obtained by the standard non-linear least-squares methods fail to provide an accurate estimation of uncertainty for the same data set (P<0.0001). These results validate the new methods using computer simulations of FDG kinetics and show that, in situations where the classical approach fails to accurately estimate uncertainty, Bayesian estimation provides accurate information about the uncertainties in the parameters. Although a particular example of FDG kinetics was used in the paper, the methods can be extended to different pharmaceuticals and imaging modalities. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
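
    The MCMC estimation of a posterior over a kinetic parameter can be sketched in miniature. This is an illustrative stand-in, not the paper's method: a mono-exponential washout replaces the full FDG compartmental model, the prior on the rate constant is flat, and all names and values are hypothetical.

```python
import math
import random

def log_likelihood(k, times, obs, sigma):
    """Gaussian log-likelihood of a mono-exponential washout model
    y(t) = exp(-k * t), a simple stand-in for a compartmental model."""
    sse = sum((y - math.exp(-k * t)) ** 2 for t, y in zip(times, obs))
    return -sse / (2.0 * sigma ** 2)

def metropolis(times, obs, sigma, k0, step, n_iter, rng):
    """Random-walk Metropolis with a flat prior on k; returns the chain."""
    k, ll = k0, log_likelihood(k0, times, obs, sigma)
    chain = []
    for _ in range(n_iter):
        prop = k + rng.gauss(0.0, step)
        ll_prop = log_likelihood(prop, times, obs, sigma)
        if math.log(rng.random()) < ll_prop - ll:   # accept/reject
            k, ll = prop, ll_prop
        chain.append(k)
    return chain

rng = random.Random(7)
times = [0.5 * i for i in range(1, 21)]            # 0.5 ... 10.0
obs = [math.exp(-0.5 * t) + rng.gauss(0.0, 0.02) for t in times]
chain = metropolis(times, obs, 0.02, 1.0, 0.05, 4000, rng)
k_mean = sum(chain[1000:]) / len(chain[1000:])     # posterior mean after burn-in
```

    Unlike a non-linear least-squares point estimate, the retained chain gives a full sample-based picture of parameter uncertainty (e.g., credible intervals from its quantiles), which is the contrast the abstract draws.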

  1. Experimental and statistical study on fracture boundary of non-irradiated Zircaloy-4 cladding tube under LOCA conditions

    NASA Astrophysics Data System (ADS)

    Narukawa, Takafumi; Yamaguchi, Akira; Jang, Sunghyon; Amaya, Masaki

    2018-02-01

    To estimate the fracture probability of fuel cladding tube under loss-of-coolant accident conditions in light-water reactors, laboratory-scale integral thermal shock tests were conducted on non-irradiated Zircaloy-4 cladding tube specimens. The resulting binary data on fracture or non-fracture of each cladding tube specimen were then analyzed statistically. A method to obtain the fracture probability curve as a function of equivalent cladding reacted (ECR) was proposed using Bayesian inference for generalized linear models: probit, logit, and log-probit models. Model selection was then performed in terms of physical characteristics and of information criteria, the widely applicable information criterion (WAIC) and the widely applicable Bayesian information criterion (WBIC). The log-probit model proved the best of the three at estimating the fracture probability, in terms of prediction accuracy both for the next data to be obtained and for the true model. Using the log-probit model, it was shown that 20% ECR corresponded to a 5% fracture probability of the cladding tube specimens at the 95% confidence level.
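
    The log-probit link described in this record maps ECR to a fracture probability through the standard normal CDF applied to a linear function of ln(ECR). A minimal sketch with illustrative (not fitted) coefficients; the function names are hypothetical:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def fracture_prob(ecr, b0, b1):
    """Log-probit GLM: P(fracture) = Phi(b0 + b1 * ln(ECR))."""
    return norm_cdf(b0 + b1 * math.log(ecr))

def ecr_at_prob(p, b0, b1, lo=1.0, hi=100.0):
    """Invert the monotone curve by bisection: the ECR (in %) at which the
    fracture probability equals p (requires b1 > 0 and a bracketing interval)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if fracture_prob(mid, b0, b1) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative coefficients, chosen only to show the shape of the curve:
b0, b1 = -7.0, 2.0
ecr_5pct = ecr_at_prob(0.05, b0, b1)   # ECR at the 5% fracture probability level
```

    With Bayesian inference, b0 and b1 carry posterior distributions, so the ECR at a 5% fracture probability becomes a distribution as well; the 95% confidence statement in the abstract comes from that posterior spread.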

  2. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: An example from a vertigo phase III study with longitudinal count data as primary endpoint

    PubMed Central

    2012-01-01

    Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). 
Results The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. Conclusions The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint. PMID:22962944

  3. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint.

    PubMed

    Adrion, Christine; Mansmann, Ulrich

    2012-09-10

    A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). 
The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial. The proposed Bayesian methods are not only appealing for inference but notably provide a sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.
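
    The mean logarithmic score used here for model ranking can be illustrated for Poisson predictive distributions. This is a minimal sketch, not the INLA-based leave-one-out machinery of the paper: it assumes each model supplies a Poisson predictive mean per observation, and all names are illustrative.

```python
import math

def poisson_log_score(y, mu):
    """Negative log predictive density of a Poisson(mu) forecast at the
    observed count y; lower scores indicate better forecasts."""
    return -(y * math.log(mu) - mu - math.lgamma(y + 1))

def mean_log_score(obs, preds):
    """Average the score over observations, as used for ranking models."""
    return sum(poisson_log_score(y, m) for y, m in zip(obs, preds)) / len(obs)

obs = [2, 3, 4, 1]
well_calibrated = [2.0, 3.0, 4.0, 1.0]    # predictive means match the counts
miscalibrated = [10.0, 10.0, 10.0, 10.0]  # systematically too high
good = mean_log_score(obs, well_calibrated)
bad = mean_log_score(obs, miscalibrated)
```

    As a proper scoring rule, the logarithmic score rewards predictive distributions that concentrate near the observed counts, which is why its mean discriminates between model scenarios in the study.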

  4. Learning an Eddy Viscosity Model Using Shrinkage and Bayesian Calibration: A Jet-in-Crossflow Case Study

    DOE PAGES

    Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan; ...

    2017-09-07

    In this paper, we demonstrate a statistical procedure for learning a high-order eddy viscosity model (EVM) from experimental data and using it to improve the predictive skill of a Reynolds-averaged Navier–Stokes (RANS) simulator. The method is tested in a three-dimensional (3D), transonic jet-in-crossflow (JIC) configuration. The process starts with a cubic eddy viscosity model (CEVM) developed for incompressible flows. It is fitted to limited experimental JIC data using shrinkage regression. The shrinkage process removes all the terms from the model, except an intercept, a linear term, and a quadratic one involving the square of the vorticity. The shrunk eddy viscosity model is implemented in an RANS simulator and calibrated, using vorticity measurements, to infer three parameters. The calibration is Bayesian and is solved using a Markov chain Monte Carlo (MCMC) method. A 3D probability density distribution for the inferred parameters is constructed, thus quantifying the uncertainty in the estimate. The phenomenal cost of using a 3D flow simulator inside an MCMC loop is mitigated by using surrogate models (“curve-fits”). A support vector machine classifier (SVMC) is used to impose our prior belief regarding parameter values, specifically to exclude nonphysical parameter combinations. The calibrated model is compared, in terms of its predictive skill, to simulations using uncalibrated linear and CEVMs. Finally, we find that the calibrated model, with one quadratic term, is more accurate than the uncalibrated simulator. The model is also checked at a flow condition at which the model was not calibrated.
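
    The shrinkage step (pruning basis terms so that only a few survive) can be sketched with a lasso-style coordinate descent on a toy polynomial basis. This is illustrative only; the data, basis, and penalty below are hypothetical and unrelated to the authors' CEVM:

```python
import random

random.seed(2)

def soft(rho, lam):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

# Toy data: only the intercept, linear and quadratic terms are truly active;
# the cubic basis term has a true coefficient of zero.
n = 200
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
ys = [1.0 + 2.0 * x + 1.5 * x * x + random.gauss(0.0, 0.1) for x in xs]
X = [[x, x * x, x ** 3] for x in xs]      # candidate basis terms
p = 3

beta = [0.0] * p
b0 = 0.0
lam = 6.0          # penalty strength, hand-picked for this toy problem

for _ in range(200):                      # cyclic coordinate descent
    b0 = sum(ys[i] - sum(X[i][k] * beta[k] for k in range(p))
             for i in range(n)) / n       # unpenalized intercept
    for j in range(p):
        # correlation of feature j with the partial residual
        rho = sum(X[i][j] * (ys[i] - b0 -
                  sum(X[i][k] * beta[k] for k in range(p) if k != j))
                  for i in range(n))
        z = sum(X[i][j] ** 2 for i in range(n))
        beta[j] = soft(rho, lam) / z
```

    The soft-thresholding update zeroes the inactive cubic term exactly while only shrinking the active terms, which is the mechanism by which shrinkage regression deletes terms from a model rather than merely down-weighting them.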

  5. Learning an Eddy Viscosity Model Using Shrinkage and Bayesian Calibration: A Jet-in-Crossflow Case Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan

    In this paper, we demonstrate a statistical procedure for learning a high-order eddy viscosity model (EVM) from experimental data and using it to improve the predictive skill of a Reynolds-averaged Navier–Stokes (RANS) simulator. The method is tested in a three-dimensional (3D), transonic jet-in-crossflow (JIC) configuration. The process starts with a cubic eddy viscosity model (CEVM) developed for incompressible flows. It is fitted to limited experimental JIC data using shrinkage regression. The shrinkage process removes all the terms from the model, except an intercept, a linear term, and a quadratic one involving the square of the vorticity. The shrunk eddy viscosity model is implemented in an RANS simulator and calibrated, using vorticity measurements, to infer three parameters. The calibration is Bayesian and is solved using a Markov chain Monte Carlo (MCMC) method. A 3D probability density distribution for the inferred parameters is constructed, thus quantifying the uncertainty in the estimate. The phenomenal cost of using a 3D flow simulator inside an MCMC loop is mitigated by using surrogate models (“curve-fits”). A support vector machine classifier (SVMC) is used to impose our prior belief regarding parameter values, specifically to exclude nonphysical parameter combinations. The calibrated model is compared, in terms of its predictive skill, to simulations using uncalibrated linear and CEVMs. Finally, we find that the calibrated model, with one quadratic term, is more accurate than the uncalibrated simulator. The model is also checked at a flow condition at which the model was not calibrated.

  6. Bayesian inference of interaction properties of noisy dynamical systems with time-varying coupling: capabilities and limitations

    NASA Astrophysics Data System (ADS)

    Wilting, Jens; Lehnertz, Klaus

    2015-08-01

    We investigate a recently published analysis framework based on Bayesian inference for the time-resolved characterization of interaction properties of noisy, coupled dynamical systems. It promises wide applicability and a better time resolution than well-established methods. Using representative model systems, we show that the analysis framework has the same weaknesses as previous methods, particularly when investigating interacting, structurally different non-linear oscillators. We also inspect the tracking of time-varying interaction properties and propose a further modification of the algorithm, which improves the reliability of obtained results. As an illustrative application, we investigate the suitability of this algorithm to infer the strength and direction of interactions between various regions of the human brain during an epileptic seizure. Within the limitations of the applicability of this analysis tool, we show that the modified algorithm indeed allows a better time resolution through Bayesian inference when compared to previous methods based on least-squares fits.

  7. Moving beyond qualitative evaluations of Bayesian models of cognition.

    PubMed

    Hemmer, Pernille; Tauber, Sean; Steyvers, Mark

    2015-06-01

    Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.
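
    The inferred-prior idea rests on standard conjugate updating: a remembered value is pulled toward the prior mean in proportion to the relative precisions of prior and observation. The following is a minimal sketch with hypothetical numbers (not the study's data or fitted priors):

```python
# Hypothetical numbers: an environmental prior over heights and one noisy
# memory trace; the posterior mean is a precision-weighted compromise.
mu0, tau = 170.0, 7.0      # prior: heights ~ N(170, 7^2) cm
sigma = 5.0                # noise sd of the remembered observation
y = 185.0                  # observed (remembered) height

prior_prec = 1.0 / tau ** 2
obs_prec = 1.0 / sigma ** 2
post_mean = (prior_prec * mu0 + obs_prec * y) / (prior_prec + obs_prec)
post_sd = (prior_prec + obs_prec) ** -0.5
```

    Because the observation is more precise than the prior here, the posterior mean sits closer to the reported height than to the prior mean; estimating `mu0` and `tau` per subject from recall data is the essence of inferring subjective priors.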

  8. Bayesian or Laplacien inference, entropy and information theory and information geometry in data and signal processing

    NASA Astrophysics Data System (ADS)

    Mohammad-Djafari, Ali

    2015-01-01

    The main object of this tutorial article is first to review the main inference tools using the Bayesian approach, entropy, information theory and their corresponding geometries. This review is focused mainly on the ways these tools have been used in data, signal and image processing. After a short introduction of the different quantities related to the Bayes rule, the entropy and the Maximum Entropy Principle (MEP), relative entropy and the Kullback-Leibler divergence, and Fisher information, we will study their use in different fields of data and signal processing such as: entropy in source separation, Fisher information in model order selection, different Maximum Entropy based methods in time series spectral estimation and finally, general linear inverse problems.
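
    The quantities reviewed in the tutorial are straightforward to compute for discrete distributions. A small stdlib-only sketch of entropy and the Kullback-Leibler divergence, illustrating that the uniform distribution maximizes entropy and that KL is asymmetric:

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def kl(p, q):
    """Kullback-Leibler divergence D(p || q); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

uniform = [0.25] * 4
peaked = [0.7, 0.1, 0.1, 0.1]
```

    For four outcomes, `entropy(uniform)` equals ln 4, the maximum possible, and `kl(peaked, uniform)` differs from `kl(uniform, peaked)`, which is why KL is a divergence and not a distance.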

  9. Linear mixed-effects models to describe individual tree crown width for China-fir in Fujian Province, southeast China.

    PubMed

    Hao, Xu; Yujun, Sun; Xinjie, Wang; Jin, Wang; Yao, Fu

    2015-01-01

    A multiple linear model was developed for individual tree crown width of Cunninghamia lanceolata (Lamb.) Hook in Fujian province, southeast China. Data were obtained from 55 sample plots of pure China-fir plantation stands. An Ordinary Linear Least Squares (OLS) regression was used to establish the crown width model. To adjust for correlations between observations from the same sample plots, we developed one-level linear mixed-effects (LME) models based on the multiple linear model, which take into account the random effects of plots. The best random effects combinations for the LME models were determined by Akaike's information criterion, the Bayesian information criterion and the −2 log-likelihood. Heteroscedasticity was reduced by three residual variance functions: the power function, the exponential function and the constant plus power function. The spatial correlation was modeled by three correlation structures: the first-order autoregressive structure [AR(1)], a combination of first-order autoregressive and moving average structures [ARMA(1,1)], and the compound symmetry structure (CS). Then, the LME model was compared to the multiple linear model using the absolute mean residual (AMR), the root mean square error (RMSE), and the adjusted coefficient of determination (adj-R2). For individual tree crown width models, the one-level LME model showed the best performance. An independent dataset was used to test the performance of the models and to demonstrate the advantage of calibrating LME models.
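
    The information-criterion comparison used above to select among model structures can be illustrated on a toy regression: both AIC and BIC trade goodness of fit (via the residual sum of squares) against the number of parameters. A self-contained sketch with simulated data (not the crown-width data):

```python
import math
import random

random.seed(4)

n = 80
xs = [random.uniform(0.0, 10.0) for _ in range(n)]
ys = [2.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in xs]

def rss_line(xs, ys):
    """Residual sum of squares of a simple OLS line fit."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def ic(rss, n, k):
    """Gaussian-likelihood AIC and BIC up to an additive constant."""
    aic = n * math.log(rss / n) + 2 * k
    bic = n * math.log(rss / n) + k * math.log(n)
    return aic, bic

ybar = sum(ys) / n
rss0 = sum((y - ybar) ** 2 for y in ys)    # intercept-only model, k = 1
rss1 = rss_line(xs, ys)                    # model with a slope, k = 2

aic0, bic0 = ic(rss0, n, 1)
aic1, bic1 = ic(rss1, n, 2)
```

    Here the slope model wins on both criteria because the fit improvement outweighs the extra parameter; the same arithmetic underlies picking a random-effects structure by AIC/BIC.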

  10. Semi-blind Bayesian inference of CMB map and power spectrum

    NASA Astrophysics Data System (ADS)

    Vansyngel, Flavien; Wandelt, Benjamin D.; Cardoso, Jean-François; Benabed, Karim

    2016-04-01

    We present a new blind formulation of the cosmic microwave background (CMB) inference problem. The approach relies on a phenomenological model of the multifrequency microwave sky without the need for physical models of the individual components. For all-sky and high resolution data, it unifies parts of the analysis that had previously been treated separately such as component separation and power spectrum inference. We describe an efficient sampling scheme that fully explores the component separation uncertainties on the inferred CMB products such as maps and/or power spectra. External information about individual components can be incorporated as a prior giving a flexible way to progressively and continuously introduce physical component separation from a maximally blind approach. We connect our Bayesian formalism to existing approaches such as Commander, spectral mismatch independent component analysis (SMICA), and internal linear combination (ILC), and discuss possible future extensions.

  11. Detecting a Change in School Performance: A Bayesian Analysis for a Multilevel Join Point Problem. CSE Technical Report 542.

    ERIC Educational Resources Information Center

    Thum, Yeow Meng; Bhattacharya, Suman Kumar

    To better describe individual behavior within a system, this paper uses a sample of longitudinal test scores from a large urban school system to consider hierarchical Bayes estimation of a multilevel linear regression model in which each individual regression slope of test score on time switches at some unknown point in time, "kj."…

  12. Bayesian inference for two-part mixed-effects model using skew distributions, with application to longitudinal semicontinuous alcohol data.

    PubMed

    Xing, Dongyuan; Huang, Yangxin; Chen, Henian; Zhu, Yiliang; Dagne, Getachew A; Baldwin, Julie

    2017-08-01

    Semicontinuous data featured with an excessive proportion of zeros and right-skewed continuous positive values arise frequently in practice. One example would be the substance abuse/dependence symptoms data for which a substantial proportion of subjects investigated may report zero. Two-part mixed-effects models have been developed to analyze repeated measures of semicontinuous data from longitudinal studies. In this paper, we propose a flexible two-part mixed-effects model with skew distributions for correlated semicontinuous alcohol data under the framework of a Bayesian approach. The proposed model specification consists of two mixed-effects models linked by the correlated random effects: (i) a model on the occurrence of positive values using a generalized logistic mixed-effects model (Part I); and (ii) a model on the intensity of positive values using a linear mixed-effects model where the model errors follow skew distributions including skew-t and skew-normal distributions (Part II). The proposed method is illustrated with alcohol abuse/dependence symptoms data from a longitudinal observational study, and the analytic results are reported by comparing potential models under different random-effects structures. Simulation studies are conducted to assess the performance of the proposed models and method.

  13. Measuring the Acoustic Release of a Chemotherapeutic Agent from Folate-Targeted Polymeric Micelles.

    PubMed

    Abusara, Ayah; Abdel-Hafez, Mamoun; Husseini, Ghaleb

    2018-08-01

    In this paper, we compare the use of Bayesian filters for the estimation of release and re-encapsulation rates of a chemotherapeutic agent (namely Doxorubicin) from nanocarriers in an acoustically activated drug release system. The study is implemented using an advanced kinetic model that takes into account cavitation events causing the antineoplastic agent's release from polymeric micelles upon exposure to ultrasound. This model is an improvement over the previous representations of acoustic release that used simple zero-, first- and second-order release and re-encapsulation kinetics to study acoustically triggered drug release from polymeric micelles. The new model incorporates drug release and micellar reassembly events caused by cavitation, allowing the release of chemotherapeutics to be controlled spatially and temporally. Different Bayesian estimators are tested for this purpose, including Kalman filters (KF), Extended Kalman filters (EKF), Particle filters (PF), and multi-model KF and EKF. Simulated and experimental results are used to verify the performance of the above-mentioned estimators. The proposed methods demonstrate the utility and high accuracy of using estimation methods in modeling this drug delivery technique. The results show that, in both cases (linear and non-linear dynamics), the modeling errors are substantial but can be minimized using a multi-model approach. In addition, particle filters are more flexible and perform reasonably well compared to the other filters. The study improved the accuracy of the kinetic models used to capture acoustically activated drug release from polymeric micelles, which may in turn help in designing hardware and software capable of precisely controlling the delivered amount of chemotherapeutics to cancerous tissue.
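
    A scalar Kalman filter for a linearized first-order release model shows the basic predict/update cycle the record builds on; all kinetic constants and noise levels below are hypothetical, not the paper's fitted values:

```python
import math
import random

random.seed(6)

# First-order release toward complete release: x' = k_r * (1 - x),
# discretized in time; measurements are y = x + noise.
dt, k_r = 0.1, 0.5
A, u = 1.0 - dt * k_r, dt * k_r
Q, R = 1e-5, 0.05 ** 2        # process / measurement noise variances

# Simulate the true released fraction and noisy measurements.
truth, meas = [], []
x = 0.0
for _ in range(150):
    x = A * x + u + random.gauss(0.0, math.sqrt(Q))
    truth.append(x)
    meas.append(x + random.gauss(0.0, math.sqrt(R)))

# Scalar Kalman filter.
xh, P = 0.0, 1.0
est = []
for y in meas:
    xh, P = A * xh + u, A * A * P + Q          # predict
    K = P / (P + R)                            # Kalman gain
    xh, P = xh + K * (y - xh), (1.0 - K) * P   # update
    est.append(xh)

def rmse(a, b):
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)) / len(a))

rmse_filter = rmse(est, truth)
rmse_raw = rmse(meas, truth)
```

    Because the dynamics model matches the simulator, the filtered trajectory tracks the true release fraction far more closely than the raw measurements do; the EKF and particle filters generalize this cycle to the non-linear cavitation kinetics.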

  14. Learning quadratic receptive fields from neural responses to natural stimuli.

    PubMed

    Rajan, Kanaka; Marre, Olivier; Tkačik, Gašper

    2013-07-01

    Models of neural responses to stimuli with complex spatiotemporal correlation structure often assume that neurons are selective for only a small number of linear projections of a potentially high-dimensional input. In this review, we explore recent modeling approaches where the neural response depends on the quadratic form of the input rather than on its linear projection, that is, the neuron is sensitive to the local covariance structure of the signal preceding the spike. To infer this quadratic dependence in the presence of arbitrary (e.g., naturalistic) stimulus distribution, we review several inference methods, focusing in particular on two information theory-based approaches (maximization of stimulus energy and of noise entropy) and two likelihood-based approaches (Bayesian spike-triggered covariance and extensions of generalized linear models). We analyze the formal relationship between the likelihood-based and information-based approaches to demonstrate how they lead to consistent inference. We demonstrate the practical feasibility of these procedures by using model neurons responding to a flickering variance stimulus.
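
    The spike-triggered covariance idea (recovering a quadratic filter as the leading eigenvector of the covariance difference between spike-triggered and raw stimuli) can be sketched end to end; the model neuron, filter, and threshold below are hypothetical:

```python
import math
import random

random.seed(7)

dim, n = 6, 20000
w = [0.8, 0.6, 0.0, 0.0, 0.0, 0.0]   # hidden quadratic filter (unit norm)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Gaussian white stimuli; the model neuron spikes when the energy along
# w exceeds a threshold -- a purely quadratic (sign-invariant) dependence.
spikes = []
for _ in range(n):
    s = [random.gauss(0.0, 1.0) for _ in range(dim)]
    if dot(w, s) ** 2 > 1.0:
        spikes.append(s)

m = len(spikes)
# Spike-triggered covariance minus the prior covariance (the identity).
C = [[sum(s[i] * s[j] for s in spikes) / m - (1.0 if i == j else 0.0)
      for j in range(dim)] for i in range(dim)]

# The leading eigenvector (power iteration) recovers the filter direction.
v = [random.gauss(0.0, 1.0) for _ in range(dim)]
for _ in range(200):
    v = [dot(row, v) for row in C]
    nrm = math.sqrt(dot(v, v))
    v = [x / nrm for x in v]

alignment = abs(dot(v, w))   # |cosine| between estimate and true filter
```

    Note that a linear spike-triggered average would fail here (it averages to zero by symmetry), which is precisely why second-order methods are needed for quadratic receptive fields; the likelihood-based approaches in the record extend this to non-Gaussian, naturalistic stimuli.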

  15. Estimation of white matter fiber parameters from compressed multiresolution diffusion MRI using sparse Bayesian learning.

    PubMed

    Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe

    2018-02-15

    We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with a lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates.

  16. Approximate Bayesian Computation by Subset Simulation using hierarchical state-space models

    NASA Astrophysics Data System (ADS)

    Vakilzadeh, Majid K.; Huang, Yong; Beck, James L.; Abrahamsson, Thomas

    2017-02-01

    A new multi-level Markov Chain Monte Carlo algorithm for Approximate Bayesian Computation, ABC-SubSim, has recently appeared that exploits the Subset Simulation method for efficient rare-event simulation. ABC-SubSim adaptively creates a nested decreasing sequence of data-approximating regions in the output space that correspond to increasingly closer approximations of the observed output vector in this output space. At each level, multiple samples of the model parameter vector are generated by a component-wise Metropolis algorithm so that the predicted output corresponding to each parameter value falls in the current data-approximating region. Theoretically, if continued to the limit, the sequence of data-approximating regions would converge onto the observed output vector and the approximate posterior distributions, which are conditional on the data-approximating region, would become exact, but this is not practically feasible. In this paper we study the performance of the ABC-SubSim algorithm for Bayesian updating of the parameters of dynamical systems using a general hierarchical state-space model. We note that the ABC methodology gives an approximate posterior distribution that actually corresponds to an exact posterior where a uniformly distributed combined measurement and modeling error is added. We also note that ABC algorithms have a problem with learning the uncertain error variances in a stochastic state-space model and so we treat them as nuisance parameters and analytically integrate them out of the posterior distribution. In addition, the statistical efficiency of the original ABC-SubSim algorithm is improved by developing a novel strategy to regulate the proposal variance for the component-wise Metropolis algorithm at each level.
We demonstrate that Self-regulated ABC-SubSim is well suited for Bayesian system identification by first applying it successfully to model updating of a two degree-of-freedom linear structure for three cases: globally identifiable, locally identifiable, and unidentifiable model classes, and then to model updating of a two degree-of-freedom nonlinear structure with Duffing nonlinearities in its interstory force-deflection relationship.
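
    ABC-SubSim refines plain rejection ABC, which is easy to sketch: draw parameters from the prior, simulate data, and keep draws whose summary statistic lands within a tolerance of the observed one. The following is a minimal sketch (a simple normal-mean problem, not the record's structural models) showing that a tighter tolerance concentrates the approximate posterior:

```python
import random
from statistics import mean, stdev

random.seed(3)

# "Observed" data from a normal with unknown mean and known sd = 1.
data = [random.gauss(2.0, 1.0) for _ in range(50)]
obs_mean = mean(data)

def abc_rejection(eps, n_draws=20000):
    """Rejection ABC with the sample mean as summary statistic."""
    accepted = []
    for _ in range(n_draws):
        mu = random.uniform(-5.0, 5.0)                 # flat prior
        sim = [random.gauss(mu, 1.0) for _ in range(50)]
        if abs(mean(sim) - obs_mean) < eps:            # summary distance
            accepted.append(mu)
    return accepted

loose = abc_rejection(eps=0.5)
tight = abc_rejection(eps=0.1)
```

    Shrinking the tolerance trades acceptance rate for accuracy, and driving it toward zero is exactly what the nested data-approximating regions of ABC-SubSim make affordable via rare-event simulation.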

  17. An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

    2013-10-01

    Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with numerous model executions required by exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for the PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating the high-dimensional posterior distribution.
The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.
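
    The surrogate-accelerated MCMC idea can be miniaturized: evaluate the expensive log-posterior only at a few design points, fit a cheap interpolant, and run Metropolis sampling entirely on the surrogate. A 1D sketch with a stand-in target (not the groundwater model; a polynomial interpolant stands in for the sparse-grid surrogate):

```python
import math
import random

random.seed(0)

# Stand-in for an expensive log-posterior (here just a standard normal;
# in practice each call would require a full model execution).
def expensive_logpost(x):
    return -0.5 * x * x

# Cheap quadratic surrogate from only three "expensive" evaluations
# (Newton divided differences through the design points).
x0, x1, x2 = -2.0, 0.0, 2.0
y0, y1, y2 = (expensive_logpost(x) for x in (x0, x1, x2))
d1 = (y1 - y0) / (x1 - x0)
d2 = ((y2 - y1) / (x2 - x1) - d1) / (x2 - x0)

def surrogate(x):
    return y0 + d1 * (x - x0) + d2 * (x - x0) * (x - x1)

# Random-walk Metropolis sampling the surrogate, never the expensive model.
x, samples = 0.0, []
for _ in range(20000):
    prop = x + random.gauss(0.0, 1.0)
    if math.log(random.random()) < surrogate(prop) - surrogate(x):
        x = prop
    samples.append(x)

mu = sum(samples) / len(samples)
var = sum((s - mu) ** 2 for s in samples) / len(samples)
```

    Only three expensive evaluations were needed; every one of the 20,000 MCMC steps queries the surrogate. The adaptive refinement and mode-finding in the record address making such a surrogate accurate in high dimensions and around multiple modes.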

  18. Bayesian inference in camera trapping studies for a class of spatial capture-recapture models

    USGS Publications Warehouse

    Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba

    2009-01-01

    We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
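
    The distance-dependent capture probability at the heart of such models is commonly a half-normal function of the distance between a home-range centre and a trap. A one-function sketch (parameter values hypothetical, and the half-normal form is one common choice rather than the only one):

```python
import math

def p_detect(d, p0=0.8, sigma=1.5):
    """Half-normal detection: capture probability decays with the distance
    d between a home-range centre and the trap (units hypothetical)."""
    return p0 * math.exp(-d * d / (2.0 * sigma * sigma))
```

    Here `p0` is the capture probability at the home-range centre and `sigma` governs how quickly detectability falls off with distance; in the hierarchical model these enter the observation layer while the point process layer governs where the centres lie.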

  19. A spatial capture-recapture model to estimate fish survival and location from linear continuous monitoring arrays

    USGS Publications Warehouse

    Raabe, Joshua K.; Gardner, Beth; Hightower, Joseph E.

    2013-01-01

    We developed a spatial capture–recapture model to evaluate survival and activity centres (i.e., mean locations) of tagged individuals detected along a linear array. Our spatially explicit version of the Cormack–Jolly–Seber model, analyzed using a Bayesian framework, correlates movement between periods and can incorporate environmental or other covariates. We demonstrate the model using 2010 data for anadromous American shad (Alosa sapidissima) tagged with passive integrated transponders (PIT) at a weir near the mouth of a North Carolina river and passively monitored with an upstream array of PIT antennas. The river channel constrained migrations, resulting in linear, one-dimensional encounter histories that included both weir captures and antenna detections. Individual activity centres in a given time period were a function of the individual’s previous estimated location and the river conditions (i.e., gage height). Model results indicate high within-river spawning mortality (mean weekly survival = 0.80) and more extensive movements during elevated river conditions. This model is applicable for any linear array (e.g., rivers, shorelines, and corridors), opening new opportunities to study demographic parameters, movement or migration, and habitat use.

  20. Development and comparison of Bayesian modularization method in uncertainty assessment of hydrological models

    NASA Astrophysics Data System (ADS)

    Li, L.; Xu, C.-Y.; Engeland, K.

    2012-04-01

    With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. The Bayesian method is one of the most widely used for uncertainty assessment of hydrological models; it incorporates different sources of information into a single analysis through Bayes' theorem. However, none of these applications treats the uncertainty in the extreme flows of hydrological model simulations well. This study proposes a Bayesian modularization approach to uncertainty assessment of conceptual hydrological models that considers the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by the new Bayesian modularization approach and traditional Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with the traditional Bayesian approach: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization are more accurate, with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimation of the entire flow range and in terms of application and computational efficiency. The study thus introduces a new approach for reducing the effect of extreme flows on the discharge uncertainty assessment of hydrological models via Bayesian modularization. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
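
    The Nash-Sutcliffe efficiency used to rank the simulations is a one-line statistic: one minus the ratio of model error variance to the variance of the observations. A small sketch with made-up flows:

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

obs = [3.1, 4.7, 9.2, 6.0, 2.5, 8.1]       # hypothetical observed flows
good = [3.0, 4.9, 8.8, 6.2, 2.7, 7.9]      # a close simulation
mean_only = [sum(obs) / len(obs)] * len(obs)
```

    Because the denominator is dominated by high flows, NSE rewards fitting peaks, which is one reason extreme flows need special handling in the uncertainty assessment.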

  1. Evaluating Spatial Variability in Sediment and Phosphorus Concentration-Discharge Relationships Using Bayesian Inference and Self-Organizing Maps

    NASA Astrophysics Data System (ADS)

    Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.

    2017-12-01

    Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge, which has been used previously as the default breakpoint in segmented regression models to discern differences between pre- and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. The SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at pre-threshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
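
    Threshold identification in a segmented concentration-discharge regression can be sketched by a grid search over candidate breakpoints, fitting a separate line to each side and keeping the breakpoint that minimizes total squared error. Data and the breakpoint below are synthetic (the record instead infers the threshold within a Bayesian model, which additionally yields its uncertainty):

```python
import random

random.seed(5)

# Synthetic log-concentration vs log-discharge with a slope change at 1.0.
xs = [i / 50.0 for i in range(100)]                    # "log discharge"
def cq(x):                                             # true C-Q relation
    return 0.5 + 0.2 * x if x < 1.0 else 0.7 + 1.0 * (x - 1.0)
ys = [cq(x) + random.gauss(0.0, 0.05) for x in xs]

def fit_line(pts):
    """Simple OLS line; returns intercept, slope, and SSE."""
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    sse = sum((y - (a + b * x)) ** 2 for x, y in pts)
    return a, b, sse

# Grid search: the breakpoint minimising total SSE of the two segments.
best = None
for t in xs[5:-5]:                         # keep at least 5 points per side
    left = [(x, y) for x, y in zip(xs, ys) if x < t]
    right = [(x, y) for x, y in zip(xs, ys) if x >= t]
    sse = fit_line(left)[2] + fit_line(right)[2]
    if best is None or sse < best[0]:
        best = (sse, t)
threshold = best[1]

slope_left = fit_line([(x, y) for x, y in zip(xs, ys) if x < threshold])[1]
slope_right = fit_line([(x, y) for x, y in zip(xs, ys) if x >= threshold])[1]
```

    The recovered breakpoint sits near the true slope change rather than at the median discharge, which is the record's point about letting the data place the threshold.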

  2. Integrating GLL-Weibull Distribution Within a Bayesian Framework for Life Prediction of Shape Memory Alloy Spring Undergoing Thermo-mechanical Fatigue

    NASA Astrophysics Data System (ADS)

    Kundu, Pradeep; Nath, Tameshwer; Palani, I. A.; Lad, Bhupesh K.

    2018-06-01

    The present paper tackles the important but largely unexplored problem of reliability estimation for smart materials. First, an experimental setup is developed for accelerated life testing of the shape memory alloy (SMA) springs. A novel approach based on the generalized log-linear Weibull (GLL-Weibull) distribution is then developed for SMA spring life estimation. Applied stimulus (voltage), elongation and cycles of operation are used as inputs for the life prediction model. The values of the parameter coefficients of the model provide better interpretability compared to artificial-intelligence-based life prediction approaches. In addition, the model also considers the effect of operating conditions, making it generic for a range of the operating conditions. Moreover, a Bayesian framework is used to continuously update the prediction with the actual degradation value of the springs, thereby reducing the uncertainty in the data and improving the prediction accuracy. Finally, the deterioration of material with number of cycles is also investigated using thermogravimetric analysis and scanning electron microscopy.

  3. Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

    PubMed Central

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

    2017-01-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564

  4. Past and present cosmic structure in the SDSS DR7 main sample

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jasche, J.; Leclercq, F.; Wandelt, B.D., E-mail: jasche@iap.fr, E-mail: florent.leclercq@polytechnique.org, E-mail: wandelt@iap.fr

    2015-01-01

    We present a chrono-cosmography project, aiming at the inference of the four-dimensional formation history of the observed large scale structure from its origin to the present epoch. To do so, we perform a full-scale Bayesian analysis of the northern galactic cap of the Sloan Digital Sky Survey (SDSS) Data Release 7 main galaxy sample, relying on a fully probabilistic, physical model of the non-linearly evolved density field. Besides inferring initial conditions from observations, our methodology naturally and accurately reconstructs non-linear features at the present epoch, such as walls and filaments, corresponding to high-order correlation functions generated by late-time structure formation. Our inference framework self-consistently accounts for typical observational systematic and statistical uncertainties such as noise, survey geometry and selection effects. We further account for luminosity-dependent galaxy biases and automatic noise calibration within a fully Bayesian approach. As a result, this analysis provides highly detailed and accurate reconstructions of the present density field on scales larger than ∼ 3 Mpc/h, constrained by SDSS observations. This approach also leads to the first quantitative inference of plausible formation histories of the dynamic large scale structure underlying the observed galaxy distribution. The results described in this work constitute the first full Bayesian non-linear analysis of the cosmic large scale structure with the demonstrated capability of uncertainty quantification. Some of these results will be made publicly available along with this work. The level of detail of inferred results and the high degree of control on observational uncertainties pave the path towards high-precision chrono-cosmography, the subject of simultaneously studying the dynamics and the morphology of the inhomogeneous Universe.

  5. Association of climate drivers with rainfall in New South Wales, Australia, using Bayesian Model Averaging

    NASA Astrophysics Data System (ADS)

    Duc, Hiep Nguyen; Rivett, Kelly; MacSween, Katrina; Le-Anh, Linh

    2017-01-01

    Rainfall in New South Wales (NSW), located in the southeast of the Australian continent, is known to be influenced by four major climate drivers: the El Niño/Southern Oscillation (ENSO), the Interdecadal Pacific Oscillation (IPO), the Southern Annular Mode (SAM) and the Indian Ocean Dipole (IOD). Many studies have shown the influences of ENSO, IPO modulation, SAM and IOD on rainfall in Australia, and on southeast Australia in particular. However, only limited work has been undertaken using a multiple regression framework to examine the combined effect of these climate drivers on rainfall. This paper analysed the role of these climate drivers and their interactions on rainfall in NSW using Bayesian Model Averaging (BMA), which accounts for model uncertainty by considering every linear model in the model space, i.e. the set of all possible combinations of predictors, to obtain posterior model probabilities and expected predictor coefficients. Using BMA for linear regression models, we are able to corroborate and confirm the results of many previous studies. In addition, the method ranks the climate drivers and their interactions by importance and gives the probability of their association with rainfall at a site. The ability to quantify the relative contributions of the climate drivers is key to understanding their complex interaction with rainfall, or lack of rainfall, in a region, such as in the three big droughts in southeastern Australia whose causes have recently been the subject of discussion and debate.
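
    The core BMA computation can be sketched by enumerating every subset of candidate predictors and weighting each linear model by its BIC-implied posterior probability, a standard approximation. The driver names and data below are synthetic stand-ins, not the NSW rainfall series.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical monthly rainfall driven by two of four candidate indices.
n, names = 240, ["ENSO", "IPO", "SAM", "IOD"]
X = rng.standard_normal((n, 4))
y = 1.5 * X[:, 0] + 0.8 * X[:, 2] + rng.standard_normal(n)

def bic(cols):
    # Ordinary least squares fit of the submodel with these columns.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    return n * np.log(rss / n) + A.shape[1] * np.log(n)

# The whole model space: all 2^4 predictor combinations.
models = [c for r in range(5) for c in itertools.combinations(range(4), r)]
scores = np.array([bic(m) for m in models])
w = np.exp(-0.5 * (scores - scores.min()))
w /= w.sum()                  # approximate posterior model probabilities
best = models[int(np.argmax(w))]
print([names[j] for j in best], round(float(w.max()), 2))
```

    The model containing exactly the two true drivers receives nearly all the posterior weight, and the weights themselves give the ranking of importance the abstract describes.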

  6. Estimating relative risks in multicenter studies with a small number of centers - which methods to use? A simulation study.

    PubMed

    Pedroza, Claudia; Truong, Van Thi Thanh

    2017-11-02

    Analyses of multicenter studies often need to account for center clustering to ensure valid inference. For binary outcomes, it is particularly challenging to properly adjust for center when the number of centers or total sample size is small, or when there are few events per center. Our objective was to evaluate the performance of generalized estimating equation (GEE) log-binomial and Poisson models, generalized linear mixed models (GLMMs) assuming binomial and Poisson distributions, and a Bayesian binomial GLMM to account for center effect in these scenarios. We conducted a simulation study with few centers (≤30) and 50 or fewer subjects per center, using both a randomized controlled trial and an observational study design to estimate relative risk. We compared the GEE and GLMM models with a log-binomial model without adjustment for clustering in terms of bias, root mean square error (RMSE), and coverage. For the Bayesian GLMM, we used informative neutral priors that are skeptical of large treatment effects that are almost never observed in studies of medical interventions. All frequentist methods exhibited little bias, and the RMSE was very similar across the models. The binomial GLMM had poor convergence rates, ranging from 27% to 85%, but performed well otherwise. The results show that both GEE models need to use small sample corrections for robust SEs to achieve proper coverage of 95% CIs. The Bayesian GLMM had similar convergence rates but resulted in slightly more biased estimates for the smallest sample sizes. However, it had the smallest RMSE and good coverage across all scenarios. These results were very similar for both study designs. For the analyses of multicenter studies with a binary outcome and few centers, we recommend adjustment for center with either a GEE log-binomial or Poisson model with appropriate small sample corrections or a Bayesian binomial GLMM with informative priors.

  7. Prediction of road accidents: A Bayesian hierarchical approach.

    PubMed

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H

    2013-03-01

    In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data, e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any road network provided that the required data are available.
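
    Step (1), the conjugate gamma-updating of an occurrence rate from count data, reduces to two additions when the counts are modelled as Poisson. The numbers here are illustrative, not taken from the Austrian case study.

```python
import numpy as np

# Conjugate gamma-Poisson updating of an injury-accident rate:
# gamma(alpha, beta) prior, Poisson counts, gamma posterior.
alpha, beta = 2.0, 1.0                 # prior: mean 2 accidents/yr
counts = np.array([3, 1, 4, 2, 2])     # observed yearly counts

alpha_post = alpha + counts.sum()      # add total events
beta_post = beta + len(counts)         # add exposure (years)
rate_mean = alpha_post / beta_post     # posterior mean occurrence rate
print(round(rate_mean, 3))
```

    Each new year of observations simply increments the two hyperparameters, which is what makes this step cheap enough to run network-wide.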

  8. A Bayesian Nonparametric Approach to Test Equating

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2009-01-01

    A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…

  9. Model Diagnostics for Bayesian Networks

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2006-01-01

    Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of works on assessing fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A…

  10. Monitoring Human Development Goals: A Straightforward (Bayesian) Methodology for Cross-National Indices

    ERIC Educational Resources Information Center

    Abayomi, Kobi; Pizarro, Gonzalo

    2013-01-01

    We offer a straightforward framework for measurement of progress, across many dimensions, using cross-national social indices, which we classify as linear combinations of multivariate country level data onto a univariate score. We suggest a Bayesian approach which yields probabilistic (confidence type) intervals for the point estimates of country…

  11. Bayesian linearized amplitude-versus-frequency inversion for quality factor and its application

    NASA Astrophysics Data System (ADS)

    Yang, Xinchao; Teng, Long; Li, Jingnan; Cheng, Jiubing

    2018-06-01

    We propose a straightforward attenuation inversion method that exploits the amplitude-versus-frequency (AVF) characteristics of seismic data. A new linearized approximation of the angle- and frequency-dependent reflectivity in viscoelastic media is derived. We then use this equation to implement a Bayesian linear AVF inversion. The inversion result includes not only P-wave and S-wave velocities and densities, but also P-wave and S-wave quality factors. Synthetic tests show that the AVF inversion surpasses AVA inversion for quality factor estimation, although it requires data with a higher signal-to-noise ratio (SNR). To show its feasibility, we apply both the new Bayesian AVF inversion and a conventional AVA inversion to tight gas reservoir data from the Sichuan Basin, China. Given the SNR of the field data, a combination of AVF inversion for attenuation parameters and AVA inversion for elastic parameters is recommended. The result shows that attenuation estimates can serve as a useful complement to AVA inversion results for the detection of tight gas reservoirs.
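
    Once the reflectivity is linearized, the Bayesian inversion amounts to solving a regularized normal system for the posterior mean. The sketch below uses a random matrix as a stand-in for the derived AVF kernel; the parameter names, units, and noise levels are assumptions, not the paper's equations.

```python
import numpy as np

rng = np.random.default_rng(2)

# Generic Bayesian linearized inversion: data d = G @ m + noise,
# zero-mean Gaussian prior on the model vector m. G is a random
# stand-in for the linearized AVF reflectivity kernel.
m_true = np.array([2.5, 1.2, 0.04, 0.02])   # e.g. vp, vs, 1/Qp, 1/Qs
G = rng.standard_normal((50, 4))
d = G @ m_true + 0.05 * rng.standard_normal(50)

sigma_d, sigma_m = 0.05, 10.0               # noise std, prior std
A = G.T @ G / sigma_d**2 + np.eye(4) / sigma_m**2
m_post = np.linalg.solve(A, G.T @ d / sigma_d**2)  # posterior mean
print(np.round(m_post, 2))
```

    The inverse of `A` is also the posterior covariance, so the same solve yields the uncertainty on the attenuation parameters, which is what makes the SNR requirement visible in practice.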

  12. A Permutation Approach for Selecting the Penalty Parameter in Penalized Model Selection

    PubMed Central

    Sabourin, Jeremy A; Valdar, William; Nobel, Andrew B

    2015-01-01

    Summary: We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), Scaled Sparse Linear Regression, and a selection method based on recently developed testing procedures for the LASSO. PMID:26243050
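
    The heart of permutation-based penalty selection can be sketched as follows: under permuted responses no predictor is truly associated, so the null distribution of the largest absolute gradient, max|X.T @ y_perm|/n, estimates the penalty needed to exclude noise variables. This is a simplified reading of the idea, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

# One signal predictor among 20; permutations destroy the signal.
n, p = 100, 20
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + rng.standard_normal(n)

lams = []
for _ in range(200):
    y_perm = rng.permutation(y)
    # Largest absolute coordinate of the gradient at beta = 0:
    # the smallest LASSO penalty that keeps all coefficients at zero.
    lams.append(np.abs(X.T @ y_perm).max() / n)
lam = float(np.quantile(lams, 0.95))
print(round(lam, 3))
```

    With this penalty the noise predictors fall below the threshold while the genuine signal, whose gradient is far larger, survives.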

  13. Approximate Bayesian computation for forward modeling in cosmology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Akeret, Joël; Refregier, Alexandre; Amara, Adam

    Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
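
    A minimal ABC rejection sampler on a Gaussian toy model conveys the core loop: draw from the prior, forward-simulate, keep the draw if a summary statistic lands close to the observed one. The paper's PMC scheme and Mahalanobis distance are replaced here by plain rejection and a one-dimensional summary.

```python
import numpy as np

rng = np.random.default_rng(4)

# Unknown mean mu of a unit-variance Gaussian; summary = sample mean.
mu_true, n = 1.5, 100
obs = rng.normal(mu_true, 1.0, n)
s_obs = obs.mean()

accepted = []
for _ in range(10000):
    mu = rng.uniform(-5, 5)               # prior draw
    sim = rng.normal(mu, 1.0, n)          # forward model (mock data)
    if abs(sim.mean() - s_obs) < 0.1:     # distance threshold
        accepted.append(mu)
post = np.array(accepted)
print(len(post), round(float(post.mean()), 2))
```

    Shrinking the threshold tightens the approximation to the true posterior at the cost of fewer acceptances, which is exactly the inefficiency the PMC refinement is designed to mitigate.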

  14. Inter-Ethnic/Racial Facial Variations: A Systematic Review and Bayesian Meta-Analysis of Photogrammetric Studies

    PubMed Central

    Wen, Yi Feng; Wong, Hai Ming; Lin, Ruitao; Yin, Guosheng; McGrath, Colman

    2015-01-01

    Background: Numerous facial photogrammetric studies have been published around the world. We aimed to critically review these studies so as to establish population norms for various angular and linear facial measurements; and to determine inter-ethnic/racial facial variations. Methods and Findings: A comprehensive and systematic search of PubMed, ISI Web of Science, Embase, and Scopus was conducted to identify facial photogrammetric studies published before December, 2014. Subjects of eligible studies were either Africans, Asians or Caucasians. A Bayesian hierarchical random effects model was developed to estimate posterior means and 95% credible intervals (CrI) for each measurement by ethnicity/race. Linear contrasts were constructed to explore inter-ethnic/racial facial variations. We identified 38 eligible studies reporting 11 angular and 18 linear facial measurements. Risk of bias of the studies ranged from 0.06 to 0.66. At the significance level of 0.05, African males were found to have smaller nasofrontal angle (posterior mean difference: 8.1°, 95% CrI: 2.2°–13.5°) compared to Caucasian males and larger nasofacial angle (7.4°, 0.1°–13.2°) compared to Asian males. Nasolabial angle was more obtuse in Caucasian females than in African (17.4°, 0.2°–35.3°) and Asian (9.1°, 0.4°–17.3°) females. Additional inter-ethnic/racial variations were revealed when the level of statistical significance was set at 0.10. Conclusions: A comprehensive database for angular and linear facial measurements was established from existing studies using the statistical model and inter-ethnic/racial variations of facial features were observed. The results have implications for clinical practice and highlight the need and value for high quality photogrammetric studies. PMID:26247212

  15. Inter-Ethnic/Racial Facial Variations: A Systematic Review and Bayesian Meta-Analysis of Photogrammetric Studies.

    PubMed

    Wen, Yi Feng; Wong, Hai Ming; Lin, Ruitao; Yin, Guosheng; McGrath, Colman

    2015-01-01

    Numerous facial photogrammetric studies have been published around the world. We aimed to critically review these studies so as to establish population norms for various angular and linear facial measurements; and to determine inter-ethnic/racial facial variations. A comprehensive and systematic search of PubMed, ISI Web of Science, Embase, and Scopus was conducted to identify facial photogrammetric studies published before December, 2014. Subjects of eligible studies were either Africans, Asians or Caucasians. A Bayesian hierarchical random effects model was developed to estimate posterior means and 95% credible intervals (CrI) for each measurement by ethnicity/race. Linear contrasts were constructed to explore inter-ethnic/racial facial variations. We identified 38 eligible studies reporting 11 angular and 18 linear facial measurements. Risk of bias of the studies ranged from 0.06 to 0.66. At the significance level of 0.05, African males were found to have smaller nasofrontal angle (posterior mean difference: 8.1°, 95% CrI: 2.2°-13.5°) compared to Caucasian males and larger nasofacial angle (7.4°, 0.1°-13.2°) compared to Asian males. Nasolabial angle was more obtuse in Caucasian females than in African (17.4°, 0.2°-35.3°) and Asian (9.1°, 0.4°-17.3°) females. Additional inter-ethnic/racial variations were revealed when the level of statistical significance was set at 0.10. A comprehensive database for angular and linear facial measurements was established from existing studies using the statistical model and inter-ethnic/racial variations of facial features were observed. The results have implications for clinical practice and highlight the need and value for high quality photogrammetric studies.
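
    The normal-normal random-effects model underlying such a meta-analysis can be sketched on a small grid: each study contributes a mean and standard error, and between-study heterogeneity enters as an extra variance component. The study values below are invented, not taken from the 38 reviewed studies.

```python
import numpy as np

# Pooled posterior for a facial angle from per-study summaries.
y = np.array([134.2, 131.5, 136.0, 133.1, 132.4])   # study means (deg)
se = np.array([1.1, 0.9, 1.4, 1.0, 1.2])            # study SEs (deg)

# Grid over the pooled mean mu and heterogeneity tau (flat priors).
mu_grid = np.linspace(125, 140, 301)
tau_grid = np.linspace(0.01, 5, 100)
logp = np.zeros((301, 100))
for i, mu in enumerate(mu_grid):
    for j, tau in enumerate(tau_grid):
        v = se**2 + tau**2          # marginal variance per study
        logp[i, j] = np.sum(-0.5 * np.log(v) - 0.5 * (y - mu) ** 2 / v)
post = np.exp(logp - logp.max())
post /= post.sum()
mu_mean = (post.sum(axis=1) * mu_grid).sum()   # posterior mean of mu
print(round(float(mu_mean), 1))
```

    Posterior means and credible intervals for each ethnic/racial group are read off marginals like this one, and linear contrasts between groups are differences of such quantities.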

  16. A note on probabilistic models over strings: the linear algebra approach.

    PubMed

    Bouchard-Côté, Alexandre

    2013-12-01

    Probabilistic models over strings have played a key role in developing methods that take into consideration indels as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference on these probabilistic models, in which an important theoretical question is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a set of results based on this linear algebra view that facilitate the analysis and design of inference algorithms on string-valued graphical models. As an illustration, we use this method to give a new elementary proof of a known result on the complexity of inference on the "TKF91" model, a well-known probabilistic model over strings. Compared to previous work, our method of proof is easier to extend to other models, since it relies on a novel weak condition, triangular transducers, which is easy to establish in practice. The linear algebra view provides a concise way of describing transducer algorithms and their compositions, opens the possibility of transferring fast linear algebra libraries (for example, based on GPUs), as well as low-rank matrix approximation methods, to string-valued inference problems.
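
    The linear algebra view can be illustrated on a toy probabilistic automaton (not the TKF91 model): with per-symbol transition matrices and per-state stopping probabilities, the normalization over all finite strings collapses to a matrix inverse, and for a properly normalized automaton it must equal 1.

```python
import numpy as np

# Two-state automaton over alphabet {a, b}. Entry [i, j] of M_a is
# the probability of emitting 'a' while moving from state i to j.
M_a = np.array([[0.3, 0.1],
                [0.0, 0.4]])
M_b = np.array([[0.2, 0.1],
                [0.1, 0.3]])
M = M_a + M_b                     # total one-step transition mass
start = np.array([1.0, 0.0])      # start in state 0
stop = np.array([0.3, 0.2])       # per-state stopping probabilities

# Sum over all string lengths: start @ (I + M + M^2 + ...) @ stop
# = start @ inv(I - M) @ stop, the geometric series in matrix form.
total = start @ np.linalg.inv(np.eye(2) - M) @ stop
print(round(float(total), 4))
```

    Because each row's transition mass plus its stopping probability sums to 1, the computed normalization is exactly 1.0, a convenient sanity check; for unnormalized transducer compositions the same inverse yields the partition function.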

  17. How to Detect the Location and Time of a Covert Chemical Attack: A Bayesian Approach

    DTIC Science & Technology

    2009-12-01

    Inverse Problems, Design and Optimization Symposium 2004. Rio de Janeiro, Brazil. Chan, R., and Yee, E. (1997). A simple model for the probability...sensor interpretation applications and has been successfully applied, for example, to estimate the source strength of pollutant releases in multi...coagulation, and second-order pollutant diffusion in sorption-desorption, are not linear. Furthermore, wide uncertainty bounds exist for several of

  18. Log-Linear Models for Gene Association

    PubMed Central

    Hu, Jianhua; Joshi, Adarsh; Johnson, Valen E.

    2009-01-01

    We describe a class of log-linear models for the detection of interactions in high-dimensional genomic data. This class of models leads to a Bayesian model selection algorithm that can be applied to data that have been reduced to contingency tables using ranks of observations within subjects, and discretization of these ranks within gene/network components. Many normalization issues associated with the analysis of genomic data are thereby avoided. A prior density based on Ewens’ sampling distribution is used to restrict the number of interacting components assigned high posterior probability, and the calculation of posterior model probabilities is expedited by approximations based on the likelihood ratio statistic. Simulation studies are used to evaluate the efficiency of the resulting algorithm for known interaction structures. Finally, the algorithm is validated in a microarray study for which it was possible to obtain biological confirmation of detected interactions. PMID:19655032

  19. Model Selection in Historical Research Using Approximate Bayesian Computation

    PubMed Central

    Rubio-Campillo, Xavier

    2016-01-01

    Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study: This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
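
    The model-selection logic can be sketched with a toy ABC comparison of the square and linear laws: simulate battles under each model, and count which model's simulations reproduce the observed outcome. The discretization, the /R0 normalization of the linear law, the parameter range, and the survivor-count summary are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate(model, r, R0=1000.0, B0=600.0, dt=0.05):
    # Discretized Lanchester laws: attrition proportional to enemy
    # strength (square law) or to both strengths (linear law).
    R, B = R0, B0
    while R > 1 and B > 1:
        if model == "square":
            dR, dB = -r * B, -r * R
        else:
            dR, dB = -r * R * B / R0, -r * R * B / R0
        R, B = R + dt * dR, B + dt * dB
    return max(R, 0.0), max(B, 0.0)

# "Observed" battle generated under the square law.
obs, _ = simulate("square", r=0.5)

# ABC model choice: acceptance counts approximate posterior model
# probabilities (uniform model prior, uniform effectiveness prior).
hits = {"square": 0, "linear": 0}
for _ in range(500):
    model = "square" if rng.random() < 0.5 else "linear"
    r = rng.uniform(0.1, 1.0)
    sim, _ = simulate(model, r)
    if abs(sim - obs) < 20:
        hits[model] += 1
print(hits)
```

    With these initial strengths the two laws predict very different survivor counts, so the acceptance counts, and hence the implied Bayes factor, decisively favour the generating model.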

  20. Bayesian Model Averaging for Propensity Score Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Chen, Jianshen

    2013-01-01

    The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…

  1. Bayesian inference to identify parameters in viscoelasticity

    NASA Astrophysics Data System (ADS)

    Rappel, Hussein; Beex, Lars A. A.; Bordas, Stéphane P. A.

    2017-08-01

    This contribution discusses Bayesian inference (BI) as an approach to identify parameters in viscoelasticity. The aims are: (i) to show that the prior has a substantial influence for viscoelasticity, (ii) to show that this influence decreases for an increasing number of measurements and (iii) to show how different types of experiments influence the identified parameters and their uncertainties. The standard linear solid model is the material description of interest and a relaxation test, a constant strain-rate test and a creep test are the tensile experiments focused on. The experimental data are artificially created, allowing us to make a one-to-one comparison between the input parameters and the identified parameter values. Besides dealing with the aforementioned issues, we believe that this contribution forms a comprehensible start for those interested in applying BI in viscoelasticity.
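
    A miniature version of the identification problem uses a grid posterior over two standard-linear-solid parameters and synthetic relaxation data, echoing the paper's one-to-one comparison between input and identified values. The parameter values, units, and noise level below are invented.

```python
import numpy as np

rng = np.random.default_rng(6)

# Relaxation test: stress decays as E_inf + E1 * exp(-t / tau).
E_inf, E1_true, tau_true = 1.0, 2.0, 0.5    # illustrative units
t = np.linspace(0.05, 3.0, 40)
data = (E_inf + E1_true * np.exp(-t / tau_true)
        + 0.05 * rng.standard_normal(t.size))

# Grid posterior over (E1, tau) with a flat prior, Gaussian noise.
E1_g = np.linspace(0.5, 4.0, 120)
tau_g = np.linspace(0.1, 2.0, 120)
E1m, taum = np.meshgrid(E1_g, tau_g, indexing="ij")
pred = E_inf + E1m[..., None] * np.exp(-t / taum[..., None])
logp = -0.5 * np.sum((data - pred) ** 2, axis=-1) / 0.05**2
post = np.exp(logp - logp.max())
post /= post.sum()
E1_hat = float((post * E1m).sum())     # posterior means
tau_hat = float((post * taum).sum())
print(round(E1_hat, 2), round(tau_hat, 2))
```

    Tightening or widening the grid's implicit prior range shows the prior influence the paper highlights; adding more measurement points shrinks the posterior around the true inputs.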

  2. ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration

    PubMed Central

    Bottolo, Leonardo; Langley, Sarah R.; Petretto, Enrico; Tiret, Laurence; Tregouet, David; Richardson, Sylvia

    2011-01-01

    Summary: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the ‘large p, small n’ case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. Availability: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html. Contact: l.bottolo@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21233165

  3. A recursive Bayesian updating model of haptic stiffness perception.

    PubMed

    Wu, Bing; Klatzky, Roberta L

    2018-06-01

    Stiffness of many materials follows Hooke's Law, but the mechanism underlying the haptic perception of stiffness is not as simple as the physical definition suggests. The present experiments support a model by which stiffness perception is adaptively updated during dynamic interaction. Participants actively explored virtual springs and estimated their stiffness relative to a reference. The stimuli were simulations of linear springs, or nonlinear springs created by modulating a linear counterpart with a low-amplitude, half-cycle (Experiment 1) or full-cycle (Experiment 2) sinusoidal force. Experiment 1 showed that subjective stiffness increased (decreased) as a linear spring was positively (negatively) modulated by a half-sinewave force. In Experiment 2, the opposite pattern was observed for full-sinewave modulations. Modeling showed that the results were best described by an adaptive process that sequentially and recursively updates an estimate of stiffness using the force and displacement information sampled over trajectory and time.
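
    The adaptive-updating account can be caricatured as a recursive Gaussian (Kalman-style) update of a stiffness estimate from sequential force-displacement samples gathered along the exploration trajectory. This is a generic sequential-update sketch with invented numbers, not the paper's fitted model.

```python
import numpy as np

rng = np.random.default_rng(7)

# True spring and noisy force observations along the trajectory.
k_true, sigma_f = 150.0, 0.5           # N/m, force noise (N)
x = rng.uniform(0.001, 0.02, 50)       # sampled displacements (m)
f = k_true * x + sigma_f * rng.standard_normal(50)

mu, var = 100.0, 50.0**2               # prior belief about stiffness
for xi, fi in zip(x, f):
    # Each sample is a linear observation fi = k * xi + noise, so
    # the Gaussian posterior over k updates in closed form.
    var_new = 1.0 / (1.0 / var + xi**2 / sigma_f**2)
    mu = var_new * (mu / var + xi * fi / sigma_f**2)
    var = var_new
print(round(mu, 1), round(var**0.5, 2))
```

    Each force-displacement pair sharpens the estimate a little, so the perceived stiffness drifts from the prior toward the physically simulated value as exploration proceeds.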

  4. Bayesian Inference for Functional Dynamics Exploring in fMRI Data.

    PubMed

    Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing

    2016-01-01

    This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. In particular, we focus on one long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool for encoding dependence relationships among variables under uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer functional interaction patterns based on the corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, namely the Bayesian Magnitude Change Point Model (BMCPM), the Bayesian Connectivity Change Point Model (BCCPM), and the Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more refined Bayesian inference models will emerge and play increasingly important roles in modeling brain functions in the years to come.
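
    The magnitude-change-point idea behind models such as BMCPM can be reduced to a discrete posterior over a single boundary in a Gaussian signal: score every candidate boundary by how well two constant-mean segments explain the series. This is a deliberately stripped-down sketch, not any of the three reviewed models.

```python
import numpy as np

rng = np.random.default_rng(8)

# Synthetic "fMRI magnitude" series with one mean shift at t = 70.
T, cp_true = 120, 70
y = np.concatenate([rng.normal(0.0, 1.0, cp_true),
                    rng.normal(1.2, 1.0, T - cp_true)])

logp = np.full(T, -np.inf)
for k in range(10, T - 10):            # candidate boundaries
    a, b = y[:k], y[k:]
    # Profile each segment's mean; unit variance assumed known.
    logp[k] = -0.5 * (np.sum((a - a.mean())**2)
                      + np.sum((b - b.mean())**2))
post = np.exp(logp - logp.max())
post /= post.sum()                     # posterior over the boundary
cp_hat = int(np.argmax(post))
print(cp_hat)
```

    The full models extend this in two directions: multivariate signals with connectivity rather than magnitude changes, and multiple boundaries inferred jointly.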

  5. Receptive Field Inference with Localized Priors

    PubMed Central

    Park, Mijung; Pillow, Jonathan W.

    2011-01-01

    The linear receptive field describes a mapping from sensory stimuli to a one-dimensional variable governing a neuron's spike response. However, traditional receptive field estimators such as the spike-triggered average converge slowly and often require large amounts of data. Bayesian methods seek to overcome this problem by biasing estimates towards solutions that are more likely a priori, typically those with small, smooth, or sparse coefficients. Here we introduce a novel Bayesian receptive field estimator designed to incorporate locality, a powerful form of prior information about receptive field structure. The key to our approach is a hierarchical receptive field model that flexibly adapts to localized structure in both spacetime and spatiotemporal frequency, using an inference method known as empirical Bayes. We refer to our method as automatic locality determination (ALD), and show that it can accurately recover various types of smooth, sparse, and localized receptive fields. We apply ALD to neural data from retinal ganglion cells and V1 simple cells, and find it achieves error rates several times lower than standard estimators. Thus, estimates of comparable accuracy can be achieved with substantially less data. Finally, we introduce a computationally efficient Markov Chain Monte Carlo (MCMC) algorithm for fully Bayesian inference under the ALD prior, yielding accurate Bayesian confidence intervals for small or noisy datasets. PMID:22046110
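
    A simplified stand-in for the locality/smoothness prior is a MAP estimate under a second-difference Gaussian prior. Unlike ALD, the hyperparameter below is fixed rather than learned by empirical Bayes, and the receptive field and stimuli are synthetic.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthetic localized receptive field and white-noise stimuli.
D = 40
w_true = np.exp(-0.5 * ((np.arange(D) - 20) / 3.0) ** 2)
X = rng.standard_normal((100, D))
y = X @ w_true + rng.standard_normal(100)

# Second-difference operator: penalizes rough weight profiles,
# biasing the estimate toward smooth, hence compact, solutions.
L = np.zeros((D - 2, D))
for i in range(D - 2):
    L[i, i:i + 3] = [1.0, -2.0, 1.0]
lam = 100.0
w_map = np.linalg.solve(X.T @ X + lam * (L.T @ L), X.T @ y)

err_map = float(np.linalg.norm(w_map - w_true))
err_ls = float(np.linalg.norm(
    np.linalg.lstsq(X, y, rcond=None)[0] - w_true))
print(round(err_map, 3), round(err_ls, 3))
```

    With only 100 noisy trials the smoothness prior cuts the error well below the unregularized least-squares estimate, which is the "comparable accuracy with substantially less data" effect the abstract reports for ALD.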

  6. Combination of dynamic Bayesian network classifiers for the recognition of degraded characters

    NASA Astrophysics Data System (ADS)

    Likforman-Sulem, Laurence; Sigelle, Marc

    2009-01-01

    We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either independent or coupled, for the recognition of degraded characters. The independent classifiers are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers are then combined linearly at the decision level. We compare the different classifiers (independent, coupled, or linearly combined) on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our results show that coupled DBNs perform better on degraded characters than the linear combination of independent HMM scores. Our results also show that the best classifier is obtained by linearly combining the scores of the best coupled DBN and the best independent HMM.

  7. Getting more from accuracy and response time data: methods for fitting the linear ballistic accumulator.

    PubMed

    Donkin, Chris; Averell, Lee; Brown, Scott; Heathcote, Andrew

    2009-11-01

Cognitive models of the decision process provide greater insight into response time and accuracy than do standard ANOVA techniques. However, such models can be mathematically and computationally difficult to apply. We provide instructions and computer code for three methods for estimating the parameters of the linear ballistic accumulator (LBA), a new and computationally tractable model of decisions between two or more choices. These methods (a Microsoft Excel worksheet, scripts for the statistical program R, and code implementing the LBA in the Bayesian sampling software WinBUGS) vary in their flexibility and user accessibility. We also provide scripts in R that produce a graphical summary of the data and model predictions. In a simulation study, we explored the effect of sample size on parameter recovery for each method. The materials discussed in this article may be downloaded as a supplement from http://brm.psychonomic-journals.org/content/supplemental.
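Although the article's own materials are in Excel, R, and WinBUGS, the LBA's generative process is simple enough to sketch here: each accumulator draws a uniform start point and a normal drift rate, rises linearly, and the first to reach threshold determines the response. The parameter values below (`A`, `b`, `s`, `t0`, and the drift means) are illustrative, not taken from the paper.

```python
import numpy as np

def lba_trial(v, A=0.5, b=1.0, s=0.3, t0=0.2, rng=None):
    """Simulate one LBA trial: each accumulator starts at k ~ U(0, A),
    rises linearly with drift d ~ N(v_i, s), and the first to reach the
    threshold b determines the response; t0 is non-decision time."""
    if rng is None:
        rng = np.random.default_rng()
    v = np.asarray(v, dtype=float)
    while True:  # resample if no accumulator has a positive drift
        k = rng.uniform(0.0, A, size=v.size)
        d = rng.normal(v, s)
        if np.any(d > 0):
            break
    t = np.where(d > 0, (b - k) / d, np.inf)  # time to threshold
    choice = int(np.argmin(t))
    return choice, float(t[choice] + t0)

rng = np.random.default_rng(1)
trials = [lba_trial([1.2, 0.6], rng=rng) for _ in range(2000)]
p_correct = float(np.mean([c == 0 for c, _ in trials]))
mean_rt = float(np.mean([rt for _, rt in trials]))
```

Parameter estimation then amounts to adjusting `v`, `A`, `b`, `s`, and `t0` until simulated (or analytically computed) choice proportions and RT distributions match the data.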

  8. Mean Field Variational Bayesian Data Assimilation

    NASA Astrophysics Data System (ADS)

    Vrettas, M.; Cornford, D.; Opper, M.

    2012-04-01

Current data assimilation schemes propose a range of approximate solutions to the classical data assimilation problem, particularly state estimation. Broadly there are three main active research areas: ensemble Kalman filter methods which rely on statistical linearization of the model evolution equations, particle filters which provide a discrete point representation of the posterior filtering or smoothing distribution and 4DVAR methods which seek the most likely posterior smoothing solution. In this paper we present a recent extension to our variational Bayesian algorithm which seeks the most probable posterior distribution over the states, within the family of non-stationary Gaussian processes. Our original work on variational Bayesian approaches to data assimilation sought the best approximating time varying Gaussian process to the posterior smoothing distribution for stochastic dynamical systems. This approach was based on minimising the Kullback-Leibler divergence between the true posterior over paths, and our Gaussian process approximation. So long as the observation density was sufficiently high to bring the posterior smoothing density close to Gaussian, the algorithm proved very effective on lower-dimensional systems. For higher-dimensional systems, however, the algorithm was computationally very demanding. We have been developing a mean field version of the algorithm which treats the state variables at a given time as being independent in the posterior approximation, but still accounts for their relationships between each other in the mean solution arising from the original dynamical system. In this work we present the new mean field variational Bayesian approach, illustrating its performance on a range of classical data assimilation problems. We discuss the potential and limitations of the new approach.
We emphasise that the variational Bayesian approach we adopt, in contrast to other variational approaches, provides a bound on the marginal likelihood of the observations given parameters in the model, which also allows inference of parameters such as observation errors, and parameters in the model and model error representation, particularly if this is written as a deterministic form with small additive noise. We stress that our approach can address very long time windows and weak-constraint settings. Like traditional variational approaches, our Bayesian variational method has the benefit of being posed as an optimisation problem. We finish with a sketch of the future directions for our approach.
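Mean-field variational Bayes is easiest to see on a toy conjugate model rather than a dynamical system. The sketch below is not the authors' data assimilation algorithm: it runs coordinate-ascent updates for a factorized approximation q(mu)q(tau) to the posterior over a Gaussian's mean and precision, with invented priors and data.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(2.0, 1.0, size=200)   # invented data: true mean 2, precision 1
N, xbar = x.size, x.mean()

# Broad conjugate priors: mu ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0).
mu0, lam0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3

# Mean-field factorization q(mu, tau) = q(mu) q(tau): coordinate ascent
# alternates two closed-form updates, each holding the other factor fixed.
E_tau = 1.0
for _ in range(50):
    lam_n = (lam0 + N) * E_tau                      # q(mu) precision
    mu_n = (lam0 * mu0 + N * xbar) / (lam0 + N)     # q(mu) mean
    a_n = a0 + 0.5 * (N + 1)
    # Expected squared deviations under the current q(mu).
    e_sq = (np.sum((x - mu_n) ** 2) + N / lam_n
            + lam0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n))
    b_n = b0 + 0.5 * e_sq
    E_tau = a_n / b_n                               # q(tau) mean

post_mean, post_prec = float(mu_n), float(E_tau)
```

The independence assumption inside q is exactly the simplification the mean-field data assimilation algorithm makes across state variables, trading some posterior correlation structure for tractability.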

  9. Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola

    PubMed Central

    Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope

    2010-01-01

The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under 5 years of age, we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from its lowest value in Namibe province to its highest in Malanje province. The odds of parasitaemia were 41% lower (CI: 14%, 60%) for children living in a household with a higher number of ITNs per person than for those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775

  10. Population-level differences in disease transmission: A Bayesian analysis of multiple smallpox epidemics

    PubMed Central

    Elderd, Bret D.; Dwyer, Greg; Dukic, Vanja

    2013-01-01

    Estimates of a disease’s basic reproductive rate R0 play a central role in understanding outbreaks and planning intervention strategies. In many calculations of R0, a simplifying assumption is that different host populations have effectively identical transmission rates. This assumption can lead to an underestimate of the overall uncertainty associated with R0, which, due to the non-linearity of epidemic processes, may result in a mis-estimate of epidemic intensity and miscalculated expenditures associated with public-health interventions. In this paper, we utilize a Bayesian method for quantifying the overall uncertainty arising from differences in population-specific basic reproductive rates. Using this method, we fit spatial and non-spatial susceptible-exposed-infected-recovered (SEIR) models to a series of 13 smallpox outbreaks. Five outbreaks occurred in populations that had been previously exposed to smallpox, while the remaining eight occurred in Native-American populations that were naïve to the disease at the time. The Native-American outbreaks were close in a spatial and temporal sense. Using Bayesian Information Criterion (BIC), we show that the best model includes population-specific R0 values. These differences in R0 values may, in part, be due to differences in genetic background, social structure, or food and water availability. As a result of these inter-population differences, the overall uncertainty associated with the “population average” value of smallpox R0 is larger, a finding that can have important consequences for controlling epidemics. In general, Bayesian hierarchical models are able to properly account for the uncertainty associated with multiple epidemics, provide a clearer understanding of variability in epidemic dynamics, and yield a better assessment of the range of potential risks and consequences that decision makers face. PMID:24021521
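To see why population-specific R0 values matter for epidemic intensity, a minimal forward simulation helps: the final epidemic size depends non-linearly on R0, so averaging R0 across populations misrepresents the range of outcomes. The deterministic SEIR sketch below is not the authors' Bayesian fitting procedure; the rate parameters and R0 values are purely illustrative.

```python
def seir_final_size(r0, gamma=0.2, sigma=0.3, days=400, dt=0.1, i0=1e-4):
    """Deterministic SEIR epidemic integrated with forward Euler steps;
    returns the fraction of the population ever infected. beta = R0 * gamma."""
    beta = r0 * gamma
    s, e, i, r = 1.0 - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i          # S -> E flow
        ds = -new_inf
        de = new_inf - sigma * e        # E -> I at rate sigma
        di = sigma * e - gamma * i      # I -> R at rate gamma
        dr = gamma * i
        s, e, i, r = s + ds * dt, e + de * dt, i + di * dt, r + dr * dt
    return r

size_low, size_high = seir_final_size(1.5), seir_final_size(5.0)
```

A naive population like the ones in the study (higher R0) suffers a far larger final size than a previously exposed one, which is why the hierarchical model's population-specific R0 values widen the uncertainty around any single "average" prediction.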

  11. Quantum Bayesian networks with application to games displaying Parrondo's paradox

    NASA Astrophysics Data System (ADS)

    Pejic, Michael

Bayesian networks and their accompanying graphical models are widely used for prediction and analysis across many disciplines. We will reformulate these in terms of linear maps. This reformulation will suggest a natural extension, which we will show is equivalent to standard textbook quantum mechanics. Therefore, this extension will be termed quantum. However, the term quantum should not be taken to imply this extension is necessarily only of utility in situations traditionally thought of as in the domain of quantum mechanics. In principle, it may be employed in any modelling situation, say forecasting the weather or the stock market; it is up to experiment to determine if this extension is useful in practice. Even restricting to the domain of quantum mechanics, with this new formulation the advantages of Bayesian networks can be maintained for models incorporating quantum and mixed classical-quantum behavior. The use of these will be illustrated by various basic examples. Parrondo's paradox refers to the situation where two multi-round games with a fixed winning criterion, both with probability greater than one-half for one player to win, are combined. Using a possibly biased coin to determine the rule to employ for each round, paradoxically, the previously losing player now wins the combined game with probability greater than one-half. Using the extended Bayesian networks, we will formulate and analyze classical observed, classical hidden, and quantum versions of a game that displays this paradox, finding bounds for the discrepancy from naive expectations for the occurrence of the paradox. A quantum paradox inspired by Parrondo's paradox will also be analyzed. We will prove a bound for the discrepancy from naive expectations for this paradox as well. Games involving quantum walks that achieve this bound will be presented.
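The classical form of the paradox is easy to verify by simulation. The sketch below uses the textbook parameterization (game B's branches keyed to capital mod 3, bias `EPS = 0.005`); this is a standard illustrative setup, not the specific game analyzed in this work.

```python
import numpy as np

rng = np.random.default_rng(42)
EPS = 0.005  # small bias making both games losing on their own

def play(strategy, steps=100_000):
    """Game A: a slightly unfavourable coin. Game B: a bad coin when
    capital is divisible by 3, a good coin otherwise. 'random' picks
    A or B with a fair coin each round."""
    capital = 0
    for _ in range(steps):
        game = strategy if strategy != "random" else (
            "A" if rng.random() < 0.5 else "B")
        if game == "A":
            p_win = 0.5 - EPS
        elif capital % 3 == 0:
            p_win = 0.1 - EPS    # the "bad" branch of game B
        else:
            p_win = 0.75 - EPS   # the "good" branch of game B
        capital += 1 if rng.random() < p_win else -1
    return capital

final_a, final_b, final_mix = play("A"), play("B"), play("random")
```

Played alone, each game drifts downward, yet randomly alternating them yields a positive drift: mixing in game A reshapes how often game B's bad branch is visited.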

  12. Bayesian structural equation modeling in sport and exercise psychology.

    PubMed

    Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus

    2015-08-01

Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, we introduce the foundations of Bayesian analysis and illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed, as well as potential advantages and caveats of the Bayesian approach.

  13. Bayesian inference based on stationary Fokker-Planck sampling.

    PubMed

    Berrones, Arturo

    2010-06-01

A novel formalism for Bayesian learning in the context of complex inference models is proposed. The method is based on the use of the stationary Fokker-Planck (SFP) approach to sample from the posterior density. Stationary Fokker-Planck sampling generalizes the Gibbs sampler algorithm for arbitrary and unknown conditional densities. By the SFP procedure, approximate analytical expressions for the conditionals and marginals of the posterior can be constructed. At each stage of SFP, the approximate conditionals are used to define a Gibbs sampling process, which is convergent to the full joint posterior. By the analytical marginals, efficient learning methods in the context of artificial neural networks are outlined. Offline and incremental Bayesian inference and maximum likelihood estimation from the posterior are performed in classification and regression examples. A comparison of SFP with other Monte Carlo strategies in the general problem of sampling from arbitrary densities is also presented. It is shown that SFP is able to jump large low-probability regions without the need of a careful tuning of any step-size parameter. In fact, the SFP method requires only a small set of meaningful parameters that can be selected following clear, problem-independent guidelines. The computational cost of SFP, measured in terms of loss function evaluations, grows linearly with the given model's dimension.
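Since SFP generalizes the Gibbs sampler, it helps to recall what plain Gibbs sampling looks like when the full conditionals are known exactly. The sketch below targets a bivariate normal (an invented example, not the paper's SFP procedure), alternating draws from the two exact conditionals; SFP's contribution is to construct approximate conditionals when no such closed forms exist.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                     # target: standard bivariate normal, corr rho
n_keep, burn = 20_000, 1_000

x, y = 0.0, 0.0
samples = np.empty((n_keep, 2))
for step in range(n_keep + burn):
    # Each full conditional of a bivariate normal is univariate normal:
    # x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
    x = rng.normal(rho * y, np.sqrt(1.0 - rho ** 2))
    y = rng.normal(rho * x, np.sqrt(1.0 - rho ** 2))
    if step >= burn:
        samples[step - burn] = (x, y)

emp_rho = float(np.corrcoef(samples.T)[0, 1])
```

The empirical correlation of the retained draws converges to the target value, confirming the chain samples the full joint distribution.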

  14. 3-D linear inversion of gravity data: method and application to Basse-Terre volcanic island, Guadeloupe, Lesser Antilles

    NASA Astrophysics Data System (ADS)

    Barnoud, Anne; Coutant, Olivier; Bouligand, Claire; Gunawan, Hendra; Deroussi, Sébastien

    2016-04-01

    We use a Bayesian formalism combined with a grid node discretization for the linear inversion of gravimetric data in terms of 3-D density distribution. The forward modelling and the inversion method are derived from seismological inversion techniques in order to facilitate joint inversion or interpretation of density and seismic velocity models. The Bayesian formulation introduces covariance matrices on model parameters to regularize the ill-posed problem and reduce the non-uniqueness of the solution. This formalism favours smooth solutions and allows us to specify a spatial correlation length and to perform inversions at multiple scales. We also extract resolution parameters from the resolution matrix to discuss how well our density models are resolved. This method is applied to the inversion of data from the volcanic island of Basse-Terre in Guadeloupe, Lesser Antilles. A series of synthetic tests are performed to investigate advantages and limitations of the methodology in this context. This study results in the first 3-D density models of the island of Basse-Terre for which we identify: (i) a southward decrease of densities parallel to the migration of volcanic activity within the island, (ii) three dense anomalies beneath Petite Plaine Valley, Beaugendre Valley and the Grande-Découverte-Carmichaël-Soufrière Complex that may reflect the trace of former major volcanic feeding systems, (iii) shallow low-density anomalies in the southern part of Basse-Terre, especially around La Soufrière active volcano, Piton de Bouillante edifice and along the western coast, reflecting the presence of hydrothermal systems and fractured and altered rocks.
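The regularized linear inversion at the heart of this formalism can be sketched on a 1-D toy problem. Everything below is an illustrative assumption rather than the authors' setup: a random forward operator stands in for the gravimetric kernel, and an exponential model covariance with correlation length `L` plays the role of the smoothing prior.

```python
import numpy as np

rng = np.random.default_rng(3)
n_model, n_data = 50, 30
xs = np.linspace(0.0, 1.0, n_model)

m_true = np.sin(2.0 * np.pi * xs)                 # smooth "density" profile
G = rng.standard_normal((n_data, n_model)) / np.sqrt(n_model)  # toy kernel
sigma_d = 0.05
d = G @ m_true + rng.normal(0.0, sigma_d, n_data)

# Model covariance with correlation length L: the Bayesian regularizer
# that favours smooth solutions and reduces the non-uniqueness of the
# ill-posed problem.
L, sigma_m = 0.1, 1.0
Cm = sigma_m**2 * np.exp(-np.abs(xs[:, None] - xs[None, :]) / L)
Cd_inv = np.eye(n_data) / sigma_d**2

# Posterior mean: m = (G^T Cd^-1 G + Cm^-1)^-1 G^T Cd^-1 d
A = G.T @ Cd_inv @ G + np.linalg.inv(Cm)
m_post = np.linalg.solve(A, G.T @ Cd_inv @ d)

rms = float(np.sqrt(np.mean((m_post - m_true) ** 2)))
```

Changing `L` performs the multi-scale inversions mentioned in the abstract: a larger correlation length enforces smoother models at the cost of fine structure.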

  15. Multi-objective calibration and uncertainty analysis of hydrologic models; A comparative study between formal and informal methods

    NASA Astrophysics Data System (ADS)

    Shafii, M.; Tolson, B.; Matott, L. S.

    2012-04-01

Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in higher levels of complexity being built into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically-based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality, which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU, which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE, which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden and predictive capacity, which are evaluated based on multiple comparative measures.
The measures for comparison are calculated both for calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.
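Of the informal methods listed, GLUE is the simplest to sketch: sample parameter sets from a prior range, keep those whose informal likelihood exceeds a behavioural threshold, and read prediction limits off the retained ensemble. The toy model, the NSE threshold, and the parameter ranges below are illustrative assumptions, not the study's HYMOD setup.

```python
import numpy as np

rng = np.random.default_rng(7)

def model(theta, x):
    """Toy 2-parameter stand-in for a rainfall-runoff model."""
    a, k = theta
    return a * np.exp(-k * x)

x = np.linspace(0.0, 5.0, 40)
obs = model((2.0, 0.8), x) + rng.normal(0.0, 0.1, x.size)

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 = perfect, <= 0 = no better than the mean."""
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

# GLUE: sample parameters from the prior range and keep the 'behavioural'
# sets whose informal likelihood (here NSE) exceeds a threshold.
thetas = rng.uniform([0.5, 0.1], [4.0, 2.0], size=(5000, 2))
scores = np.array([nse(model(t, x), obs) for t in thetas])
behavioural = thetas[scores > 0.8]

# Prediction limits from the behavioural ensemble (5th-95th percentiles).
sims = np.array([model(t, x) for t in behavioural])
lower, upper = np.percentile(sims, [5, 95], axis=0)
coverage = float(np.mean((obs >= lower) & (obs <= upper)))
```

The behavioural threshold is the method's main subjective choice, which is precisely why such informal limits need to be benchmarked against formal MCMC-based intervals as the study does.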

  16. Smooth random change point models.

    PubMed

    van den Hout, Ardo; Muniz-Terrera, Graciela; Matthews, Fiona E

    2011-03-15

Change point models are used to describe processes over time that show a change in direction. An example of such a process is cognitive ability, where a decline a few years before death is sometimes observed. A broken-stick model consists of two linear parts and a breakpoint where the two lines intersect. Alternatively, models can be formulated that imply a smooth change between the two linear parts. Change point models can be extended by adding random effects to account for variability between subjects. A new smooth change point model is introduced and examples are presented that show how change point models can be estimated using functions in R for mixed-effects models. Bayesian inference using WinBUGS is also discussed. The methods are illustrated using data from a population-based longitudinal study of ageing, the Cambridge City over 75 Cohort Study. The aim is to identify how many years before death individuals experience a change in the rate of decline of their cognitive ability. Copyright © 2010 John Wiley & Sons, Ltd.
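The fixed-effects part of a broken-stick model, and one way to smooth its kink, can be written down directly. The smoothing below uses a softplus (log-exp) transition of width `eps`; this is one common choice, not necessarily the smooth model proposed in the paper, and the coefficients are invented.

```python
import numpy as np

def broken_stick(t, b0, b1, b2, tau):
    """Two linear phases intersecting at the change point tau: slope b1
    before tau, slope b1 + b2 after it."""
    return b0 + b1 * (t - tau) + b2 * np.maximum(t - tau, 0.0)

def smooth_change(t, b0, b1, b2, tau, eps=1.0):
    """Smooth version: the hinge max(t - tau, 0) is replaced by a
    softplus transition of width eps, removing the kink at tau."""
    hinge = eps * np.log1p(np.exp((t - tau) / eps))
    return b0 + b1 * (t - tau) + b2 * hinge

t = np.linspace(-10.0, 10.0, 201)             # e.g. years relative to death
sharp = broken_stick(t, 5.0, -0.1, -0.4, 0.0)
smooth = smooth_change(t, 5.0, -0.1, -0.4, 0.0, eps=0.5)
```

Far from the change point the two curves coincide; they differ only in a window of width roughly `eps` around `tau`, so the smooth model nests the broken stick as `eps` shrinks. Random effects would then let `b0`, the slopes, and `tau` vary by subject.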

  17. Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems

    NASA Technical Reports Server (NTRS)

    He, Yuning; Davies, Misty Dawn

    2014-01-01

The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.

  18. A Bayesian kriging approach for blending satellite and ground precipitation observations

    USGS Publications Warehouse

    Verdin, Andrew P.; Rajagopalan, Balaji; Kleiber, William; Funk, Christopher C.

    2015-01-01

    Drought and flood management practices require accurate estimates of precipitation. Gauge observations, however, are often sparse in regions with complicated terrain, clustered in valleys, and of poor quality. Consequently, the spatial extent of wet events is poorly represented. Satellite-derived precipitation data are an attractive alternative, though they tend to underestimate the magnitude of wet events due to their dependency on retrieval algorithms and the indirect relationship between satellite infrared observations and precipitation intensities. Here we offer a Bayesian kriging approach for blending precipitation gauge data and the Climate Hazards Group Infrared Precipitation satellite-derived precipitation estimates for Central America, Colombia, and Venezuela. First, the gauge observations are modeled as a linear function of satellite-derived estimates and any number of other variables—for this research we include elevation. Prior distributions are defined for all model parameters and the posterior distributions are obtained simultaneously via Markov chain Monte Carlo sampling. The posterior distributions of these parameters are required for spatial estimation, and thus are obtained prior to implementing the spatial kriging model. This functional framework is applied to model parameters obtained by sampling from the posterior distributions, and the residuals of the linear model are subject to a spatial kriging model. Consequently, the posterior distributions and uncertainties of the blended precipitation estimates are obtained. We demonstrate this method by applying it to pentadal and monthly total precipitation fields during 2009. The model's performance and its inherent ability to capture wet events are investigated. We show that this blending method significantly improves upon the satellite-derived estimates and is also competitive in its ability to represent wet events. 
This procedure also provides a means to estimate a full conditional distribution of the “true” observed precipitation value at each grid cell.

  19. Bayesian seismic inversion based on rock-physics prior modeling for the joint estimation of acoustic impedance, porosity and lithofacies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Passos de Figueiredo, Leandro, E-mail: leandrop.fgr@gmail.com; Grana, Dario; Santos, Marcio

We propose a Bayesian approach for seismic inversion to estimate acoustic impedance, porosity and lithofacies within the reservoir conditioned to post-stack seismic and well data. The link between elastic and petrophysical properties is given by a joint prior distribution for the logarithm of impedance and porosity, based on a rock-physics model. The well conditioning is performed through a background model obtained by well log interpolation. Two different approaches are presented: in the first approach, the prior is defined by a single Gaussian distribution, whereas in the second approach it is defined by a Gaussian mixture to represent the well data multimodal distribution and link the Gaussian components to different geological lithofacies. The forward model is based on a linearized convolutional model. For the single Gaussian case, we obtain an analytical expression for the posterior distribution, resulting in a fast algorithm to compute the solution of the inverse problem, i.e. the posterior distribution of acoustic impedance and porosity as well as the facies probability given the observed data. For the Gaussian mixture prior, it is not possible to obtain the distributions analytically, hence we propose a Gibbs algorithm to perform the posterior sampling and obtain several reservoir model realizations, allowing an uncertainty analysis of the estimated properties and lithofacies. Both methodologies are applied to a real seismic dataset with three wells to obtain 3D models of acoustic impedance, porosity and lithofacies. The methodologies are validated through a blind well test and compared to a standard Bayesian inversion approach. Using the probability of the reservoir lithofacies, we also compute a 3D isosurface probability model of the main oil reservoir in the studied field.

  20. Bayesian inference on risk differences: an application to multivariate meta-analysis of adverse events in clinical trials.

    PubMed

    Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng

    2013-05-01

Multivariate meta-analysis is useful in combining evidence from independent studies which involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models, which assume that risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared to the conventional multivariate generalized linear mixed effects models, including a simple likelihood function, no need to specify a link function, and a closed-form expression for the distribution functions of study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressants treatment in clinical trials.

  1. Development and comparison in uncertainty assessment based Bayesian modularization method in hydrological modeling

    NASA Astrophysics Data System (ADS)

    Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn

    2013-04-01

With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with the standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and the best uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian methods.
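The Metropolis-Hastings algorithm used by the standard Bayesian methods here is short to state: propose a perturbed parameter, accept it with probability min(1, posterior ratio), otherwise keep the current value. The random-walk sketch below targets an invented one-dimensional posterior, not the WASMOD calibration problem.

```python
import numpy as np

rng = np.random.default_rng(11)

def log_post(theta):
    """Unnormalized log-posterior of a hypothetical scalar model
    parameter (a Gaussian stand-in for prior x likelihood)."""
    return -0.5 * (theta - 1.5) ** 2 / 0.25

def metropolis_hastings(log_post, n_iter=20_000, step=0.5, theta0=0.0):
    """Random-walk MH: symmetric proposals, so the acceptance ratio
    reduces to the posterior ratio."""
    theta = theta0
    lp = log_post(theta)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + rng.normal(0.0, step)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

chain = metropolis_hastings(log_post)
posterior_mean = float(chain[2_000:].mean())      # discard burn-in
```

In the hydrological setting, `log_post` would wrap a full WASMOD run plus one of the AR(1) likelihoods, which is why MCMC's large number of simulations becomes the computational bottleneck the abstract mentions.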

  2. Bayesian models: A statistical primer for ecologists

    USGS Publications Warehouse

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. The book presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians; covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more; deemphasizes computer coding in favor of basic principles; and explains how to write out properly factored statistical expressions representing Bayesian models.

  3. An optical flow-based state-space model of the vocal folds.

    PubMed

    Granados, Alba; Brunskog, Jonas

    2017-06-01

    High-speed movies of the vocal fold vibration are valuable data to reveal vocal fold features for voice pathology diagnosis. This work presents a suitable Bayesian model and a purely theoretical discussion for further development of a framework for continuum biomechanical features estimation. A linear and Gaussian nonstationary state-space model is proposed and thoroughly discussed. The evolution model is based on a self-sustained three-dimensional finite element model of the vocal folds, and the observation model involves a dense optical flow algorithm. The results show that the method is able to capture different deformation patterns between the computed optical flow and the finite element deformation, controlled by the choice of the model tissue parameters.

  4. Analysis of trend changes in Northern African palaeo-climate by using Bayesian inference

    NASA Astrophysics Data System (ADS)

    Schütz, Nadine; Trauth, Martin H.; Holschneider, Matthias

    2010-05-01

Climate variability of Northern Africa is of high interest due to the climate-evolutionary linkages under study. The reconstruction of the palaeo-climate over long time scales, including the expected linkages (> 3 Ma), is mainly accessible through proxy data from deep sea drilling cores. Concentrating on published data sets, we try to decipher rhythms and trends to detect correlations between different proxy time series by advanced mathematical methods. Our preliminary data are dust concentrations, an indicator of climatic changes such as humidity, from the ODP sites 659, 721 and 967 situated around Northern Africa. Our interest is in challenging the available time series with advanced statistical methods to detect significant trend changes and to compare different model assumptions. For that purpose, we want to avoid rescaling the time axis to obtain equidistant time steps for filtering methods. Additionally, we demand a plausible description of the errors for the estimated parameters, in terms of confidence intervals. Finally, depending on which model we restrict ourselves to, we also want insight into the parameter structure of the assumed models. To gain this information, we focus on Bayesian inference by formulating the problem as a linear mixed model, so that the expectation and deviation are of linear structure. By using the Bayesian method we can formulate the posterior density as a function of the model parameters and calculate this probability density in the parameter space. Depending on which parameters are of interest, we analytically and numerically marginalize the posterior with respect to the remaining parameters of less interest. We apply a simple linear mixed model to calculate the posterior densities of the ODP sites 659 and 721 concerning the last 5 Ma at maximum. From preliminary calculations on these data sets, we can confirm results gained by the method of breakfit regression combined with block bootstrapping ([1]).
We obtain a significant change point around (1.63 - 1.82) Ma, which correlates with a global climate transition due to the establishment of the Walker circulation ([2]). Furthermore we detect another significant change point around (2.7 - 3.2) Ma, which correlates with the end of the Pliocene warm period (permanent El Niño-like conditions) and the onset of a colder global climate ([3], [4]). The discussion on the algorithm, the results of calculated confidence intervals, the available information about the applied model in the parameter space and the comparison of multiple change point models will be presented. [1] Trauth, M.H., et al., Quaternary Science Reviews, 28, 2009 [2] Wara, M.W., et al., Science, Vol. 309, 2005 [3] Chiang, J.C.H., Annual Review of Earth and Planetary Sciences, Vol. 37, 2009 [4] deMenocal, P., Earth and Planetary Science Letters, 220, 2004

  5. Hierarchical Bayesian Markov switching models with application to predicting spawning success of shovelnose sturgeon

    USGS Publications Warehouse

    Holan, S.H.; Davis, G.M.; Wildhaber, M.L.; DeLonay, A.J.; Papoulias, D.M.

    2009-01-01

    The timing of spawning in fish is tightly linked to environmental factors; however, these factors are not very well understood for many species. Specifically, little information is available to guide recruitment efforts for endangered species such as the sturgeon. Therefore, we propose a Bayesian hierarchical model for predicting the success of spawning of the shovelnose sturgeon which uses both biological and behavioural (longitudinal) data. In particular, we use data that were produced from a tracking study that was conducted in the Lower Missouri River. The data that were produced from this study consist of biological variables associated with readiness to spawn along with longitudinal behavioural data collected by using telemetry and archival data storage tags. These high frequency data are complex both biologically and in the underlying behavioural process. To accommodate such complexity we developed a hierarchical linear regression model that uses an eigenvalue predictor, derived from the transition probability matrix of a two-state Markov switching model with generalized auto-regressive conditional heteroscedastic dynamics. Finally, to minimize the computational burden that is associated with estimation of this model, a parallel computing approach is proposed. © Journal compilation 2009 Royal Statistical Society.
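The eigenvalue predictor mentioned above can be illustrated directly: for a 2 x 2 transition matrix of a two-state Markov switching model, the eigenvalues are 1 and p00 + p11 - 1, and the non-unit eigenvalue summarises the persistence of the hidden behavioural state. A minimal sketch with hypothetical transition probabilities:

```python
import numpy as np

# Hypothetical two-state transition matrix; rows sum to one.
# For a 2x2 stochastic matrix the eigenvalues are 1 and p00 + p11 - 1;
# the non-unit eigenvalue measures regime persistence and can be used
# as a scalar predictor in a downstream regression.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

eig = np.linalg.eigvals(P).real
persistence = np.min(eig)          # the non-unit eigenvalue (here 0.6)
```

Values near 1 indicate long-lived behavioural regimes; values near 0 (or negative) indicate rapid switching.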

  6. A Bayesian approach to modelling the impact of hydrodynamic shear stress on biofilm deformation

    PubMed Central

    Wilkinson, Darren J.; Jayathilake, Pahala Gedara; Rushton, Steve P.; Bridgens, Ben; Li, Bowen; Zuliani, Paolo

    2018-01-01

    We investigate the feasibility of using a surrogate-based method to emulate the deformation and detachment behaviour of a biofilm in response to hydrodynamic shear stress. The influence of shear force, growth rate and viscoelastic parameters on the patterns of growth, structure and resulting shape of microbial biofilms was examined. We develop a statistical modelling approach to this problem, using a combination of Bayesian Poisson regression and dynamic linear models for the emulation. We observe that the hydrodynamic shear force affects biofilm deformation in line with the literature. Sensitivity results also showed that the expected number of shear events, shear flow, yield coefficient for heterotrophic bacteria and extracellular polymeric substance (EPS) stiffness per unit EPS mass are the four principal mechanisms governing bacterial detachment in this study. The sensitivity of the model parameters is temporally dynamic, emphasising the significance of conducting the sensitivity analysis across multiple time points. The surrogate models are shown to perform well, producing a ≈480-fold increase in computational efficiency. We conclude that a surrogate-based approach is effective, and that the resulting biofilm structure is determined primarily by a balance between bacterial growth, viscoelastic parameters and applied shear stress. PMID:29649240
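Bayesian Poisson regression of the kind used in the emulator can be sketched with a Laplace approximation: a Gaussian prior on the coefficients and Newton iterations to the posterior mode. This is an illustrative stand-in (the prior precision and the synthetic data are assumptions), not the authors' emulation code:

```python
import numpy as np

rng = np.random.default_rng(10)

# Synthetic counts with a log-linear mean
n, p = 300, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
b_true = np.array([0.5, 0.8, -0.4])
y = rng.poisson(np.exp(X @ b_true))

alpha = 1.0                              # assumed Gaussian prior precision
b = np.zeros(p)
for _ in range(25):
    mu = np.exp(X @ b)
    grad = X.T @ (y - mu) - alpha * b    # gradient of the log posterior
    H = X.T @ (mu[:, None] * X) + alpha * np.eye(p)
    b = b + np.linalg.solve(H, grad)     # Newton step to the posterior mode

cov = np.linalg.inv(H)                   # Laplace posterior covariance
```

The mode `b` and covariance `cov` give the Gaussian approximation to the posterior that a surrogate can then be built on.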

  7. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    PubMed

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.
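The doubly stochastic Poisson model can be illustrated in one dimension: the study-specific log intensity is a linear combination of a small basis, and foci are simulated by thinning a homogeneous process. The basis, coefficients, and 1-D coordinate axis are illustrative assumptions standing in for the 3-D brain:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_intensity(x, theta):
    """Study-specific log intensity: a linear combination of a small basis
    over a 1-D coordinate axis (illustrative stand-in for 3-D space)."""
    B = np.column_stack([np.ones_like(x), np.sin(2 * np.pi * x),
                         np.cos(2 * np.pi * x)])
    return B @ theta

theta = np.array([2.0, 1.0, 0.5])        # hypothetical basis coefficients

# Simulate foci from the inhomogeneous Poisson process by thinning:
# draw homogeneous candidates at the maximum rate, keep each with
# probability lambda(x) / lambda_max.
grid = np.linspace(0.0, 1.0, 512)
lam_max = np.exp(log_intensity(grid, theta)).max()
n_cand = rng.poisson(lam_max)            # unit-length domain
cand = rng.uniform(0.0, 1.0, n_cand)
keep = rng.uniform(0.0, 1.0, n_cand) < np.exp(log_intensity(cand, theta)) / lam_max
foci = cand[keep]
```

Inference then works backwards: given observed foci across studies, the basis coefficients (and latent factors behind them) are estimated rather than fixed.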

  8. Evidence cross-validation and Bayesian inference of MAST plasma equilibria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nessi, G. T. von; Hole, M. J.; Svensson, J.

    2012-01-15

    In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of the Biot-Savart law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the Joint European Torus [Svensson and Werner, Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable good agreement between Bayesian inference of the last-closed flux-surface and other corroborating data, such as that from force balance considerations using EFIT++ [Appel et al., "A unified approach to equilibrium reconstruction," Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.

  9. Bayesian inference in geomagnetism

    NASA Technical Reports Server (NTRS)

    Backus, George E.

    1988-01-01

    The inverse problem in empirical geomagnetic modeling is investigated, with critical examination of recently published studies. Particular attention is given to the use of Bayesian inference (BI) to select the damping parameter lambda in the uniqueness portion of the inverse problem. The mathematical bases of BI and stochastic inversion are explored, with consideration of bound-softening problems and resolution in linear Gaussian BI. The problem of estimating the radial magnetic field B(r) at the earth core-mantle boundary from surface and satellite measurements is then analyzed in detail, with specific attention to the selection of lambda in the studies of Gubbins (1983) and Gubbins and Bloxham (1985). It is argued that the selection method is inappropriate and leads to lambda values much larger than those that would result if a reasonable bound on the heat flow at the CMB were assumed.
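The damping-parameter question has a clean Bayesian analogue in a toy ridge problem: choose lambda by maximising the evidence (marginal likelihood) of the data under the prior b ~ N(0, I/lambda), rather than by an ad hoc trade-off. A sketch under assumed Gaussian noise with known variance; it is not the geomagnetic model itself:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear inverse problem y = X b + noise
n, p = 50, 10
X = rng.normal(size=(n, p))
b_true = rng.normal(size=p)
sigma = 0.5
y = X @ b_true + rng.normal(0.0, sigma, n)

def log_evidence(lam):
    """Marginal likelihood of y with b integrated out:
    y ~ N(0, sigma^2 I + X X^T / lam) under the prior b ~ N(0, I/lam)."""
    C = sigma ** 2 * np.eye(n) + X @ X.T / lam
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (logdet + y @ np.linalg.solve(C, y))

lams = np.logspace(-3, 3, 61)
lam_best = lams[np.argmax([log_evidence(lam) for lam in lams])]
```

The evidence penalises both over- and under-damping automatically, which is the Bayesian counterpart of choosing lambda from a physical bound.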

  10. LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.

    PubMed

    Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu

    2005-01-01

    Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.

  11. Algorithm for ion beam figuring of low-gradient mirrors.

    PubMed

    Jiao, Changjun; Li, Shengyi; Xie, Xuhui

    2009-07-20

    Ion beam figuring technology for low-gradient mirrors is discussed. Ion beam figuring is a noncontact machining technique in which a beam of high-energy ions is directed toward a target workpiece to remove material in a predetermined and controlled fashion. Owing to this noncontact mode of material removal, problems associated with tool wear and edge effects, which are common in conventional contact polishing processes, are avoided. Based on the Bayesian principle, an iterative dwell time algorithm for planar mirrors is deduced from the computer-controlled optical surfacing (CCOS) principle. Given the properties of the removal function, the shaping process of low-gradient mirrors can be approximated by the linear model for planar mirrors. On this basis, the error surface figuring technology for low-gradient mirrors with a linear path is set up. Given the near-Gaussian property of the removal function, the figuring process with a spiral path can be described by the conventional linear CCOS principle, and a Bayesian-based iterative algorithm can be used to deconvolve the dwell time. Moreover, the selection criterion for the spiral parameter is given. Ion beam figuring with a spiral scan path based on these methods can be used to figure mirrors with non-axisymmetric errors. Experiments on SiC chemical vapor deposition planar and Zerodur paraboloid samples were performed, and the final surface errors are all below λ/100.
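A Bayesian-principle iterative deconvolution of the dwell time, in the spirit described above, can be sketched in one dimension with a Richardson-Lucy-style multiplicative update and a Gaussian removal function. The kernel width, mirror profile, and grid are illustrative assumptions, not the authors' algorithm:

```python
import numpy as np

# 1-D sketch: desired material removal r = k * d (convolution of dwell
# time d with a Gaussian removal function k), solved for d iteratively.
x = np.linspace(-1.0, 1.0, 201)
k = np.exp(-x ** 2 / (2 * 0.05 ** 2))
k /= k.sum()                                     # normalised removal function

d_true = np.maximum(0.0, 1.0 - np.abs(x))        # assumed dwell profile
r = np.convolve(d_true, k, mode="same")          # target removal map

# Richardson-Lucy-style multiplicative update keeps d non-negative
d = np.ones_like(r)
for _ in range(200):
    pred = np.convolve(d, k, mode="same")
    d *= np.convolve(r / np.maximum(pred, 1e-12), k[::-1], mode="same")

fit = np.convolve(d, k, mode="same")             # removal achieved by d
```

The multiplicative form is what makes the iteration "Bayesian-principle": it is the fixed-point update of a maximum-likelihood/MAP deconvolution with a non-negativity constraint built in.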

  12. A study of finite mixture model: Bayesian approach on financial time series data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-07-01

    Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method is a statistical approach used to fit such mixture models. Bayesian methods are widely used because they have attractive asymptotic properties. In addition, the Bayesian method also exhibits consistency, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is selected using the Bayesian Information Criterion. Identifying the number of components is important because a wrong choice may lead to invalid results. The Bayesian method is then utilized to fit the k-component mixture model in order to explore the relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines and Indonesia. The results showed a negative relationship between rubber prices and stock market prices for all selected countries.
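Selecting the number of mixture components by BIC can be sketched with a plain EM fit of univariate Gaussian mixtures: fit k = 1 and k = 2 and keep the k with the smaller BIC. Synthetic, well-separated data; not the financial data of the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])

def gmm_bic(x, k, iters=100):
    """Plain EM for a univariate k-component Gaussian mixture; returns BIC."""
    mu = np.linspace(x.min(), x.max(), k)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters + 1):
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)                  # E step
        r = dens / dens.sum(axis=1, keepdims=True)
        nk = r.sum(axis=0)                                 # M step
        w, mu = nk / len(x), (r * x[:, None]).sum(0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(0) / nk
    ll = np.log(dens.sum(axis=1)).sum()
    n_par = 3 * k - 1            # (k-1) weights, k means, k variances
    return n_par * np.log(len(x)) - 2 * ll

bic = {k: gmm_bic(x, k) for k in (1, 2)}
k_best = min(bic, key=bic.get)   # smallest BIC wins
```

BIC's log(n) penalty is what prevents the likelihood from always favouring more components.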

  13. A flexible cure rate model for spatially correlated survival data based on generalized extreme value distribution and Gaussian process priors.

    PubMed

    Li, Dan; Wang, Xia; Dey, Dipak K

    2016-09-01

    Our present work proposes a new survival model in a Bayesian context to analyze right-censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis. Therefore, the nonlinear covariate effects and the spatial effects are incorporated into the systematic component of our model. Gaussian processes (GPs) provide a natural framework for modeling potentially nonlinear relationship and have recently become extremely powerful in nonlinear regression. Our proposed model adopts a semiparametric Bayesian approach by imposing a GP prior on the nonlinear structure of continuous covariate. With the consideration of data availability and computational complexity, the conditionally autoregressive distribution is placed on the region-specific frailties to handle spatial correlation. The flexibility and gains of our proposed model are illustrated through analyses of simulated data examples as well as a dataset involving a colon cancer clinical trial from the state of Iowa. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Bayesian Hierarchical Random Intercept Model Based on Three Parameter Gamma Distribution

    NASA Astrophysics Data System (ADS)

    Wirawati, Ika; Iriawan, Nur; Irhamah

    2017-06-01

    Hierarchical data structures are common throughout many areas of research. Previously, the existence of this type of data was often overlooked in analyses. The appropriate statistical analysis for handling this type of data is the hierarchical linear model (HLM). This article focuses only on the random intercept model (RIM), a subclass of HLM. This model assumes that the intercepts of the models at the lowest level vary among those models while their slopes are fixed. The differences in intercepts are suspected to be affected by some variables at the upper level; these intercepts are therefore regressed against those upper-level variables as predictors. This paper demonstrates the proposed two-level RIM by modeling per capita household expenditure in Maluku Utara, with five characteristics at the first level and three characteristics of districts/cities at the second level. The per capita household expenditure data at the first level are captured by a three-parameter Gamma distribution. The model is therefore more complex, owing to the interaction of many parameters representing the hierarchical structure and the distribution pattern of the data. To simplify parameter estimation, a computational Bayesian method coupled with a Markov chain Monte Carlo (MCMC) algorithm and Gibbs sampling is employed.
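The random intercept model can be sketched with a short Gibbs sampler. For brevity this sketch uses a Normal likelihood with known variances instead of the three-parameter Gamma distribution of the paper; the group structure and the conjugate conditional updates are the point:

```python
import numpy as np

rng = np.random.default_rng(4)

# J groups, each with its own intercept drawn around a common mean
J, n_j, mu_true, tau, sigma = 20, 30, 5.0, 1.0, 0.5
a_true = rng.normal(mu_true, tau, J)
y = a_true[:, None] + rng.normal(0.0, sigma, (J, n_j))

ybar = y.mean(axis=1)
mu = 0.0
draws = []
for it in range(2000):
    # a_j | mu, y ~ Normal with precision-weighted mean (conjugate update)
    prec = n_j / sigma ** 2 + 1.0 / tau ** 2
    mean = (n_j * ybar / sigma ** 2 + mu / tau ** 2) / prec
    a = rng.normal(mean, 1.0 / np.sqrt(prec))
    # mu | a ~ N(mean(a), tau^2 / J), assuming a flat prior on mu
    mu = rng.normal(a.mean(), tau / np.sqrt(J))
    if it >= 500:                       # discard burn-in
        draws.append(mu)

mu_hat = float(np.mean(draws))          # posterior mean of the grand mean
```

In the full model the intercepts would additionally be regressed on district-level predictors and the variances sampled too, but each step keeps this same conditional-update shape.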

  15. Bayesian Data-Model Fit Assessment for Structural Equation Modeling

    ERIC Educational Resources Information Center

    Levy, Roy

    2011-01-01

    Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…

  16. State Space Model with hidden variables for reconstruction of gene regulatory networks.

    PubMed

    Wu, Xi; Li, Peng; Wang, Nan; Gong, Ping; Perkins, Edward J; Deng, Youping; Zhang, Chaoyang

    2011-01-01

    State Space Model (SSM) is a relatively new approach to inferring gene regulatory networks. It requires less computational time than Dynamic Bayesian Networks (DBN). There are two types of variables in the linear SSM, observed variables and hidden variables. SSM uses an iterative method, namely Expectation-Maximization, to infer regulatory relationships from microarray datasets. The hidden variables cannot be directly observed from experiments. How the number of hidden variables is determined has a significant impact on the accuracy of network inference. In this study, we used SSM to infer gene regulatory networks (GRNs) from synthetic time series datasets, investigated Bayesian Information Criterion (BIC) and Principal Component Analysis (PCA) approaches to determining the number of hidden variables in SSM, and evaluated the performance of SSM in comparison with DBN. True GRNs and synthetic gene expression datasets were generated using GeneNetWeaver. Both DBN and linear SSM were used to infer GRNs from the synthetic datasets, and the inferred networks were compared with the true networks. Our results show that inference precision varied with the number of hidden variables. For some regulatory networks the inference precision of DBN was higher, but SSM performed better in other cases. Although the overall performance of the two approaches is comparable, SSM is much faster and capable of inferring much larger networks than DBN. This study provides useful information for handling the hidden variables and improving inference precision.
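The PCA route to choosing the number of hidden variables can be sketched directly: generate expression data driven by a known number of hidden factors and count the principal components needed to explain, say, 95% of the variance (the threshold is an assumption):

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic expression matrix driven by h = 3 hidden factors plus noise
genes, times, h = 50, 40, 3
Z = rng.normal(size=(times, h))          # hidden trajectories
C = rng.normal(size=(h, genes))          # loading matrix
Y = Z @ C + 0.1 * rng.normal(size=(times, genes))

# PCA via SVD: number of hidden variables = components explaining 95%
Yc = Y - Y.mean(axis=0)
s = np.linalg.svd(Yc, compute_uv=False)
explained = np.cumsum(s ** 2) / np.sum(s ** 2)
n_hidden = int(np.searchsorted(explained, 0.95) + 1)
```

The recovered dimension can then be handed to the SSM as the size of its hidden state.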

  17. Predictability of depression severity based on posterior alpha oscillations.

    PubMed

    Jiang, H; Popov, T; Jylänki, P; Bi, K; Yao, Z; Lu, Q; Jensen, O; van Gerven, M A J

    2016-04-01

    We aimed to integrate neural data and an advanced machine learning technique to predict individual major depressive disorder (MDD) patient severity. MEG data was acquired from 22 MDD patients and 22 healthy controls (HC) resting awake with eyes closed. Individual power spectra were calculated by a Fourier transform. Sources were reconstructed via beamforming technique. Bayesian linear regression was applied to predict depression severity based on the spatial distribution of oscillatory power. In MDD patients, decreased theta (4-8 Hz) and alpha (8-14 Hz) power was observed in fronto-central and posterior areas respectively, whereas increased beta (14-30 Hz) power was observed in fronto-central regions. In particular, posterior alpha power was negatively related to depression severity. The Bayesian linear regression model showed significant depression severity prediction performance based on the spatial distribution of both alpha (r=0.68, p=0.0005) and beta power (r=0.56, p=0.007) respectively. Our findings point to a specific alteration of oscillatory brain activity in MDD patients during rest as characterized from MEG data in terms of spectral and spatial distribution. The proposed model yielded a quantitative and objective estimation for the depression severity, which in turn has a potential for diagnosis and monitoring of the recovery process. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
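The Bayesian linear regression step can be sketched in its conjugate form: with prior w ~ N(0, I/alpha) and known noise precision beta, the posterior mean is available in closed form. Synthetic stand-in features; not the MEG pipeline itself:

```python
import numpy as np

rng = np.random.default_rng(6)

# Stand-in features (e.g. band power at a few sensors) and severity score
n, p = 40, 5
X = rng.normal(size=(n, p))
w_true = rng.normal(size=p)
y = X @ w_true + rng.normal(0.0, 0.3, n)

# Conjugate Bayesian linear regression (assumed alpha, beta)
alpha, beta = 1.0, 1.0 / 0.3 ** 2
S_inv = alpha * np.eye(p) + beta * X.T @ X       # posterior precision
m = beta * np.linalg.solve(S_inv, X.T @ y)       # posterior mean of weights

y_hat = X @ m                                    # in-sample prediction
r = float(np.corrcoef(y, y_hat)[0, 1])           # prediction correlation
```

In practice the predictive correlation would be assessed with cross-validation rather than in-sample, as in the study's reported r values.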

  18. More green space is related to less antidepressant prescription rates in the Netherlands: A Bayesian geoadditive quantile regression approach.

    PubMed

    Helbich, Marco; Klein, Nadja; Roberts, Hannah; Hagedoorn, Paulien; Groenewegen, Peter P

    2018-06-20

    Exposure to green space seems to be beneficial for self-reported mental health. In this study we used an objective health indicator, namely antidepressant prescription rates. Current studies rely exclusively upon mean regression models assuming linear associations. It is, however, plausible that the presence of green space is non-linearly related to different quantiles of antidepressant prescription rates, and these restrictions may contribute to inconsistent findings. Our aims were: a) to assess antidepressant prescription rates in relation to green space, and b) to analyze how the relationship varies non-linearly across different quantiles of antidepressant prescription rates. We used cross-sectional data for the year 2014 at the municipality level in the Netherlands. Ecological Bayesian geoadditive quantile regressions were fitted for the 15%, 50%, and 85% quantiles to estimate green space-prescription rate correlations, controlling for physical activity levels, socio-demographics, urbanicity, etc. The results suggested that green space was overall inversely and non-linearly associated with antidepressant prescription rates. More importantly, the associations differed across the quantiles, although the variation was modest. Significant non-linearities were apparent: the associations were slightly positive in the lower quantile and strongly negative in the upper one. Our findings imply that an increased availability of green space within a municipality may contribute to a reduction in the number of antidepressant prescriptions dispensed. Green space is thus a central health and community asset, whilst a minimum level of 28% needs to be established for health gains. The highest effectiveness occurred at a municipality surface percentage higher than 79%. This inverse dose-dependent relation has important implications for setting future community-level health and planning policies. Copyright © 2018 Elsevier Inc. All rights reserved.
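Quantile regression rests on the pinball (check) loss: minimising it over a constant recovers the sample quantile, which is the simplest way to see why different quantiles can tell different stories. A sketch with assumed Gaussian data and a grid search:

```python
import numpy as np

rng = np.random.default_rng(7)
y = rng.normal(10.0, 2.0, 5000)        # stand-in for prescription rates

def pinball(q, y, tau):
    """Check loss; its minimiser over a constant is the tau-quantile."""
    e = y - q
    return np.mean(np.maximum(tau * e, (tau - 1) * e))

grid = np.linspace(4.0, 16.0, 2401)    # 0.005 resolution
q_hat = {tau: grid[np.argmin([pinball(g, y, tau) for g in grid])]
         for tau in (0.15, 0.5, 0.85)}
```

Replacing the constant with a (geo)additive predictor and placing priors on its terms gives the Bayesian geoadditive quantile regression used in the study.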

  19. Robust cue integration: a Bayesian model and evidence from cue-conflict studies with stereoscopic and figure cues to slant.

    PubMed

    Knill, David C

    2007-05-23

    Most research on depth cue integration has focused on stimulus regimes in which stimuli contain the small cue conflicts that one might expect to normally arise from sensory noise. In these regimes, linear models for cue integration provide a good approximation to system performance. This article focuses on situations in which large cue conflicts can naturally occur in stimuli. We describe a Bayesian model for nonlinear cue integration that makes rational inferences about scenes across the entire range of possible cue conflicts. The model derives from the simple intuition that multiple properties of scenes or causal factors give rise to the image information associated with most cues. To make perceptual inferences about one property of a scene, an ideal observer must necessarily take into account the possible contribution of these other factors to the information provided by a cue. In the context of classical depth cues, large cue conflicts most commonly arise when one or another cue is generated by an object or scene that violates the strongest form of constraint that makes the cue informative. For example, when binocularly viewing a slanted trapezoid, the slant interpretation of the figure derived by assuming that the figure is rectangular may conflict greatly with the slant suggested by stereoscopic disparities. An optimal Bayesian estimator incorporates the possibility that different constraints might apply to objects in the world and robustly integrates cues with large conflicts by effectively switching between different internal models of the prior constraints underlying one or both cues. We performed two experiments to test the predictions of the model when applied to estimating surface slant from binocular disparities and the compression cue (the aspect ratio of figures in an image). 
The apparent weight that subjects gave to the compression cue decreased smoothly as a function of the conflict between the cues but did not shrink to zero; that is, subjects did not fully veto the compression cue at large cue conflicts. A Bayesian model that assumes a mixed prior distribution of figure shapes in the world, with a large proportion being very regular and a smaller proportion having random shapes, provides a good quantitative fit for subjects' performance. The best fitting model parameters are consistent with the sensory noise to be expected in measurements of figure shape, further supporting the Bayesian model as an account of robust cue integration.
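The robust mixture-prior mechanism can be sketched numerically: the compression cue is valid with prior probability p_reg (a regular figure), otherwise nearly uninformative, and the effective linear weight on that cue is down-weighted by the posterior probability of the regular-figure hypothesis, shrinking smoothly but never to zero as the conflict grows. All variances and p_reg below are assumed values:

```python
import numpy as np

def norm_pdf(x, s2):
    """Zero-mean Gaussian density with variance s2."""
    return np.exp(-0.5 * x * x / s2) / np.sqrt(2 * np.pi * s2)

def cue_weight(conflict, sd2=1.0, sc2=1.0, broad2=100.0, p_reg=0.9):
    """Effective weight on the compression cue under a mixture prior:
    valid (regular figure) with probability p_reg, else a broad,
    nearly uninformative likelihood."""
    m_reg = p_reg * norm_pdf(conflict, sd2 + sc2)
    m_rand = (1 - p_reg) * norm_pdf(conflict, sd2 + broad2)
    post_reg = m_reg / (m_reg + m_rand)       # P(regular | conflict)
    return post_reg * sd2 / (sd2 + sc2)       # linear weight, gated

w_small = cue_weight(0.5)                     # near-consistent cues
w_large = cue_weight(5.0)                     # large cue conflict
```

The weight decays smoothly with conflict (w_small > w_large) yet stays positive, mirroring the finding that subjects never fully vetoed the compression cue.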

  20. A geometric approach to non-linear correlations with intrinsic scatter

    NASA Astrophysics Data System (ADS)

    Pihajoki, Pauli

    2017-12-01

    We propose a new mathematical model for (n - k)-dimensional non-linear correlations with intrinsic scatter in n-dimensional data. The model is based on Riemannian geometry and is naturally symmetric with respect to the measured variables and invariant under coordinate transformations. We combine the model with a Bayesian approach for estimating the parameters of the correlation relation and the intrinsic scatter. A side benefit of the approach is that censored and truncated data sets and independent, arbitrary measurement errors can be incorporated. We also derive analytic likelihoods for the typical astrophysical use case of linear relations in n-dimensional Euclidean space. We pay particular attention to the case of linear regression in two dimensions and compare our results to existing methods. Finally, we apply our methodology to the well-known M_BH-σ correlation between the mass of a supermassive black hole in the centre of a galactic bulge and the corresponding bulge velocity dispersion. The main result of our analysis is that the most likely slope of this correlation is ∼6 for the data sets used, rather than the values in the range of ∼4-5 typically quoted in the literature for these data.
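A simplified (Euclidean, vertical-scatter) version of a linear relation with intrinsic scatter can be fitted by maximising a Gaussian likelihood whose variance is the sum of the intrinsic and measurement variances. This sketch uses a crude grid search on synthetic data and deliberately ignores the Riemannian machinery of the paper:

```python
import numpy as np

rng = np.random.default_rng(8)

# Linear relation with intrinsic scatter on top of measurement errors
n, a_true, b_true, s_int = 200, 1.0, 6.0, 0.3
x = rng.uniform(0.0, 2.0, n)
sig_y = 0.2                                     # known measurement error
y = a_true + b_true * x + rng.normal(0.0, np.hypot(s_int, sig_y), n)

def neg_log_like(params):
    """Gaussian likelihood with variance = intrinsic^2 + measurement^2."""
    a, b, s = params
    var = s ** 2 + sig_y ** 2
    return 0.5 * np.sum((y - a - b * x) ** 2 / var + np.log(2 * np.pi * var))

# Crude grid search over intercept, slope, and intrinsic scatter
best = min(((a, b, s) for a in np.linspace(0, 2, 21)
                      for b in np.linspace(4, 8, 41)
                      for s in np.linspace(0.05, 1, 20)),
           key=neg_log_like)
a_hat, b_hat, s_hat = best
```

The key point carried over from the paper is that the intrinsic scatter `s` is a free parameter of the likelihood, not an afterthought added to the residuals.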

  1. Bayesian multimodel inference for dose-response studies

    USGS Publications Warehouse

    Link, W.A.; Albers, P.H.

    2007-01-01

    Statistical inference in dose-response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose-response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparverius) exposed to various sublethal dietary concentrations of methylmercury.

  2. A guide to Bayesian model selection for ecologists

    USGS Publications Warehouse

    Hooten, Mevin B.; Hobbs, N.T.

    2015-01-01

    The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.

  3. Computational Approaches to Spatial Orientation: From Transfer Functions to Dynamic Bayesian Inference

    PubMed Central

    MacNeilage, Paul R.; Ganesan, Narayan; Angelaki, Dora E.

    2008-01-01

    Spatial orientation is the sense of body orientation and self-motion relative to the stationary environment, fundamental to normal waking behavior and control of everyday motor actions including eye movements, postural control, and locomotion. The brain achieves spatial orientation by integrating visual, vestibular, and somatosensory signals. Over the past years, considerable progress has been made toward understanding how these signals are processed by the brain using multiple computational approaches that include frequency domain analysis, the concept of internal models, observer theory, Bayesian theory, and Kalman filtering. Here we put these approaches in context by examining the specific questions that can be addressed by each technique and some of the scientific insights that have resulted. We conclude with a recent application of particle filtering, a probabilistic simulation technique that aims to generate the most likely state estimates by incorporating internal models of sensor dynamics and physical laws and noise associated with sensory processing as well as prior knowledge or experience. In this framework, priors for low angular velocity and linear acceleration can explain the phenomena of velocity storage and frequency segregation, both of which have been modeled previously using arbitrary low-pass filtering. How Kalman and particle filters may be implemented by the brain is an emerging field. Unlike past neurophysiological research that has aimed to characterize mean responses of single neurons, investigations of dynamic Bayesian inference should attempt to characterize population activities that constitute probabilistic representations of sensory and prior information. PMID:18842952
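A minimal particle filter illustrates how a low-angular-velocity prior can be built into the dynamics: particles are shrunk toward zero at each prediction step, then reweighted by the measurement likelihood and resampled. All noise levels and the shrinkage factor are illustrative assumptions, not a model of the vestibular system:

```python
import numpy as np

rng = np.random.default_rng(9)

# Track a step in angular velocity from noisy canal-like measurements
T, n_p = 100, 2000
omega_true = np.where(np.arange(T) < 50, 1.0, 0.0)    # velocity steps to 0
z = omega_true + rng.normal(0.0, 0.2, T)              # noisy sensor signal

particles = rng.normal(0.0, 1.0, n_p)
est = []
for t in range(T):
    # predict: shrink toward zero (low-velocity prior) plus process noise
    particles = 0.99 * particles + rng.normal(0.0, 0.1, n_p)
    # update: weight by the measurement likelihood
    w = np.exp(-0.5 * ((z[t] - particles) / 0.2) ** 2)
    w /= w.sum()
    est.append(float(np.sum(w * particles)))
    # resample to avoid weight degeneracy
    particles = particles[rng.choice(n_p, n_p, p=w)]
est = np.array(est)
```

The zero-pulling prediction step is what plays the role of the low-angular-velocity prior in the text; with richer internal models the same loop accommodates velocity storage and frequency segregation.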

  4. A small-area ecologic study of myocardial infarction, neighborhood deprivation, and sex: a Bayesian modeling approach.

    PubMed

    Deguen, Séverine; Lalloue, Benoît; Bard, Denis; Havard, Sabrina; Arveiler, Dominique; Zmirou-Navier, Denis

    2010-07-01

    Socioeconomic inequalities in the risk of coronary heart disease (CHD) are well documented for men and women. CHD incidence is greater for men but its association with socioeconomic status is usually found to be stronger among women. We explored the sex-specific association between neighborhood deprivation level and the risk of myocardial infarction (MI) at a small-area scale. We studied 1193 myocardial infarction events in people aged 35-74 years in the Strasbourg metropolitan area, France (2000-2003). We used a deprivation index to assess the neighborhood deprivation level. To take into account spatial dependence and the variability of MI rates due to the small number of events, we used a hierarchical Bayesian modeling approach. We fitted hierarchical Bayesian models to estimate sex-specific relative and absolute MI risks across deprivation categories. We tested departure from additive joint effects of deprivation and sex. The risk of MI increased with the deprivation level for both sexes, but was higher for men for all deprivation classes. Relative rates increased along the deprivation scale more steadily for women and followed a different pattern: linear for men and nonlinear for women. Our data provide evidence of effect modification, with departure from an additive joint effect of deprivation and sex. We document sex differences in the socioeconomic gradient of MI risk in Strasbourg. Women appear more susceptible at levels of extreme deprivation; this result is not a chance finding, given the large difference in event rates between men and women.

  5. On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)

    ERIC Educational Resources Information Center

    Wills, Andy J.; Pothos, Emmanuel M.

    2012-01-01

    Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…

  6. Robust Bayesian linear regression with application to an analysis of the CODATA values for the Planck constant

    NASA Astrophysics Data System (ADS)

    Wübbeler, Gerd; Bodnar, Olha; Elster, Clemens

    2018-02-01

    Weighted least-squares estimation is commonly applied in metrology to fit models to measurements that are accompanied with quoted uncertainties. The weights are chosen in dependence on the quoted uncertainties. However, when data and model are inconsistent in view of the quoted uncertainties, this procedure does not yield adequate results. When it can be assumed that all uncertainties ought to be rescaled by a common factor, weighted least-squares estimation may still be used, provided that a simple correction of the uncertainty obtained for the estimated model is applied. We show that these uncertainties and credible intervals are robust, as they do not rely on the assumption of a Gaussian distribution of the data. Hence, common software for weighted least-squares estimation may still safely be employed in such a case, followed by a simple modification of the uncertainties obtained by that software. We also provide means of checking the assumptions of such an approach. The Bayesian regression procedure is applied to analyze the CODATA values for the Planck constant published over the past decades in terms of three different models: a constant model, a straight line model and a spline model. Our results indicate that the CODATA values may not have yet stabilized.
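
The rescaling described above can be illustrated for the simplest case, a constant model. The sketch below fits a weighted mean and rescales its uncertainty by the Birge ratio when the chi-square indicates inconsistency; applying the factor only when it exceeds one is a convention assumed here, not a detail from the paper.

```python
import math

def weighted_fit_constant(values, uncertainties):
    """Weighted-least-squares fit of a constant model, with a common
    rescaling of the result's uncertainty when the data are mutually
    inconsistent (Birge-ratio convention)."""
    w = [1.0 / u ** 2 for u in uncertainties]
    mean = sum(wi * x for wi, x in zip(w, values)) / sum(w)
    u_mean = math.sqrt(1.0 / sum(w))
    # Chi-square of the residuals, per degree of freedom
    chi2 = sum(wi * (x - mean) ** 2 for wi, x in zip(w, values))
    birge = math.sqrt(chi2 / (len(values) - 1))
    return mean, u_mean * max(1.0, birge)

# Two quoted values that disagree far beyond their stated uncertainties:
m, u = weighted_fit_constant([10.0, 10.8], [0.1, 0.1])
# The quoted-uncertainty-only result would be ~0.071; the rescaled
# uncertainty is 0.4, reflecting the inconsistency.
```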

  7. Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics.

    PubMed

    Fisher, Charles K; Mehta, Pankaj

    2015-06-01

    Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach, the Bayesian Ising Approximation (BIA), to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, is freely available at http://physics.bu.edu/~pankajm/BIACode. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
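
The magnetization computation the abstract refers to can be illustrated generically. Below is a damped self-consistent mean-field iteration for an Ising model; in the BIA the fields and weak couplings would be derived from the regression data, whereas here `h` and `J` are arbitrary illustrative inputs.

```python
import math

def mean_field_magnetizations(h, J, iters=200, damping=0.5):
    """Self-consistent mean-field solution of m_i = tanh(h_i + sum_j J_ij m_j).

    h: external fields; J: symmetric coupling matrix with J_ii = 0.
    Damped fixed-point iteration for numerical stability.
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(iters):
        new = [math.tanh(h[i] + sum(J[i][j] * m[j] for j in range(n)))
               for i in range(n)]
        m = [(1 - damping) * a + damping * b for a, b in zip(m, new)]
    return m

# In the BIA picture, the posterior probability that feature i is
# relevant maps to (1 + m_i) / 2.
h = [1.0, -1.0, 0.2]
J = [[0.0, 0.1, 0.0],
     [0.1, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
m = mean_field_magnetizations(h, J)
probs = [(1 + mi) / 2 for mi in m]
```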

  8. Functional Multi-Locus QTL Mapping of Temporal Trends in Scots Pine Wood Traits

    PubMed Central

    Li, Zitong; Hallingbäck, Henrik R.; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J.; García-Gil, M. Rosario

    2014-01-01

    Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. PMID:25305041

  9. Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits.

    PubMed

    Li, Zitong; Hallingbäck, Henrik R; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J; García-Gil, M Rosario

    2014-10-09

    Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. Copyright © 2014 Li et al.

  10. Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.

    PubMed

    Ferrari, Alberto

    2017-01-01

    Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, the distributional properties of information entropy as a random variable have seldom been the object of study, leading researchers mainly to use linear models or simulation-based analytical approaches to assess differences in information content when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study, the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
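
For contrast with the Dirichlet-multinomial treatment, the plug-in (maximum-likelihood) entropy estimator that underlies most naive analyses is only a few lines; it is known to be negatively biased for small samples, which is part of the motivation for the Bayesian model.

```python
import math
from collections import Counter

def shannon_entropy(sequence):
    """Plug-in (maximum-likelihood) Shannon entropy in bits.

    A simple baseline: the Dirichlet-multinomial model discussed in
    the abstract instead treats the symbol probabilities themselves
    as uncertain, reducing this estimator's negative bias.
    """
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

h_uniform = shannon_entropy("ABCD" * 25)      # 4 equiprobable symbols: 2 bits
h_skewed = shannon_entropy("A" * 97 + "BCD")  # heavily skewed: far less
```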

  11. Modelling lactation curve for milk fat to protein ratio in Iranian buffaloes (Bubalus bubalis) using non-linear mixed models.

    PubMed

    Hossein-Zadeh, Navid Ghavi

    2016-08-01

    The aim of this study was to compare seven non-linear mathematical models (Brody, Wood, Dhanoa, Sikka, Nelder, Rook and Dijkstra) to examine their efficiency in describing the lactation curves for milk fat to protein ratio (FPR) in Iranian buffaloes. Data were 43 818 test-day records for FPR from the first three lactations of Iranian buffaloes which were collected on 523 dairy herds in the period from 1996 to 2012 by the Animal Breeding Center of Iran. Each model was fitted to monthly FPR records of buffaloes using the non-linear mixed model procedure (PROC NLMIXED) in SAS and the parameters were estimated. The models were tested for goodness of fit using Akaike's information criterion (AIC), Bayesian information criterion (BIC) and log maximum likelihood (-2 Log L). The Nelder and Sikka mixed models provided the best fit of lactation curve for FPR in the first and second lactations of Iranian buffaloes, respectively. However, Wood, Dhanoa and Sikka mixed models provided the best fit of lactation curve for FPR in the third parity buffaloes. Evaluation of first, second and third lactation features showed that all models, except for Dijkstra model in the third lactation, under-predicted test time at which daily FPR was minimum. On the other hand, minimum FPR was over-predicted by all equations. Evaluation of the different models used in this study indicated that non-linear mixed models were sufficient for fitting test-day FPR records of Iranian buffaloes.
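
The goodness-of-fit criteria named above are simple functions of the maximized log-likelihood. A sketch, with made-up log-likelihood values purely to show the comparison (not results from the study):

```python
import math

def aic(log_lik, k):
    """Akaike's information criterion: AIC = 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: BIC = k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical fits of two candidate lactation-curve models to n
# records; lower criterion value = better fit after penalization.
n = 43818
wood = {"name": "Wood-type (3 params)", "logL": -51200.0, "k": 3}
alt = {"name": "4-param curve", "logL": -51198.5, "k": 4}
best = min((wood, alt), key=lambda mdl: bic(mdl["logL"], mdl["k"], n))
```

Note how BIC's `k ln n` penalty can favor the smaller model even when the larger one has a slightly higher log-likelihood, as in this toy comparison.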

  12. Uncertainty aggregation and reduction in structure-material performance prediction

    NASA Astrophysics Data System (ADS)

    Hu, Zhen; Mahadevan, Sankaran; Ao, Dan

    2018-02-01

    An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
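
The Bayesian updating step at the core of the framework can be illustrated with a discretized (grid) posterior; this toy update of a single parameter from noisy observations is an assumption-laden stand-in for the paper's adaptive, segment-wise calibration.

```python
import math

def grid_bayes_update(grid, prior, observations, noise_sd):
    """Discretized Bayesian updating: posterior proportional to
    prior times Gaussian likelihood, renormalized after each datum.

    grid: candidate parameter values; prior: probabilities over grid;
    observations: noisy measurements of the parameter (toy setup).
    """
    post = list(prior)
    for z in observations:
        post = [p * math.exp(-0.5 * ((z - g) / noise_sd) ** 2)
                for p, g in zip(post, grid)]
        total = sum(post)
        post = [p / total for p in post]
    return post

grid = [i / 10 for i in range(21)]        # candidate values 0.0 .. 2.0
prior = [1 / len(grid)] * len(grid)       # flat prior
post = grid_bayes_update(grid, prior, [1.1, 0.9, 1.0], 0.2)
map_value = grid[max(range(len(grid)), key=post.__getitem__)]
```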

  13. Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees.

    PubMed

    Yang, Ziheng; Zhu, Tianqi

    2018-02-20

    The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
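
The polarization described above is easy to reproduce in a toy setting: two Bernoulli models that are symmetrically wrong about a fair coin. The model probabilities and data-generating process below are illustrative, not from the paper.

```python
import math
import random

def posterior_m1(data, p1=0.4, p2=0.6):
    """Posterior probability of model M1 (Bernoulli(p1)) versus
    M2 (Bernoulli(p2)) under equal prior odds."""
    log_bf = sum(math.log(p1 if x else 1 - p1) -
                 math.log(p2 if x else 1 - p2) for x in data)
    return 1.0 / (1.0 + math.exp(-log_bf))

random.seed(1)
# True process has p = 0.5, so both models are equally wrong; the
# log Bayes factor is then a zero-mean random walk whose magnitude
# grows like sqrt(n), so the posterior tends toward 0 or 1.
data = [random.random() < 0.5 for _ in range(10000)]
p_m1 = posterior_m1(data)
```

With large n, `p_m1` is typically extremely close to 0 or 1 even though neither model is better, which is the polarized behavior the abstract describes.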

  14. Short communication: Genetic variation of saturated fatty acids in Holsteins in the Walloon region of Belgium.

    PubMed

    Arnould, V M-R; Hammami, H; Soyeurt, H; Gengler, N

    2010-09-01

    Random regression test-day models using Legendre polynomials are commonly used for the estimation of genetic parameters and genetic evaluation for test-day milk production traits. However, some researchers have reported that these models present some undesirable properties, such as the overestimation of variances at the edges of lactation. Describing genetic variation of saturated fatty acids expressed in milk fat might require the testing of different models. Therefore, 3 different functions were used and compared to take into account the lactation curve: (1) Legendre polynomials with the same order as currently applied in the genetic model for production traits; (2) linear splines with 10 knots; and (3) linear splines with the same 10 knots reduced to 3 parameters. The criteria used were Akaike's information and Bayesian information criteria, percentage square biases, and the log-likelihood function. These criteria identified the Legendre polynomial model and the linear spline model with 10 knots reduced to 3 parameters as the most useful. Reducing more complex models using eigenvalues seemed appealing because the resulting models are less time demanding and can reduce convergence difficulties, as convergence properties also seemed to be improved. Finally, the results showed that the reduced spline model was very similar to the Legendre polynomials model. Copyright (c) 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
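
The Legendre basis mentioned above is cheap to evaluate with the Bonnet recurrence; a minimal sketch (in such models the covariate would be lactation stage rescaled to [-1, 1]):

```python
def legendre_basis(x, order):
    """Evaluate Legendre polynomials P_0..P_order at x in [-1, 1]
    via the Bonnet recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    vals = [1.0, x][: order + 1]
    for n in range(1, order):
        vals.append(((2 * n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals

# Order-4 basis at a standardized lactation stage x = 0.5
basis = legendre_basis(0.5, 4)
```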

  15. Comparison of two weighted integration models for the cueing task: linear and likelihood

    NASA Technical Reports Server (NTRS)

    Shimozaki, Steven S.; Eckstein, Miguel P.; Abbey, Craig K.

    2003-01-01

    In a task in which the observer must detect a signal at two locations, presenting a precue that predicts the location of a signal leads to improved performance with a valid cue (signal location matches the cue), compared to an invalid cue (signal location does not match the cue). The cue validity effect has often been explained with a limited capacity attentional mechanism improving the perceptual quality at the cued location. Alternatively, the cueing effect can also be explained by unlimited capacity models that assume a weighted combination of noisy responses across the two locations. We compare two weighted integration models, a linear model and a sum of weighted likelihoods model based on a Bayesian observer. While qualitatively these models are similar, quantitatively they predict different cue validity effects as the signal-to-noise ratios (SNR) increase. To test these models, 3 observers performed in a cued discrimination task of Gaussian targets with an 80% valid precue across a broad range of SNR's. Analysis of a limited capacity attentional switching model was also included and rejected. The sum of weighted likelihoods model best described the psychophysical results, suggesting that human observers approximate a weighted combination of likelihoods, and not a weighted linear combination.
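
The two integration rules being compared can be written down directly for unit-variance Gaussian responses. The weights, signal strength, and cue validity below are illustrative; the likelihood-ratio form `exp(s*x - s^2/2)` is the standard Gaussian signal-versus-noise result assumed here.

```python
import math
import random

def likelihood_ratio(x, s):
    """Gaussian likelihood ratio (signal amplitude s vs. noise,
    unit variance): exp(s*x - s^2/2)."""
    return math.exp(s * x - 0.5 * s * s)

def linear_dv(x_cued, x_uncued, w=0.8):
    """Linear model: weighted sum of the raw responses."""
    return w * x_cued + (1 - w) * x_uncued

def likelihood_dv(x_cued, x_uncued, s, prior=0.8):
    """Bayesian model: sum of prior-weighted likelihood ratios."""
    return (prior * likelihood_ratio(x_cued, s)
            + (1 - prior) * likelihood_ratio(x_uncued, s))

random.seed(2)
s = 2.0
# One valid-cue trial: the signal sits at the cued location.
x_cued = s + random.gauss(0.0, 1.0)
x_uncued = random.gauss(0.0, 1.0)
d_lin = linear_dv(x_cued, x_uncued)
d_lik = likelihood_dv(x_cued, x_uncued, s)
```

The key qualitative difference: the linear decision variable is unchanged by rescaling the responses, whereas the likelihood form grows exponentially with the signal-to-noise ratio, so the two models diverge as SNR increases, which is what the psychophysical comparison exploits.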

  16. Bayesian Fundamentalism or Enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition.

    PubMed

    Jones, Matt; Love, Bradley C

    2011-08-01

    The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. 
We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.

  17. How the Bayesians Got Their Beliefs (and What Those Beliefs Actually Are): Comment on Bowers and Davis (2012)

    ERIC Educational Resources Information Center

    Griffiths, Thomas L.; Chater, Nick; Norris, Dennis; Pouget, Alexandre

    2012-01-01

    Bowers and Davis (2012) criticize Bayesian modelers for telling "just so" stories about cognition and neuroscience. Their criticisms are weakened by not giving an accurate characterization of the motivation behind Bayesian modeling or the ways in which Bayesian models are used and by not evaluating this theoretical framework against specific…

  18. Slope Estimation in Noisy Piecewise Linear Functions

    PubMed Central

    Ingle, Atul; Bucklew, James; Sethares, William; Varghese, Tomy

    2014-01-01

    This paper discusses the development of a slope estimation algorithm called MAPSlope for piecewise linear data that is corrupted by Gaussian noise. The number and locations of slope change points (also known as breakpoints) are assumed to be unknown a priori, though it is assumed that the possible range of slope values lies within known bounds. A stochastic hidden Markov model that is general enough to encompass real world sources of piecewise linear data is used to model the transitions between slope values, and the problem of slope estimation is addressed using a Bayesian maximum a posteriori approach. The set of possible slope values is discretized, enabling the design of a dynamic programming algorithm for posterior density maximization. Numerical simulations are used to justify the choice of a reasonable number of quantization levels and also to analyze the mean squared error performance of the proposed algorithm. An alternating maximization algorithm is proposed for estimation of unknown model parameters and a convergence result for the method is provided. Finally, results using data from political science, finance and medical imaging applications are presented to demonstrate the practical utility of this procedure. PMID:25419020

  19. Time Series Analysis and Forecasting of Wastewater Inflow into Bandar Tun Razak Sewage Treatment Plant in Selangor, Malaysia

    NASA Astrophysics Data System (ADS)

    Abunama, Taher; Othman, Faridah

    2017-06-01

    Analysing the fluctuations of wastewater inflow rates in sewage treatment plants (STPs) is essential to guarantee sufficient treatment of wastewater before discharging it to the environment. The main objectives of this study are to statistically analyze and forecast the wastewater inflow rates into the Bandar Tun Razak STP in Kuala Lumpur, Malaysia. A time series analysis of three years’ weekly influent data (156 weeks) was conducted using the Auto-Regressive Integrated Moving Average (ARIMA) model. Various combinations of ARIMA orders (p, d, q) were tried to select the best-fitting model, which was then used to forecast the wastewater inflow rates. Linear regression analysis was applied to test the correlation between the observed and predicted influents. The ARIMA (3, 1, 3) model was selected as having the highest significant R-squared and the lowest normalized Bayesian Information Criterion (BIC) value, and accordingly the wastewater inflow rates were forecast for an additional 52 weeks. The linear regression analysis between the observed and predicted values of the wastewater inflow rates showed a positive linear correlation with a coefficient of 0.831.
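
A full ARIMA(3, 1, 3) fit needs a proper time-series library, but the differencing-plus-autoregression idea can be sketched with an ARIMA(1, 1, 0)-style toy (all data below are made up):

```python
def fit_ar1(series):
    """Least-squares AR(1) fit: returns (mean, phi) for the model
    z_t - m = phi * (z_{t-1} - m) + noise."""
    m = sum(series) / len(series)
    z = [v - m for v in series]
    num = sum(a * b for a, b in zip(z, z[1:]))
    den = sum(a * a for a in z[:-1])
    return m, num / den

def forecast(series, steps):
    """ARIMA(1, 1, 0)-style forecast: fit AR(1) to the first
    differences (the 'd = 1' part), then integrate the predicted
    differences back to the level of the series. A deliberately
    minimal stand-in for the ARIMA(3, 1, 3) model of the study."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    m, phi = fit_ar1(diffs)
    level, d = series[-1], diffs[-1]
    out = []
    for _ in range(steps):
        d = m + phi * (d - m)     # next predicted difference
        level += d                # integrate back to the level
        out.append(level)
    return out

# Toy weekly inflow series with an upward trend
inflow = [100, 102, 103, 105, 106, 108, 109, 111, 112, 114]
pred = forecast(inflow, steps=4)
```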

  20. Slope Estimation in Noisy Piecewise Linear Functions.

    PubMed

    Ingle, Atul; Bucklew, James; Sethares, William; Varghese, Tomy

    2015-03-01

    This paper discusses the development of a slope estimation algorithm called MAPSlope for piecewise linear data that is corrupted by Gaussian noise. The number and locations of slope change points (also known as breakpoints) are assumed to be unknown a priori, though it is assumed that the possible range of slope values lies within known bounds. A stochastic hidden Markov model that is general enough to encompass real world sources of piecewise linear data is used to model the transitions between slope values, and the problem of slope estimation is addressed using a Bayesian maximum a posteriori approach. The set of possible slope values is discretized, enabling the design of a dynamic programming algorithm for posterior density maximization. Numerical simulations are used to justify the choice of a reasonable number of quantization levels and also to analyze the mean squared error performance of the proposed algorithm. An alternating maximization algorithm is proposed for estimation of unknown model parameters and a convergence result for the method is provided. Finally, results using data from political science, finance and medical imaging applications are presented to demonstrate the practical utility of this procedure.
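
The dynamic-programming idea, discretized slope states with a penalty for slope changes, can be sketched in a Viterbi-like form. This simplified version works on first differences of the data rather than the paper's full hidden Markov formulation, and `switch_penalty` is an illustrative stand-in for the breakpoint prior.

```python
def map_slopes(y, slope_levels, dx=1.0, noise_sd=1.0, switch_penalty=4.0):
    """Viterbi-style MAP estimate of a piecewise-constant slope sequence.

    Operates on first differences of y (a simplification): the
    emission cost is Gaussian in d_t - slope*dx, and changing the
    slope state between steps incurs switch_penalty.
    """
    diffs = [b - a for a, b in zip(y, y[1:])]
    K = len(slope_levels)
    cost = [0.5 * ((diffs[0] - s * dx) / noise_sd) ** 2 for s in slope_levels]
    back = []
    for d in diffs[1:]:
        new_cost, pointers = [], []
        for k, s in enumerate(slope_levels):
            emit = 0.5 * ((d - s * dx) / noise_sd) ** 2
            prev = min(range(K),
                       key=lambda j: cost[j] + (0.0 if j == k else switch_penalty))
            pointers.append(prev)
            new_cost.append(cost[prev]
                            + (0.0 if prev == k else switch_penalty) + emit)
        cost, back = new_cost, back + [pointers]
    # Backtrack the best slope-index sequence
    k = min(range(K), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [slope_levels[i] for i in path]

# Noise-free ramp: slope +1 for five steps, then slope -1.
y = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1]
slopes = map_slopes(y, slope_levels=[-1.0, 0.0, 1.0], noise_sd=0.5)
```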

  1. Bayesian ensemble refinement by replica simulations and reweighting.

    PubMed

    Hummer, Gerhard; Köfinger, Jürgen

    2015-12-28

    We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
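
The reweighting step common to these methods can be illustrated for a single ensemble-averaged observable: choose weights of maximum-entropy form that reproduce a measured average. Uniform reference weights and bisection on the Lagrange multiplier are simplifications made in this sketch.

```python
import math

def reweight(observables, target):
    """Maximum-entropy-style reweighting of an ensemble: find weights
    w_i proportional to exp(-lam * f_i) whose weighted average of the
    observable f matches a measured target value.
    """
    def average(lam):
        w = [math.exp(-lam * f) for f in observables]
        z = sum(w)
        return sum(wi * f for wi, f in zip(w, observables)) / z

    lo, hi = -50.0, 50.0          # bracket for the Lagrange multiplier
    for _ in range(200):          # bisection: average(lam) is decreasing
        mid = 0.5 * (lo + hi)
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = [math.exp(-lam * f) for f in observables]
    z = sum(w)
    return [wi / z for wi in w]

# Shift the average of a three-configuration "ensemble" from 1.0
# toward a measured value of 1.5 with minimal weight distortion.
weights = reweight([0.0, 1.0, 2.0], 1.5)
```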

  2. Bayesian ensemble refinement by replica simulations and reweighting

    NASA Astrophysics Data System (ADS)

    Hummer, Gerhard; Köfinger, Jürgen

    2015-12-01

    We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.

  3. Effective connectivity between superior temporal gyrus and Heschl's gyrus during white noise listening: linear versus non-linear models.

    PubMed

    Hamid, Ka; Yusoff, An; Rahman, Mza; Mohamad, M; Hamid, Aia

    2012-04-01

    This fMRI study models the effective connectivity between Heschl's gyrus (HG) and the superior temporal gyrus (STG) in human primary auditory cortices. MATERIALS & METHODS: Ten healthy male participants were required to listen to white noise stimuli during functional magnetic resonance imaging (fMRI) scans. Statistical parametric mapping (SPM) was used to generate individual and group brain activation maps. For input region determination, two intrinsic connectivity models comprising bilateral HG and STG were constructed using dynamic causal modelling (DCM). The models were estimated and inferred using DCM, while Bayesian Model Selection (BMS) for group studies was used for model comparison and selection. Based on the winning model, six linear and six non-linear causal models were derived and were again estimated, inferred, and compared to obtain a model that best represents the effective connectivity between HG and the STG, balancing accuracy and complexity. Group results indicated significant asymmetrical activation (p(uncorr) < 0.001) in bilateral HG and STG. Model comparison results showed strong evidence of STG as the input centre. The winning model was preferred by 6 out of 10 participants. The results were supported by BMS results for group studies, with an expected posterior probability of r = 0.7830 and an exceedance probability of ϕ = 0.9823. One-sample t-tests performed on connection values obtained from the winning model indicated that the valid connections for the winning model are the unidirectional parallel connections from STG to bilateral HG (p < 0.05). Subsequent model comparison between linear and non-linear models using BMS preferred the non-linear connection (r = 0.9160, ϕ = 1.000), in which the connectivity between STG and the ipsi- and contralateral HG is gated by the activity in STG itself.
We are able to demonstrate that the effective connectivity between HG and STG while listening to white noise for the respective participants can be explained by a non-linear dynamic causal model with the activity in STG influencing the STG-HG connectivity non-linearly.

  4. Bayesian spatiotemporal crash frequency models with mixture components for space-time interactions.

    PubMed

    Cheng, Wen; Gill, Gurdiljot Singh; Zhang, Yongping; Cao, Zhong

    2018-03-01

    The traffic safety research has developed spatiotemporal models to explore the variations in the spatial pattern of crash risk over time. Many studies observed notable benefits associated with the inclusion of spatial and temporal correlation and their interactions. However, the safety literature lacks sufficient research for the comparison of different temporal treatments and their interaction with spatial component. This study developed four spatiotemporal models with varying complexity due to the different temporal treatments such as (I) linear time trend; (II) quadratic time trend; (III) Autoregressive-1 (AR-1); and (IV) time adjacency. Moreover, the study introduced a flexible two-component mixture for the space-time interaction which allows greater flexibility compared to the traditional linear space-time interaction. The mixture component allows the accommodation of global space-time interaction as well as the departures from the overall spatial and temporal risk patterns. This study performed a comprehensive assessment of mixture models based on the diverse criteria pertaining to goodness-of-fit, cross-validation and evaluation based on in-sample data for predictive accuracy of crash estimates. The assessment of model performance in terms of goodness-of-fit clearly established the superiority of the time-adjacency specification which was evidently more complex due to the addition of information borrowed from neighboring years, but this addition of parameters allowed significant advantage at posterior deviance which subsequently benefited overall fit to crash data. The Base models were also developed to study the comparison between the proposed mixture and traditional space-time components for each temporal model. The mixture models consistently outperformed the corresponding Base models due to the advantages of much lower deviance. 
    For cross-validation comparison of predictive accuracy, the linear time trend model was judged best, as it recorded the highest value of log pseudo marginal likelihood (LPML). Four other evaluation criteria were considered for typical validation using the same data as for model development. Under each criterion, observed crash counts were compared with three types of estimates: Bayesian estimated, normal predicted, and model replicated values. The linear model again performed best in most scenarios, except one case using model replicated data and two cases involving prediction without random effects. These findings indicate the mediocre performance of the linear trend when random effects are excluded from evaluation, which might be due to the flexible mixture space-time interaction efficiently absorbing the residual variability escaping the predictable part of the model. The comparison of Base and mixture models in terms of prediction accuracy further bolstered the superiority of the mixture models, as the mixture models generated more precise estimated crash counts across all four models, suggesting that the advantages of the mixture component at model fit transfer to prediction accuracy. Finally, the residual analysis demonstrated the consistently superior performance of random effect models, which validates the importance of incorporating correlation structures to account for unobserved heterogeneity. Copyright © 2017 Elsevier Ltd. All rights reserved.
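
The temporal treatments listed in the abstract can be pictured with a toy log-link crash mean. This is an illustrative sketch with made-up coefficients (the intercept, trend coefficients, and 5-year horizon are all assumptions), not the paper's full spatiotemporal model:

```python
import math

def expected_crashes(beta0, trend_coefs, t):
    """Log-link mean: log(mu_t) = beta0 + sum_k beta_k * t**(k+1)."""
    eta = beta0 + sum(b * t ** (k + 1) for k, b in enumerate(trend_coefs))
    return math.exp(eta)

# Treatment (I): linear time trend; treatment (II): quadratic time trend.
linear_trend = [expected_crashes(2.0, [0.05], t) for t in range(5)]
quadratic_trend = [expected_crashes(2.0, [0.05, -0.01], t) for t in range(5)]
```

The AR-1 and time-adjacency treatments would instead replace the polynomial trend with correlated yearly random effects, which this sketch does not attempt.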

  5. Bayesian hierarchical piecewise regression models: a tool to detect trajectory divergence between groups in long-term observational studies.

    PubMed

    Buscot, Marie-Jeanne; Wotherspoon, Simon S; Magnussen, Costan G; Juonala, Markus; Sabin, Matthew A; Burgner, David P; Lehtimäki, Terho; Viikari, Jorma S A; Hutri-Kähönen, Nina; Raitakari, Olli T; Thomson, Russell J

    2017-06-06

    Bayesian hierarchical piecewise regression (BHPR) modeling has not previously been formulated to detect and characterise the mechanism of trajectory divergence between groups of participants that have longitudinal responses with distinct developmental phases. These models are useful when participants in a prospective cohort study are grouped according to a distal dichotomous health outcome. Indeed, a refined understanding of how deleterious risk factor profiles develop across the life-course may help inform early-life interventions. Previous techniques that determine between-group differences in risk factors at each age may result in a biased estimate of the age at divergence. We demonstrate the use of Bayesian hierarchical piecewise regression (BHPR) to generate a point estimate and credible interval for the age at which trajectories diverge between groups for continuous outcome measures that exhibit non-linear within-person response profiles over time. We illustrate our approach by modeling the divergence in childhood-to-adulthood body mass index (BMI) trajectories between two groups of adults with/without type 2 diabetes mellitus (T2DM) in the Cardiovascular Risk in Young Finns Study (YFS). Using the proposed BHPR approach, we estimated that the BMI profiles of participants with T2DM diverged from those of healthy participants at age 16 years for males (95% credible interval (CI): 13.5-18 years) and 21 years for females (95% CI: 19.5-23 years). These data suggest that a critical window for weight management intervention in preventing T2DM might exist before the age at which the BMI growth rate is naturally expected to decrease. Simulation showed that pairwise comparison of least-squares means from categorical mixed models with smaller sample sizes tended to conclude a later age of divergence. In contrast, the point estimate of the divergence time is not biased by sample size when using the proposed BHPR method.
    BHPR is a powerful analytic tool for modeling long-term non-linear longitudinal outcomes, enabling the identification of the age at which risk factor trajectories diverge between groups of participants. The method is suitable for the analysis of unbalanced longitudinal data with only a limited number of repeated measures per participant, where the time-related outcome is typically marked by transitional changes or by distinct phases of change over time.
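
A minimal sketch of the broken-stick idea behind piecewise trajectory models, with invented intercepts, slopes, and a knot at 18 years (not the YFS estimates), scanning a grid of ages for the first detectable gap between groups:

```python
def broken_stick(age, intercept, slope1, slope2, knot):
    """Piecewise-linear trajectory with a single change point (knot)."""
    return intercept + slope1 * min(age, knot) + slope2 * max(age - knot, 0.0)

def divergence_age(traj_a, traj_b, ages, tol=0.5):
    """First age on the grid at which the trajectories differ by more than tol."""
    for a in ages:
        if abs(traj_a(a) - traj_b(a)) > tol:
            return a
    return None

# Hypothetical group trajectories (toy BMI-like numbers).
def healthy(a):
    return broken_stick(a, 16.0, 0.9, 0.1, 18.0)

def t2dm(a):
    return broken_stick(a, 16.0, 1.0, 0.3, 18.0)

ages = [a / 2 for a in range(0, 81)]  # 0 to 40 years in half-year steps
age_div = divergence_age(t2dm, healthy, ages)
```

The Bayesian version places priors on these parameters and reports a posterior credible interval for the divergence age rather than a single grid value.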

  6. Bayesian Regression with Network Prior: Optimal Bayesian Filtering Perspective

    PubMed Central

    Qian, Xiaoning; Dougherty, Edward R.

    2017-01-01

    The recently introduced intrinsically Bayesian robust filter (IBRF) provides fully optimal filtering relative to a prior distribution over an uncertainty class of joint random process models, whereas formerly the theory was limited to model-constrained Bayesian robust filters, for which optimization was limited to the filters that are optimal for models in the uncertainty class. This paper extends the IBRF theory to the situation where there are both a prior on the uncertainty class and sample data. The result is optimal Bayesian filtering (OBF), where optimality is relative to the posterior distribution derived from the prior and the data. The IBRF theories for effective characteristics and canonical expansions extend to the OBF setting. A salient focus of the present work is to demonstrate the advantages of Bayesian regression within the OBF setting over the classical Bayesian approach in the context of linear Gaussian models. PMID:28824268

  7. Multiplicative Forests for Continuous-Time Processes

    PubMed Central

    Weiss, Jeremy C.; Natarajan, Sriraam; Page, David

    2013-01-01

    Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability. PMID:25284967

  8. Multiplicative Forests for Continuous-Time Processes.

    PubMed

    Weiss, Jeremy C; Natarajan, Sriraam; Page, David

    2012-01-01

    Learning temporal dependencies between variables over continuous time is an important and challenging task. Continuous-time Bayesian networks effectively model such processes but are limited by the number of conditional intensity matrices, which grows exponentially in the number of parents per variable. We develop a partition-based representation using regression trees and forests whose parameter spaces grow linearly in the number of node splits. Using a multiplicative assumption we show how to update the forest likelihood in closed form, producing efficient model updates. Our results show multiplicative forests can be learned from few temporal trajectories with large gains in performance and scalability.
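
The conditional intensity matrices mentioned above govern continuous-time state transitions; the transition probabilities over an interval t are exp(Q·t). A self-contained sketch with an assumed 2-state intensity matrix Q, using a pure-Python truncated Taylor series (adequate for small ||Q·t||):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_expm(Q, t, terms=30):
    """exp(Q*t) via truncated Taylor series; fine for small ||Q*t||."""
    n = len(Q)
    Qt = [[q * t for q in row] for row in Q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, Qt)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Assumed 2-state conditional intensity matrix: rows sum to zero.
Q = [[-0.5, 0.5],
     [0.2, -0.2]]
P = mat_expm(Q, 1.0)  # transition probability matrix over one time unit
```

Each row of P is a probability distribution over the next state; the tree/forest representation in the paper factors such intensities, which this sketch does not attempt.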

  9. Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination.

    PubMed

    Zhao, Qibin; Zhang, Liqing; Cichocki, Andrzej

    2015-09-01

    CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified; however, the determination of tensor rank remains a challenging problem, especially for CP rank. In addition, existing approaches do not take into account uncertainty information of either latent factors or missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth CP rank and prevent overfitting, even when a large proportion of entries is missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.
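
The multilinear structure CP exploits can be shown in a few lines: a 3-way tensor is rebuilt as a sum of R outer products of factor-matrix columns. A sketch with tiny assumed rank-1 factors (the Bayesian machinery for rank determination and missing entries is not reproduced here):

```python
def cp_reconstruct(A, B, C):
    """Rebuild an I x J x K tensor from CP factor matrices of shapes
    I x R, J x R, and K x R: X[i][j][k] = sum_r A[i][r]*B[j][r]*C[k][r]."""
    I, R = len(A), len(A[0])
    J, K = len(B), len(C)
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(K)] for j in range(J)] for i in range(I)]

# A rank-1 tensor is reproduced exactly by its own factors (toy numbers).
A = [[1.0], [2.0]]
B = [[3.0], [4.0]]
C = [[5.0]]
X = cp_reconstruct(A, B, C)
```

In the paper's model, a sparsity-inducing prior shrinks unneeded factor columns toward zero, which is what yields the automatic rank determination.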

  10. Modeling Diagnostic Assessments with Bayesian Networks

    ERIC Educational Resources Information Center

    Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego

    2007-01-01

    This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…

  11. Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests

    NASA Astrophysics Data System (ADS)

    Wheeler, David C.; Waller, Lance A.

    2009-03-01

    In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address the increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measurement of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
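
The GWR half of the comparison reduces, at each focal site, to kernel-weighted least squares. A sketch with an assumed Gaussian kernel and toy coordinates; when the true relationship is globally constant, every local fit recovers it exactly:

```python
import math

def gwr_fit_at(focal, coords, x, y, bandwidth):
    """One GWR step: Gaussian-kernel-weighted simple regression y ~ x at a
    focal site; returns (intercept, slope) for that location."""
    w = [math.exp(-0.5 * ((cx - focal[0]) ** 2 + (cy - focal[1]) ** 2)
                  / bandwidth ** 2) for cx, cy in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]   # toy site locations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2 * xi + 1 for xi in x]                        # globally constant effect
intercept, slope = gwr_fit_at((0.5, 0.5), coords, x, y, bandwidth=1.0)
```

The Bayesian SVCP model instead places a spatial process prior on the coefficient surface, which is what provides the shrinkage discussed in the abstract.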

  12. Philosophy and the practice of Bayesian statistics

    PubMed Central

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2015-01-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575

  13. Philosophy and the practice of Bayesian statistics.

    PubMed

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2013-02-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.

  14. Regression analysis using dependent Polya trees.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  15. Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model

    NASA Astrophysics Data System (ADS)

    Al Sobhi, Mashail M.

    2015-02-01

    Bayesian estimates of the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Bayesian prediction bounds for future DGOS from the exponentiated Weibull model are also obtained. Symmetric and asymmetric loss functions are considered for the Bayesian computations. Markov chain Monte Carlo (MCMC) methods are used to compute the Bayes estimates and prediction bounds. The results are specialized to lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.
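
For orientation, one common parameterization of the exponentiated Weibull CDF and its reliability function (the parameterization is an assumption here; the paper's notation may differ):

```python
import math

def exp_weibull_cdf(x, shape, theta, scale):
    """Exponentiated Weibull CDF in one common form:
    F(x) = (1 - exp(-(x/scale)**shape))**theta."""
    return (1.0 - math.exp(-((x / scale) ** shape))) ** theta

def reliability(x, shape, theta, scale):
    """Reliability (survival) function R(x) = 1 - F(x)."""
    return 1.0 - exp_weibull_cdf(x, shape, theta, scale)

# Toy parameter values, purely illustrative.
r1 = reliability(1.0, shape=2.0, theta=1.5, scale=1.0)
r2 = reliability(2.0, shape=2.0, theta=1.5, scale=1.0)
```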

  16. Fundamentals and Recent Developments in Approximate Bayesian Computation

    PubMed Central

    Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka

    2017-01-01

    Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
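
The "only requiring that sampling from a model is possible" property is easiest to see in the classical rejection-ABC algorithm. A sketch for inferring a normal mean, with an assumed flat prior, sample size, and tolerance eps (all toy choices):

```python
import random

def abc_rejection(observed_stat, prior_draw, simulate, eps, n_draws, rng):
    """Rejection ABC: keep parameter draws whose simulated summary
    statistic lands within eps of the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        if abs(simulate(theta, rng) - observed_stat) < eps:
            accepted.append(theta)
    return accepted

rng = random.Random(0)
prior_draw = lambda r: r.uniform(-3.0, 3.0)  # flat prior on the mean
simulate = lambda th, r: sum(r.gauss(th, 1.0) for _ in range(20)) / 20

# Pretend the observed sample mean was 1.0.
accepted = abc_rejection(1.0, prior_draw, simulate, eps=0.2, n_draws=2000, rng=rng)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted draws approximate the posterior; shrinking eps tightens the approximation at the cost of more rejected simulations, which motivates the refinements the review covers.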

  17. Bayesian inversion of refraction seismic traveltime data

    NASA Astrophysics Data System (ADS)

    Ryberg, T.; Haberland, Ch

    2018-03-01

    We apply a Bayesian Markov chain Monte Carlo (McMC) formalism to the inversion of refraction seismic traveltime data sets to derive 2-D velocity models below linear arrays (i.e. profiles) of sources and seismic receivers. Typical refraction data sets, especially when the far-offset observations are used, are known to have very poor, highly ill-posed experimental geometries that are far from ideal. As a consequence, the structural resolution quickly degrades with depth. Conventional inversion techniques based on regularization potentially suffer from the choice of inversion parameters (i.e. number and distribution of cells, starting velocity models, damping and smoothing constraints, data noise level, etc.) and explore the model space only locally. McMC techniques are used for exhaustive sampling of the model space without prior knowledge (or assumptions) of inversion parameters, resulting in a large number of models fitting the observations. Statistical analysis of these models allows an average (reference) solution and its standard deviation to be derived, thus providing uncertainty estimates of the inversion result. The highly non-linear character of the inversion problem, mainly caused by the experiment geometry, does not allow a reference solution and error map to be derived by a simple averaging procedure. We present a modified averaging technique, which excludes parts of the prior distribution in the posterior values due to poor ray coverage, thus providing reliable estimates of inversion model properties even in those parts of the models. The model is discretized by a set of Voronoi polygons (with constant slowness cells) or a triangulated mesh (with interpolation within the triangles). Forward traveltime calculations are performed by a fast, finite-difference-based eikonal solver. The method is applied to a data set from a refraction seismic survey from Northern Namibia and compared to conventional tomography.
An inversion test for a synthetic data set from a known model is also presented.

  18. Joint Bayesian Estimation of Quasar Continua and the Lyα Forest Flux Probability Distribution Function

    NASA Astrophysics Data System (ADS)

    Eilers, Anna-Christina; Hennawi, Joseph F.; Lee, Khee-Gan

    2017-08-01

    We present a new Bayesian algorithm making use of Markov Chain Monte Carlo sampling that allows us to simultaneously estimate the unknown continuum level of each quasar in an ensemble of high-resolution spectra, as well as their common probability distribution function (PDF) for the transmitted Lyα forest flux. This fully automated PDF regulated continuum fitting method models the unknown quasar continuum with a linear principal component analysis (PCA) basis, with the PCA coefficients treated as nuisance parameters. The method allows one to estimate parameters governing the thermal state of the intergalactic medium (IGM), such as the slope of the temperature-density relation γ -1, while marginalizing out continuum uncertainties in a fully Bayesian way. Using realistic mock quasar spectra created from a simplified semi-numerical model of the IGM, we show that this method recovers the underlying quasar continua to a precision of ≃ 7 % and ≃ 10 % at z = 3 and z = 5, respectively. Given the number of principal component spectra, this is comparable to the underlying accuracy of the PCA model itself. Most importantly, we show that we can achieve a nearly unbiased estimate of the slope γ -1 of the IGM temperature-density relation with a precision of +/- 8.6 % at z = 3 and +/- 6.1 % at z = 5, for an ensemble of ten mock high-resolution quasar spectra. Applying this method to real quasar spectra and comparing to a more realistic IGM model from hydrodynamical simulations would enable precise measurements of the thermal and cosmological parameters governing the IGM, albeit with somewhat larger uncertainties, given the increased flexibility of the model.

  19. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable, assuming a logistic sampling model for the data, has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold, and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
    © 2013, The International Biometric Society.
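
The Savage-Dickey ratio mentioned above is the posterior density divided by the prior density at the null value. A conjugate normal-mean sketch with toy numbers (the paper applies the ratio in a semiparametric logistic setting, not this simple model):

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def savage_dickey_bf01(y_bar, n, sigma, prior_mu, prior_sd, null_value=0.0):
    """Savage-Dickey ratio BF01 = posterior(null) / prior(null) for a normal
    mean with known sigma and a conjugate normal prior."""
    post_prec = 1.0 / prior_sd ** 2 + n / sigma ** 2
    post_sd = math.sqrt(1.0 / post_prec)
    post_mu = (prior_mu / prior_sd ** 2 + n * y_bar / sigma ** 2) / post_prec
    return (normal_pdf(null_value, post_mu, post_sd)
            / normal_pdf(null_value, prior_mu, prior_sd))

# Data far from the null shrink BF01; data at the null inflate it.
bf01_far = savage_dickey_bf01(y_bar=2.0, n=50, sigma=1.0, prior_mu=0.0, prior_sd=1.0)
bf01_near = savage_dickey_bf01(y_bar=0.0, n=50, sigma=1.0, prior_mu=0.0, prior_sd=1.0)
```

BF01 > 1 favors the null (here, the parametric model); BF01 < 1 favors the alternative.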

  20. Efficient fuzzy Bayesian inference algorithms for incorporating expert knowledge in parameter estimation

    NASA Astrophysics Data System (ADS)

    Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad

    2016-05-01

    Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. 
An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.

  1. Bayesian cross-validation for model evaluation and selection, with application to the North American Breeding Bird Survey

    USGS Publications Warehouse

    Link, William; Sauer, John R.

    2016-01-01

    The analysis of ecological data has changed in two important ways over the last 15 years. The development and easy availability of Bayesian computational methods has allowed and encouraged the fitting of complex hierarchical models. At the same time, there has been increasing emphasis on acknowledging and accounting for model uncertainty. Unfortunately, the ability to fit complex models has outstripped the development of tools for model selection and model evaluation: familiar model selection tools such as Akaike's information criterion and the deviance information criterion are widely known to be inadequate for hierarchical models. In addition, little attention has been paid to the evaluation of model adequacy in the context of hierarchical modeling, i.e., to the evaluation of fit for a single model. In this paper, we describe Bayesian cross-validation, which provides tools for model selection and evaluation. We describe the Bayesian predictive information criterion (BPIC) and a Bayesian approximation to it known as the Watanabe-Akaike information criterion (WAIC). We illustrate the use of these tools for model selection, and the use of Bayesian cross-validation as a tool for model evaluation, using three large data sets from the North American Breeding Bird Survey.
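
WAIC can be computed from a matrix of pointwise log-likelihoods over posterior draws. A sketch using the common formulation WAIC = -2(lppd - p_waic), with a toy matrix (not BBS data):

```python
import math

def waic(log_lik):
    """WAIC from pointwise log-likelihoods: log_lik[s][i] is the
    log-likelihood of observation i under posterior draw s."""
    S, n = len(log_lik), len(log_lik[0])
    lppd, p_waic = 0.0, 0.0
    for i in range(n):
        col = [log_lik[s][i] for s in range(S)]
        lppd += math.log(sum(math.exp(v) for v in col) / S)
        mean_ll = sum(col) / S
        p_waic += sum((v - mean_ll) ** 2 for v in col) / (S - 1)
    return -2.0 * (lppd - p_waic)

# Two posterior draws, three observations (toy numbers).
ll = [[-1.0, -1.2, -0.8],
      [-1.1, -1.0, -0.9]]
score = waic(ll)
```

Lower scores indicate better estimated out-of-sample predictive fit; p_waic acts as the effective-parameter penalty that DIC handles poorly for hierarchical models.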

  2. Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

    PubMed

    MacNab, Ying C

    2016-08-01

    This paper concerns multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on a spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.
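
The coregionalization construction can be sketched at a single spatial lag: with coefficient vectors a_j (columns of a coregionalization matrix A) and correlation values rho_j(h), the cross-covariance is C(h) = sum_j rho_j(h) a_j a_j^T. The numbers below are assumptions, and with scalar rho_j this illustrates only the symmetric case the abstract mentions:

```python
def lmc_cross_cov(A, rhos):
    """Linear model of coregionalization at one lag:
    C = sum_j rhos[j] * a_j a_j^T, a_j being column j of A."""
    p, J = len(A), len(A[0])
    return [[sum(rhos[j] * A[i][j] * A[k][j] for j in range(J))
             for k in range(p)] for i in range(p)]

# Two observed variables built from two latent spatial processes (toy numbers).
C = lmc_cross_cov([[1.0, 0.5],
                   [0.0, 2.0]], [0.8, 0.3])
```

Varying the rho_j with distance produces cross-covariances with different spatial variation and smoothness per latent process.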

  3. Bayesian networks for maritime traffic accident prevention: benefits and challenges.

    PubMed

    Hänninen, Maria

    2014-12-01

    Bayesian networks are quantitative modeling tools whose applications to the maritime traffic safety context are becoming more popular. This paper discusses the utilization of Bayesian networks in maritime safety modeling. Based on the literature and the author's own experiences, the paper studies what Bayesian networks can offer to maritime accident prevention and safety modeling and discusses a few challenges in their application to this context. It is argued that the capability of representing rather complex, not necessarily causal but uncertain relationships makes Bayesian networks an attractive modeling tool for maritime safety and accidents. Furthermore, as maritime accident and safety data are still rather scarce and have some quality problems, the possibility to combine data with expert knowledge and the easy way of updating the model after acquiring more evidence further enhance their feasibility. However, eliciting the probabilities from the maritime experts might be challenging and the model validation can be tricky. It is concluded that with the utilization of several data sources, Bayesian updating, dynamic modeling, and hidden nodes for latent variables, Bayesian networks are rather well-suited tools for maritime safety management and decision-making. Copyright © 2014 Elsevier Ltd. All rights reserved.
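
The "updating the model after acquiring more evidence" that makes Bayesian networks attractive is, at its core, Bayes' rule over discrete states. A toy sketch with invented maritime probabilities (not calibrated values):

```python
def posterior(prior, likelihood, evidence):
    """Discrete Bayes update: P(state | evidence) ∝ P(evidence | state) P(state)."""
    unnorm = {s: prior[s] * likelihood[s][evidence] for s in prior}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

# Hypothetical numbers: how observing a near-miss shifts accident belief.
prior = {"accident": 0.01, "no_accident": 0.99}
likelihood = {"accident":    {"near_miss": 0.70, "none": 0.30},
              "no_accident": {"near_miss": 0.05, "none": 0.95}}
post = posterior(prior, likelihood, "near_miss")
```

A full network chains many such conditional tables and propagates evidence through them, which is what inference engines for these models automate.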

  4. Appraisal of jump distributions in ensemble-based sampling algorithms

    NASA Astrophysics Data System (ADS)

    Dejanic, Sanda; Scheidegger, Andreas; Rieckermann, Jörg; Albert, Carlo

    2017-04-01

    Sampling Bayesian posteriors of model parameters is often required for making model-based probabilistic predictions. For complex environmental models, standard Markov chain Monte Carlo (MCMC) methods are often infeasible because they require too many sequential model runs. Therefore, we focused on ensemble methods that use many Markov chains in parallel, since they can be run on modern cluster architectures. Little is known about how to choose the best-performing sampler for a given application. A poor choice can lead to an inappropriate representation of posterior knowledge. We assessed two different jump moves, the stretch and the differential evolution move, underlying, respectively, the software packages EMCEE and DREAM, which are popular in different scientific communities. For the assessment, we used analytical posteriors with features that often occur in real posteriors, namely high dimensionality, strong non-linear correlations, or multimodality. For posteriors with non-linear features, standard convergence diagnostics based on sample means can be insufficient. Therefore, we resorted to an entropy-based convergence measure. We assessed the samplers by means of their convergence speed, robustness, and effective sample sizes. For posteriors with strongly non-linear features, we found that the stretch move outperforms the differential evolution move with respect to all three aspects.
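
The stretch move assessed here (the affine-invariant move underlying EMCEE) proposes along the line joining a walker and another randomly chosen walker, with the scale z drawn from g(z) ∝ 1/√z on [1/a, a]. A minimal sketch of the proposal step only (the accept/reject step with the z^(d-1) factor is omitted):

```python
import random

def stretch_move(walker, other, a, rng):
    """Goodman-Weare stretch proposal: other + z * (walker - other),
    with z sampled from g(z) ∝ 1/sqrt(z) on [1/a, a] by inverse CDF."""
    z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
    proposal = [o + z * (w - o) for w, o in zip(walker, other)]
    return proposal, z

rng = random.Random(1)
proposal, z = stretch_move([1.0, 2.0], [0.0, 0.0], a=2.0, rng=rng)
```

Because the proposal is an affine map of existing walker positions, the move adapts automatically to linearly correlated posteriors; the paper's finding concerns how it copes with non-linear ones.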

  5. Spatio-temporal modelling of climate-sensitive disease risk: Towards an early warning system for dengue in Brazil

    NASA Astrophysics Data System (ADS)

    Lowe, Rachel; Bailey, Trevor C.; Stephenson, David B.; Graham, Richard J.; Coelho, Caio A. S.; Sá Carvalho, Marilia; Barcellos, Christovam

    2011-03-01

    This paper considers the potential for using seasonal climate forecasts in developing an early warning system for dengue fever epidemics in Brazil. In the first instance, a generalised linear model (GLM) is used to select climate and other covariates which are both readily available and prove significant in prediction of confirmed monthly dengue cases based on data collected across the whole of Brazil for the period January 2001 to December 2008 at the microregion level (typically consisting of one large city and several smaller municipalities). The covariates explored include temperature and precipitation data on a 2.5°×2.5° longitude-latitude grid with time lags relevant to dengue transmission, an El Niño Southern Oscillation index and other relevant socio-economic and environmental variables. A negative binomial model formulation is adopted in this model selection to allow for extra-Poisson variation (overdispersion) in the observed dengue counts caused by unknown/unobserved confounding factors and possible correlations in these effects in both time and space. Subsequently, the selected global model is refined in the context of the South East region of Brazil, where dengue predominates, by reverting to a Poisson framework and explicitly modelling the overdispersion through a combination of unstructured and spatio-temporal structured random effects. The resulting spatio-temporal hierarchical model (or GLMM—generalised linear mixed model) is implemented via a Bayesian framework using Markov Chain Monte Carlo (MCMC). Dengue predictions are found to be enhanced both spatially and temporally when using the GLMM and the Bayesian framework allows posterior predictive distributions for dengue cases to be derived, which can be useful for developing a dengue alert system. Using this model, we conclude that seasonal climate forecasts could have potential value in helping to predict dengue incidence months in advance of an epidemic in South East Brazil.
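
    The negative binomial formulation is adopted above precisely because its variance exceeds the Poisson variance, absorbing overdispersion. A small sketch of that property (NB2 parameterization; the mean and dispersion values are illustrative, not fitted to the dengue data):

```python
# Why negative binomial instead of Poisson for overdispersed counts:
# Var(Poisson) = mu, while Var(NB2) = mu + mu^2/k.
import math

def nb_log_pmf(y, mu, k):
    # NB2 parameterization: mean mu, dispersion k (k -> infinity recovers Poisson)
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu)))

mu, k = 10.0, 2.0
poisson_var = mu                # Var = mu for a Poisson model
nb_var = mu + mu ** 2 / k       # the extra mu^2/k term absorbs overdispersion
total = sum(math.exp(nb_log_pmf(y, mu, k)) for y in range(1000))
print(poisson_var, nb_var, round(total, 6))  # 10.0 60.0 1.0
```

    The pmf summing to 1 confirms the parameterization; with the same mean of 10, the negative binomial allows a variance of 60.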

  6. Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management

    EPA Science Inventory

    A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...

  7. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method.

    PubMed

    Norris, Peter M; da Silva, Arlindo M

    2016-07-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
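
    Since the record mentions both the Metropolis algorithm and a skewed-triangle distribution for layer moisture, here is a hedged sketch of random-walk Metropolis targeting a skewed triangular density. The mode, step size, and run length are illustrative, not values from the study:

```python
# Random-walk Metropolis on a skewed triangular density on [0, 1].
import random

random.seed(1)

def tri_pdf(x, c=0.8):
    # triangular density on [0, 1] with mode at c (skewed when c != 0.5)
    if x < 0.0 or x > 1.0:
        return 0.0
    return 2.0 * x / c if x <= c else 2.0 * (1.0 - x) / (1.0 - c)

def metropolis(n, step=0.2):
    x, out = 0.5, []
    for _ in range(n):
        y = x + random.uniform(-step, step)
        # symmetric proposal: accept with probability min(1, f(y)/f(x))
        if random.random() * tri_pdf(x) < tri_pdf(y):
            x = y
        out.append(x)
    return out

samples = metropolis(20000)[2000:]   # discard burn-in
mean = sum(samples) / len(samples)
print(round(mean, 2))  # close to the triangle mean (0 + 1 + 0.8) / 3 = 0.6
```

    Because out-of-range proposals have zero density and are always rejected, the chain never leaves [0, 1]; this is the same jump-to-positive-density behavior the abstract exploits for clear-to-cloudy transitions.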

  8. Low-rank separated representation surrogates of high-dimensional stochastic functions: Application in Bayesian inference

    NASA Astrophysics Data System (ADS)

    Validi, AbdoulAhad

    2014-03-01

    This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternating least-squares regression with Tikhonov regularization, using a roughening matrix that computes the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation depends linearly on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
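
    The core mechanism above, alternating least squares with ridge (Tikhonov) regularization for a separated representation, can be sketched for the simplest rank-1 case: approximating a grid of a separable function f(x, y) by u(x)v(y), alternately solving a regularized least-squares problem for each factor. The function, grid, and regularization weight are illustrative:

```python
# Rank-1 alternating least squares with Tikhonov regularization (sketch).
import math

# grid of a separable function f(x, y) = sin(x) * cos(y)
xs = [0.1 * i for i in range(1, 21)]
ys = [0.1 * j for j in range(1, 21)]
F = [[math.sin(x) * math.cos(y) for y in ys] for x in xs]

lam = 1e-8            # Tikhonov (ridge) regularization weight
u = [1.0] * len(xs)
v = [1.0] * len(ys)
for _ in range(20):
    # fix v, solve the ridge-regularized least-squares problem for u
    vv = sum(b * b for b in v) + lam
    u = [sum(F[i][j] * v[j] for j in range(len(ys))) / vv for i in range(len(xs))]
    # fix u, solve for v
    uu = sum(a * a for a in u) + lam
    v = [sum(F[i][j] * u[i] for i in range(len(xs))) / uu for j in range(len(ys))]

err = max(abs(F[i][j] - u[i] * v[j])
          for i in range(len(xs)) for j in range(len(ys)))
print(err < 1e-6)  # prints True: a separable function is recovered essentially exactly
```

    Each alternating step is a cheap linear solve, which is why the overall construction cost scales mildly with the number of random inputs.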

  9. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

    NASA Technical Reports Server (NTRS)

    Norris, Peter M.; Da Silva, Arlindo M.

    2016-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.

  10. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method

    PubMed Central

    Norris, Peter M.; da Silva, Arlindo M.

    2018-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847

  11. Bayesian Normalization Model for Label-Free Quantitative Analysis by LC-MS

    PubMed Central

    Nezami Ranjbar, Mohammad R.; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.

    2016-01-01

    We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are not caused by biological differences but experimental bias. There are different sources of bias including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC) with parameters considered as a linear mixed effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement. PMID:26357332

  12. Reconciling differences in stratospheric ozone composites

    NASA Astrophysics Data System (ADS)

    Ball, William T.; Alsing, Justin; Mortlock, Daniel J.; Rozanov, Eugene V.; Tummon, Fiona; Haigh, Joanna D.

    2017-10-01

    Observations of stratospheric ozone from multiple instruments now span three decades; combining these into composite datasets allows long-term ozone trends to be estimated. Recently, several ozone composites have been published, but trends disagree by latitude and altitude, even between composites built upon the same instrument data. We confirm that the main causes of differences in decadal trend estimates lie in (i) steps in the composite time series when the instrument source data changes and (ii) artificial sub-decadal trends in the underlying instrument data. These artefacts introduce features that can alias with regressors in multiple linear regression (MLR) analysis; both can lead to inaccurate trend estimates. Here, we aim to remove these artefacts using Bayesian methods to infer the underlying ozone time series from a set of composites by building a joint-likelihood function using a Gaussian-mixture density to model outliers introduced by data artefacts, together with a data-driven prior on ozone variability that incorporates knowledge of problems during instrument operation. We apply this Bayesian self-calibration approach to stratospheric ozone in 10° bands from 60° S to 60° N and from 46 to 1 hPa (~21-48 km) for 1985-2012. There are two main outcomes: (i) we independently identify and confirm many of the data problems previously identified, but which remain unaccounted for in existing composites; (ii) we construct an ozone composite, with uncertainties, that is free from most of these problems - we call this the BAyeSian Integrated and Consolidated (BASIC) composite. To analyse the new BASIC composite, we use dynamical linear modelling (DLM), which provides a more robust estimate of long-term changes through Bayesian inference than MLR. BASIC and DLM, together, provide a step forward in improving estimates of decadal trends. Our results indicate a significant recovery of ozone since 1998 in the upper stratosphere, of both northern and southern midlatitudes, in all four composites analysed, and particularly in the BASIC composite. The BASIC results also show no hemispheric difference in the recovery at midlatitudes, in contrast to an apparent feature that is present, but not consistent, in the four composites. Our overall conclusion is that it is possible to effectively combine different ozone composites and account for artefacts and drifts, and that this leads to a clear and significant result that upper stratospheric ozone levels have increased since 1998, following an earlier decline.
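
    Dynamical linear modelling, used above for trend estimation, can be sketched in its simplest form as a local-level model filtered with the Kalman recursions. The data and variances below are illustrative, not the ozone composites:

```python
# Minimal sketch of a dynamical linear model: local-level (random-walk trend)
# state mu_t = mu_{t-1} + w_t observed as y_t = mu_t + v_t.

def local_level_filter(ys, q=0.1, r=1.0, m0=0.0, p0=100.0):
    """Kalman filter returning the sequence of filtered state means."""
    m, p, means = m0, p0, []
    for y in ys:
        p = p + q                 # predict: state variance grows by q
        k = p / (p + r)           # Kalman gain
        m = m + k * (y - m)       # update mean toward the observation
        p = (1.0 - k) * p         # update variance
        means.append(m)
    return means

# a short series with one outlier, standing in for a data artefact
ys = [1.0, 1.2, 0.9, 1.1, 5.0, 1.0, 1.05]
means = local_level_filter(ys)
print([round(m, 2) for m in means])
```

    The outlier at 5.0 is only partially followed and the filtered level relaxes back afterwards; a full DLM adds trend and seasonal components, and (as in the paper's approach) heavier-tailed error models to further downweight artefacts.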

  13. The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.

    PubMed

    Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan; Yi, Nengjun

    2017-01-01

    Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarchical generalized linear models, called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for coefficients that can induce weak shrinkage on large coefficients and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchical GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates nice features of two popular methods, i.e., penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors, and expression data of 4919 genes; and the ovarian cancer data set from TCGA with 362 tumors, and expression data of 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/). Copyright © 2017 by the Genetics Society of America.
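
    The adaptive shrinkage described above comes from the mixture double-exponential prior: a narrow "spike" component dominates near zero while a wide "slab" component dominates for large coefficients. A small sketch with illustrative scales (not the package's defaults):

```python
# Spike-and-slab mixture double-exponential (Laplace) prior: the posterior
# probability of the slab component grows with the coefficient magnitude.
import math

def de_pdf(b, s):
    # double-exponential (Laplace) density with scale s
    return math.exp(-abs(b) / s) / (2.0 * s)

def slab_prob(b, theta=0.5, s0=0.05, s1=1.0):
    """Probability that coefficient b comes from the slab (scale s1)
    rather than the spike (scale s0), given mixing weight theta."""
    spike = (1.0 - theta) * de_pdf(b, s0)
    slab = theta * de_pdf(b, s1)
    return slab / (spike + slab)

# near-zero coefficients are assigned to the spike (strong shrinkage),
# large coefficients to the slab (weak shrinkage)
print(round(slab_prob(0.01), 3), round(slab_prob(1.0), 3))
```

    In the EM-within-coordinate-descent algorithm, this slab probability controls the effective per-coefficient penalty, which is how the method self-tunes shrinkage.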

  14. A Comparison of General Diagnostic Models (GDM) and Bayesian Networks Using a Middle School Mathematics Test

    ERIC Educational Resources Information Center

    Wu, Haiyan

    2013-01-01

    General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…

  15. Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model

    PubMed Central

    Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.

    2014-01-01

    Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
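
    The drift-diffusion mechanism discussed above is easy to simulate directly: noisy evidence accumulates until it hits a bound, yielding both a choice and a reaction time. A hedged sketch with illustrative parameters (this is the standard model, not the paper's specific fit):

```python
# Simulating the drift-diffusion model: Euler-Maruyama accumulation to bound.
import math
import random

random.seed(3)

def ddm_trial(drift=1.0, bound=1.0, sigma=1.0, dt=0.01):
    """One trial; returns (correct_choice, reaction_time)."""
    x, t = 0.0, 0.0
    while abs(x) < bound:
        x += drift * dt + sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
        t += dt
    return x >= bound, t

trials = [ddm_trial() for _ in range(2000)]
accuracy = sum(1 for correct, _ in trials if correct) / len(trials)
# theory for symmetric bounds starting at 0:
# P(correct) = 1 / (1 + exp(-2 * drift * bound / sigma**2)) ~ 0.88 here
print(round(accuracy, 2))
```

    The Bayesian reading of the same process treats x as (a transform of) the log posterior odds of the two alternatives, which is the equivalence the paper makes explicit.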

  16. Construction of monitoring model and algorithm design on passenger security during shipping based on improved Bayesian network.

    PubMed

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    Computation of an objective Bayesian network requires a large amount of data, which is hard to obtain in practice. This paper improves the calculation method of Bayesian networks to obtain a fuzzy-precise Bayesian network, which is then used to reason over the Bayesian network model when data are limited. The security of passengers during shipping is affected by various factors and is hard to predict and control. An index system of factors affecting passenger safety during shipping was established on the basis of multifield coupling theory, and the fuzzy-precise Bayesian network was applied to monitor passenger security during the shipping process. The model was applied to monitor passenger safety during shipping at a shipping company in Hainan, and its effectiveness was examined. This research provides guidance for guaranteeing the security of passengers during shipping.

  17. Construction of Monitoring Model and Algorithm Design on Passenger Security during Shipping Based on Improved Bayesian Network

    PubMed Central

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    Computation of an objective Bayesian network requires a large amount of data, which is hard to obtain in practice. This paper improves the calculation method of Bayesian networks to obtain a fuzzy-precise Bayesian network, which is then used to reason over the Bayesian network model when data are limited. The security of passengers during shipping is affected by various factors and is hard to predict and control. An index system of factors affecting passenger safety during shipping was established on the basis of multifield coupling theory, and the fuzzy-precise Bayesian network was applied to monitor passenger security during the shipping process. The model was applied to monitor passenger safety during shipping at a shipping company in Hainan, and its effectiveness was examined. This research provides guidance for guaranteeing the security of passengers during shipping. PMID:25254227

  18. Hierarchical Bayesian Modeling of Fluid-Induced Seismicity

    NASA Astrophysics Data System (ADS)

    Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.

    2017-11-01

    In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the framework on the Basel 2006 fluid-induced seismicity case study, showing that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
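
    A nonhomogeneous Poisson process with rate proportional to the injection rate, as used above, can be simulated by Lewis thinning: propose candidate events from a homogeneous process at an upper-bound rate and accept each with probability λ(t)/λ_max. The injection schedule and proportionality constant below are hypothetical:

```python
# Thinning simulation of a nonhomogeneous Poisson process whose rate is
# proportional to a (hypothetical) fluid injection rate.
import random

random.seed(0)

def injection_rate(t):
    # hypothetical schedule: constant injection for 5 days, then shut-in
    return 100.0 if t < 5.0 else 0.0

b = 0.2                              # illustrative events per unit injected fluid
lam = lambda t: b * injection_rate(t)
lam_max = 20.0                       # upper bound on lam(t)

def simulate(T=10.0):
    t, events = 0.0, []
    while True:
        t += random.expovariate(lam_max)   # candidate from homogeneous process
        if t > T:
            return events
        if random.random() < lam(t) / lam_max:
            events.append(t)               # accept with prob lam(t)/lam_max

counts = [len(simulate()) for _ in range(500)]
avg = sum(counts) / len(counts)
print(round(avg, 1))  # near the integral of lam: 0.2 * 100 * 5 = 100
```

    In the paper's hierarchical setting, b itself gets a prior and is updated from observed event counts; here it is fixed for illustration.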

  19. Estimating virus occurrence using Bayesian modeling in multiple drinking water systems of the United States

    USGS Publications Warehouse

    Varughese, Eunice A.; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer S; Fout, G. Shay; Furlong, Edward T.; Kolpin, Dana W.; Glassmeyer, Susan T.; Keely, Scott P

    2017-01-01

    incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.

  20. A local approach for focussed Bayesian fusion

    NASA Astrophysics Data System (ADS)

    Sander, Jennifer; Heizmann, Michael; Goussev, Igor; Beyerer, Jürgen

    2009-04-01

    Local Bayesian fusion approaches aim to reduce the high storage and computational costs of Bayesian fusion, which is separated from fixed modeling assumptions. Using the small-world formalism, we argue why this approach conforms with Bayesian theory. Then, we concentrate on the realization of local Bayesian fusion by focussing the fusion process solely on local regions that are task relevant with high probability. The resulting local models then correspond to restricted versions of the original one. In a previous publication, we used bounds for the probability of misleading evidence to show the validity of the pre-evaluation of task-specific knowledge and prior information that we perform to build local models. In this paper, we prove the validity of this approach using information-theoretic arguments. For additional efficiency, local Bayesian fusion can be realized in a distributed manner, in which several local Bayesian fusion tasks are evaluated and unified after the actual fusion process. Software agents are well suited for the practical realization of distributed local Bayesian fusion, and there is a natural analogy between the resulting agent-based architecture and criminal investigations in real life. We show how this analogy can be used to further improve the efficiency of distributed local Bayesian fusion. Using a landscape model, we present an experimental study of distributed local Bayesian fusion in the field of reconnaissance, which highlights its high potential.

  1. Functional form and risk adjustment of hospital costs: Bayesian analysis of a Box-Cox random coefficients model.

    PubMed

    Hollenbeak, Christopher S

    2005-10-15

    While risk-adjusted outcomes are often used to compare the performance of hospitals and physicians, the most appropriate functional form for the risk adjustment process is not always obvious for continuous outcomes such as costs. Semi-log models are used most often to correct skewness in cost data, but there has been limited research to determine whether the log transformation is sufficient or whether another transformation is more appropriate. This study explores the most appropriate functional form for risk-adjusting the cost of coronary artery bypass graft (CABG) surgery. Data included patients undergoing CABG surgery at four hospitals in the Midwest and were fit to a Box-Cox model with random coefficients (BCRC) using Markov chain Monte Carlo methods. Marginal likelihoods and Bayes factors were computed to perform model comparison of alternative model specifications. Rankings of hospital performance were created from the simulation output and the rankings produced by Bayesian estimates were compared to rankings produced by standard models fit using classical methods. Results suggest that, for these data, the most appropriate functional form is not logarithmic, but corresponds to a Box-Cox transformation of -1. Furthermore, Bayes factors overwhelmingly rejected the natural log transformation. However, the hospital ranking induced by the BCRC model was not different from the ranking produced by maximum likelihood estimates of either the linear or semi-log model. Copyright (c) 2005 John Wiley & Sons, Ltd.
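
    The Box-Cox family discussed above nests the common choices as special cases: λ = 1 is a shifted identity, λ = 0 is the log (the semi-log model), and λ = -1 (the value favored here) is a reciprocal transform. A short sketch:

```python
# The Box-Cox power transform: (y^lam - 1) / lam, with the log as the
# continuous limit at lam = 0.
import math

def box_cox(y, lam):
    """Box-Cox transform of y > 0; lam = 0 is log, lam = 1 is y - 1."""
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# lam = -1 (the transform selected by the Bayes factors) maps y to 1 - 1/y
print(box_cox(2.0, -1.0), box_cox(2.0, 1.0), round(box_cox(2.0, 0.0), 3))
# 0.5 1.0 0.693
```

    Treating λ as a parameter with a prior, as the BCRC model does, lets the data choose among these transformations instead of assuming the log.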

  2. Bayesian inversion of the global present-day GIA signal uncertainty from RSL data

    NASA Astrophysics Data System (ADS)

    Caron, Lambert; Ivins, Erik R.; Adhikari, Surendra; Larour, Eric

    2017-04-01

    Various geophysical signals measured in the study of present-day climate change (such as changes in the Earth's gravitational potential, ocean altimetry, or GPS data) include a secular Glacial Isostatic Adjustment (GIA) contribution that has to be corrected for. Yet one of the major challenges that GIA modelling currently struggles with is to accurately determine the uncertainty of the predicted present-day GIA signal. This is especially true at the global scale, where coupling between ice history and mantle rheology greatly contributes to the non-uniqueness of the solutions. Here we propose to use more than 11,000 paleo sea level records to constrain a set of GIA Bayesian inversions and thoroughly explore its parameter space. We include two linearly relaxing models to represent the mantle rheology and couple them with a scalable ice history model in order to better assess the non-uniqueness of the solutions. From the resulting estimates of the probability density function, we then extract maps of the uncertainty in present-day vertical land motion and geoid change due to GIA at the global scale, together with the associated expected signal.

  3. On uncertainty quantification in hydrogeology and hydrogeophysics

    NASA Astrophysics Data System (ADS)

    Linde, Niklas; Ginsbourger, David; Irving, James; Nobile, Fabio; Doucet, Arnaud

    2017-12-01

    Recent advances in sensor technologies, field methodologies, numerical modeling, and inversion approaches have contributed to unprecedented imaging of hydrogeological properties and detailed predictions at multiple temporal and spatial scales. Nevertheless, imaging results and predictions will always remain imprecise, which calls for appropriate uncertainty quantification (UQ). In this paper, we outline selected methodological developments together with pioneering UQ applications in hydrogeology and hydrogeophysics. The applied mathematics and statistics literature is not easy to penetrate and this review aims at helping hydrogeologists and hydrogeophysicists to identify suitable approaches for UQ that can be applied and further developed to their specific needs. To bypass the tremendous computational costs associated with forward UQ based on full-physics simulations, we discuss proxy-modeling strategies and multi-resolution (Multi-level Monte Carlo) methods. We consider Bayesian inversion for non-linear and non-Gaussian state-space problems and discuss how Sequential Monte Carlo may become a practical alternative. We also describe strategies to account for forward modeling errors in Bayesian inversion. Finally, we consider hydrogeophysical inversion, where petrophysical uncertainty is often ignored leading to overconfident parameter estimation. The high parameter and data dimensions encountered in hydrogeological and geophysical problems make UQ a complicated and important challenge that has only been partially addressed to date.

  4. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo

    2017-01-01

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241
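
    The variational Bayes idea described above, approximating posterior distributions by optimization rather than sampling, can be sketched in its simplest conjugate setting: coordinate-ascent updates for the unknown mean and precision of a normal. The priors and synthetic data are illustrative; this is not the paper's G×E model:

```python
# Mean-field coordinate-ascent variational Bayes (CAVI) for
# x_i ~ N(mu, 1/tau), mu | tau ~ N(mu0, 1/(lambda0 tau)), tau ~ Gamma(a0, b0).
import random

random.seed(7)

# synthetic data from N(5, 2^2)
data = [random.gauss(5.0, 2.0) for _ in range(200)]
N = len(data)
xbar = sum(data) / N

# weak priors (illustrative hyperparameters)
mu0, lambda0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3

e_tau = 1.0  # initial guess for E[tau]
for _ in range(50):
    # update q(mu) = N(mu_n, 1/lam_n)
    mu_n = (lambda0 * mu0 + N * xbar) / (lambda0 + N)
    lam_n = (lambda0 + N) * e_tau
    # update q(tau) = Gamma(a_n, b_n), taking expectations under q(mu)
    a_n = a0 + (N + 1) / 2.0
    ss = sum((x - mu_n) ** 2 for x in data) + N / lam_n
    b_n = b0 + 0.5 * (ss + lambda0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n))
    e_tau = a_n / b_n

print(round(mu_n, 1), round(1.0 / e_tau, 1))  # near the sample mean and variance
```

    Each sweep is a closed-form update, which is the source of the order-of-magnitude speedups over MCMC reported above; the cost is that the factorized q typically understates posterior dependence.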

  5. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo

    2017-06-07

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.

  6. Role of Statistical Random-Effects Linear Models in Personalized Medicine.

    PubMed

    Diaz, Francisco J; Yeh, Hung-Wen; de Leon, Jose

    2012-03-01

Some empirical studies and recent developments in pharmacokinetic theory suggest that statistical random-effects linear models are valuable tools for simultaneously describing patient populations as a whole and patients as individuals. This remarkable characteristic indicates that these models may be useful in the development of personalized medicine, which aims at finding treatment regimes that are appropriate for particular patients, not just for the average patient. In fact, published developments show that random-effects linear models may provide a solid theoretical framework for drug dosage individualization in chronic diseases. In particular, individualized dosages computed with these models by means of an empirical Bayesian approach may produce better results than dosages computed with some methods routinely used in therapeutic drug monitoring. This is further supported by published empirical and theoretical findings showing that random-effects linear models may provide accurate representations of phase III and IV steady-state pharmacokinetic data, and may be useful for dosage computations. These models have applications in the design of clinical algorithms for drug dosage individualization in chronic diseases; the computation of dose correction factors; the computation of the minimum number of blood samples from a patient needed to calculate an optimal individualized drug dosage in therapeutic drug monitoring; the measurement of the clinical importance of clinical, demographic, environmental, or genetic covariates; the study of drug-drug interactions in clinical settings; the implementation of computational tools for web-site-based evidence farming; the design of pharmacogenomic studies; and the development of a pharmacological theory of dosage individualization.
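
    The empirical Bayesian individualization described above amounts to shrinking each patient's observed mean response toward the population mean, weighted by between- versus within-patient variability. A minimal sketch under assumed known variance components (simulated data and illustrative numbers, not the authors' dosing algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
# Random-effects model: patient-specific true levels scatter around a
# population mean; repeated measurements add within-patient noise.
mu, tau, sigma = 50.0, 10.0, 8.0        # population mean, between/within SDs
n_pat, n_obs = 200, 3
theta = rng.normal(mu, tau, n_pat)      # true patient-specific levels
y = theta[:, None] + rng.normal(0.0, sigma, (n_pat, n_obs))

ybar = y.mean(axis=1)
# Empirical Bayes posterior mean: shrink each patient's mean toward the
# estimated population mean, weighted by between- vs within-patient variance.
w = tau**2 / (tau**2 + sigma**2 / n_obs)
eb = w * ybar + (1 - w) * ybar.mean()

mse_raw = np.mean((ybar - theta) ** 2)  # raw per-patient means
mse_eb = np.mean((eb - theta) ** 2)     # shrunken (empirical Bayes) estimates
```

    On average the shrunken estimates are closer to each patient's true level than the raw per-patient means, which is the statistical basis for individualized dosing from few blood samples.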

  7. Massive parallelization of serial inference algorithms for a complex generalized linear model

    PubMed Central

    Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David

    2014-01-01

Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computing, including graphics processing units (GPUs), relatively inexpensive and highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics, and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
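
    Cyclic coordinate descent updates one coefficient at a time while holding the others fixed, so the expensive work reduces to inner products that parallelize well. A small sketch for a ridge-penalized (Gaussian-prior MAP) linear model rather than the paper's conditioned logistic GLM; penalty and problem sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 10
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam = 1.0                    # L2 penalty = Gaussian-prior precision (MAP)
beta = np.zeros(p)
for _ in range(100):         # cyclic sweeps; each coordinate update is cheap
    for j in range(p):
        # partial residual with coordinate j's contribution removed
        r = y - X @ beta + X[:, j] * beta[j]
        beta[j] = (X[:, j] @ r) / (X[:, j] @ X[:, j] + lam)

# Closed-form ridge solution for comparison
beta_exact = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

    On this well-conditioned toy problem the sweeps converge to the exact ridge solution; at the scale the paper considers, the per-coordinate inner products are what get mapped onto the GPU.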

  8. Stochastic static fault slip inversion from geodetic data with non-negativity and bound constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-07-01

Although the surface displacements observed by geodesy are linear combinations of slip at faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent the ill-posedness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems provides a rigorous framework in which the a priori information about the sought parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip, or double truncation to impose positivity and upper bounds on slip for interseismic modelling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a truncated multivariate normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulae for the single, 2-D or n-D marginal pdfs. The semi-analytical formula involves the product of a Gaussian and an integral term that can be evaluated using recent developments in TMVN probability calculations. Posterior mean and covariance can also be efficiently derived. I show that the maximum a posteriori (MAP) model can be obtained using a non-negative least-squares algorithm in the singly truncated case, or the bounded-variable least-squares algorithm in the doubly truncated case. I show that the case of independent uniform priors can be approximated using TMVN. 
The numerical equivalence to Bayesian inversions using Markov chain Monte Carlo (MCMC) sampling is shown for a synthetic example and a real case of interseismic modelling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach based on MCMC sampling. First, the need for computing power is greatly reduced. Second, unlike MCMC-based Bayesian approaches, the marginal pdf, mean, variance, and covariance are obtained independently of one another. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the MAP is extremely fast.
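
    The MAP computation described above reduces to a constrained least-squares problem: stacking the whitened data misfit and the whitened prior term gives an augmented system that non-negative least squares (single truncation) or bounded-variable least squares (double truncation) can solve. A hedged toy sketch using SciPy's solvers, with made-up Green's functions, noise levels, and prior scales:

```python
import numpy as np
from scipy.optimize import nnls, lsq_linear

rng = np.random.default_rng(3)
n_obs, n_slip = 30, 8
G = rng.normal(size=(n_obs, n_slip))        # toy "Green's function" matrix
m_true = np.clip(rng.normal(1.0, 1.0, n_slip), 0.0, None)
d = G @ m_true + 0.05 * rng.normal(size=n_obs)

# Truncated-Gaussian MAP: minimize ||(G m - d)/sd||^2 + ||(m - m0)/sp||^2
# subject to the truncation constraints, by stacking data and prior terms.
sd, sp = 0.05, 1.0                          # data / prior standard deviations
m0 = np.zeros(n_slip)                       # prior mean
A = np.vstack([G / sd, np.eye(n_slip) / sp])
b = np.concatenate([d / sd, m0 / sp])

m_map_nnls, _ = nnls(A, b)                  # single truncation: m >= 0
m_map_bvls = lsq_linear(A, b, bounds=(0.0, 2.0)).x  # double: 0 <= m <= 2
```

    Both solvers return the mode of the truncated Gaussian posterior in a fraction of a second, which is the speed advantage the abstract emphasizes over MCMC sampling.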

  9. On the null distribution of Bayes factors in linear regression

    USDA-ARS?s Scientific Manuscript database

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
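
    Under a g-prior, the Bayes factor for a linear regression has a closed form in R², so the null distribution of 2 log(Bayes factor) can be examined directly by simulation. A sketch assuming the standard Zellner g-prior Bayes factor formula (the choices of n, p, and g are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, g = 100, 3, 100.0          # sample size, covariates, g-prior scale

def two_log_bf(y, X):
    # Zellner g-prior Bayes factor of "intercept + p covariates" vs
    # "intercept only": BF10 = (1+g)^((n-p-1)/2) / (1 + g(1-R^2))^((n-1)/2)
    Xc, yc = X - X.mean(0), y - y.mean()
    beta = np.linalg.lstsq(Xc, yc, rcond=None)[0]
    r2 = 1.0 - np.sum((yc - Xc @ beta) ** 2) / np.sum(yc**2)
    return (n - p - 1) * np.log1p(g) - (n - 1) * np.log(1.0 + g * (1.0 - r2))

# Simulate under the null (no covariate effects) to view the null distribution
sims = np.array([two_log_bf(rng.normal(size=n), rng.normal(size=(n, p)))
                 for _ in range(2000)])
```

    A histogram of `sims` shows the shifted, chi-squared-like shape the abstract describes: most mass sits at negative values because, under the null, the Bayes factor favors the intercept-only model.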

  10. Bayesian Gaussian regression analysis of malnutrition for children under five years of age in Ethiopia, EMDHS 2014.

    PubMed

    Mohammed, Seid; Asfaw, Zeytu G

    2018-01-01

The term malnutrition generally refers to both under-nutrition and over-nutrition, but this study uses the term to refer solely to a deficiency of nutrition. In Ethiopia, child malnutrition is one of the most serious public health problems, and its prevalence is among the highest in the world. The purpose of the present study was to identify the high-risk factors of malnutrition, to test different statistical models for childhood malnutrition, and thereafter to weigh the preferable model through model comparison criteria. A Bayesian Gaussian regression model was used to analyze the effect of selected socioeconomic, demographic, health, and environmental covariates on malnutrition in children under five years of age. Inference was made using a Bayesian approach based on Markov chain Monte Carlo (MCMC) simulation techniques in BayesX. The study found that variables such as sex of the child, preceding birth interval, age of the child, father's education level, source of water, mother's body mass index, sex of the head of household, mother's age at birth, wealth index, birth order, diarrhea, child's size at birth, and duration of breastfeeding showed significant effects on children's malnutrition in Ethiopia. The age of the child, mother's age at birth, and mother's body mass index could also be important factors with a non-linear effect on child malnutrition in Ethiopia. Thus, the present study emphasizes special care with respect to variables such as sex of the child, preceding birth interval, father's education level, source of water, sex of the head of household, wealth index, birth order, diarrhea, child's size at birth, duration of breastfeeding, age of the child, mother's age at birth, and mother's body mass index to combat childhood malnutrition in developing countries.

  11. Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum

    2006-01-01

    A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…

  12. Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837

    ERIC Educational Resources Information Center

    Levy, Roy

    2014-01-01

    Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…

  13. Temporal BYY encoding, Markovian state spaces, and space dimension determination.

    PubMed

    Xu, Lei

    2004-09-01

As a complement to the mainstream temporal coding approaches, this paper examines Markovian state-space temporal models from the perspective of temporal Bayesian Ying-Yang (BYY) learning. It offers new insights and results not only on the discrete-state hidden Markov model and its extensions, but also on continuous-state linear state-space models and their extensions. In particular, it presents a new learning mechanism that selects the state number or the dimension of the state space either automatically during adaptive learning, or afterwards via model selection criteria derived from this mechanism. Experiments demonstrate how the proposed approach works.

  14. Bayesian techniques for analyzing group differences in the Iowa Gambling Task: A case study of intuitive and deliberate decision-makers.

    PubMed

    Steingroever, Helen; Pachur, Thorsten; Šmíra, Martin; Lee, Michael D

    2018-06-01

    The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.

  15. Invited commentary: Lost in estimation--searching for alternatives to markov chains to fit complex Bayesian models.

    PubMed

    Molitor, John

    2012-03-01

    Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
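
    One of the simplest non-Markov-chain alternatives alluded to above is the Laplace approximation: fit a Gaussian at the posterior mode. A minimal sketch for a beta-binomial model, chosen because the exact posterior is known and the approximation can be checked (toy numbers, not from the cited paper):

```python
import numpy as np

# Beta-binomial toy model: k successes in n trials, Beta(1, 1) prior,
# so the exact posterior is Beta(a, b) and the approximation is checkable.
k, n = 7, 20
a, b = 1 + k, 1 + (n - k)

# Laplace approximation: Gaussian centered at the posterior mode, with
# variance from the curvature (negative second derivative of log density).
mode = (a - 1) / (a + b - 2)
prec = (a - 1) / mode**2 + (b - 1) / (1 - mode) ** 2
sd = 1.0 / np.sqrt(prec)

exact_mean = a / (a + b)     # for comparison with the Gaussian center
```

    No chains, no convergence diagnostics: the entire "fit" is a mode and a curvature, which is the appeal, and the limitation, of this class of methods.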

  16. The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.

    PubMed

    Norris, Dennis

    2006-04-01

    This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers. ((c) 2006 APA, all rights reserved).
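
    The core assumption, that readers combine a word-frequency prior with noisy letter evidence via Bayes' rule, can be sketched in a few lines for a toy lexicon. The lexicon, error rate, and likelihood model below are illustrative, not the Bayesian reader's actual implementation:

```python
import numpy as np

lexicon = ["cat", "car", "can", "dog"]
freq = np.array([0.4, 0.3, 0.2, 0.1])    # prior = relative word frequency

def letter_lik(obs, word, eps=0.2):
    # P(observed letters | word): each letter is perceived correctly with
    # probability 1 - eps, otherwise confused uniformly (26-letter alphabet).
    return np.prod([1 - eps if o == c else eps / 25.0
                    for o, c in zip(obs, word)])

def posterior(obs):
    post = freq * np.array([letter_lik(obs, w) for w in lexicon])
    return post / post.sum()             # Bayes' rule, normalized

post = posterior("cat")                  # posterior over the lexicon
```

    Because the prior enters multiplicatively, high-frequency words need less perceptual evidence to reach a decision threshold, which is how this family of models produces word-frequency effects on reaction time.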

  17. A statistical assessment of seismic models of the U.S. continental crust using Bayesian inversion of ambient noise surface wave dispersion data

    NASA Astrophysics Data System (ADS)

    Olugboji, T. M.; Lekic, V.; McDonough, W.

    2017-07-01

We present a new approach for evaluating existing crustal models and their associated uncertainties using ambient noise data sets. We use a transdimensional hierarchical Bayesian inversion approach to invert ambient noise surface wave phase dispersion maps for Love and Rayleigh waves using measurements obtained from Ekström (2014). Spatiospectral analysis shows that our results are comparable to a linear least squares inverse approach (except at higher harmonic degrees), but the procedure has additional advantages: (1) it yields an autoadaptive parameterization that follows Earth structure without making restrictive assumptions on model resolution (regularization or damping) and data errors; (2) it can recover non-Gaussian phase velocity probability distributions while quantifying the sources of uncertainties in the data measurements and modeling procedure; and (3) it enables statistical assessments of different crustal models (e.g., CRUST1.0, LITHO1.0, and NACr14) using variable-resolution residual and standard deviation maps estimated from the ensemble. These assessments show that in the stable old crust of the Archean, the misfits are statistically negligible, requiring no significant update to crustal models from the ambient noise data set. In other regions of the U.S., significant updates to regionalization and crustal structure are expected, especially in shallow sedimentary basins and tectonically active regions, where the differences between model predictions and data are statistically significant.

  18. Continuous-time discrete-space models for animal movement

    USGS Publications Warehouse

    Hanks, Ephraim M.; Hooten, Mevin B.; Alldredge, Mat W.

    2015-01-01

    The processes influencing animal movement and resource selection are complex and varied. Past efforts to model behavioral changes over time used Bayesian statistical models with variable parameter space, such as reversible-jump Markov chain Monte Carlo approaches, which are computationally demanding and inaccessible to many practitioners. We present a continuous-time discrete-space (CTDS) model of animal movement that can be fit using standard generalized linear modeling (GLM) methods. This CTDS approach allows for the joint modeling of location-based as well as directional drivers of movement. Changing behavior over time is modeled using a varying-coefficient framework which maintains the computational simplicity of a GLM approach, and variable selection is accomplished using a group lasso penalty. We apply our approach to a study of two mountain lions (Puma concolor) in Colorado, USA.

  19. Bayesian flood forecasting methods: A review

    NASA Astrophysics Data System (ADS)

    Han, Shasha; Coulibaly, Paulin

    2017-08-01

Over the past few decades, floods have been among the most common and widely distributed natural disasters in the world. If floods could be accurately forecasted in advance, their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of the uncertainty associated with hydrologic forecasts is of great importance for flood estimation and rational decision making. The Bayesian forecasting system (BFS) offers an ideal theoretical framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides a suitable theoretical structure, empirically validated models, and reasonable analytic-numerical computation methods, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review of Bayesian forecasting approaches applied to flood forecasting from 1999 to the present. The review starts with an overview of the fundamentals of the BFS and recent advances in it, followed by BFS applications in river stage forecasting and real-time flood forecasting; it then moves to a critical analysis evaluating the advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses future research directions in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way to perform flood estimation: it considers all sources of uncertainty and produces a predictive distribution of the river stage, river discharge, or runoff, thus giving more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. the ensemble Bayesian forecasting system and Bayesian multi-model combination) were shown to overcome the limitations of a single model or fixed model weights and to effectively reduce predictive uncertainty. 
In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvement. Future research in the context of Bayesian flood forecasting should focus on the assimilation of various sources of newly available information and on improving methods for assessing predictive performance.
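
    The Bayesian multi-model combination idea mentioned in the review can be sketched as a toy Bayesian model averaging step: weight each candidate forecast model by its likelihood on past observations, then mix their predictive densities. The Gaussian toy models and numbers below are illustrative; a real BMA scheme would also calibrate the component variances:

```python
import numpy as np

rng = np.random.default_rng(9)
obs = rng.normal(5.0, 1.0, 50)           # past river-stage observations (toy)

# Two candidate forecast models, each a fixed Gaussian predictive density
models = [(4.8, 1.0), (6.0, 1.5)]        # (mean, sd) per model

def loglik(mu, sd):
    # Gaussian log-likelihood of the past observations under one model
    return np.sum(-0.5 * ((obs - mu) / sd) ** 2
                  - np.log(sd * np.sqrt(2.0 * np.pi)))

ll = np.array([loglik(*m) for m in models])
w = np.exp(ll - ll.max())
w /= w.sum()                             # posterior model weights (equal priors)
# The BMA predictive density is the w-weighted mixture of the two Gaussians
```

    The better-fitting model dominates the mixture, which is exactly how multi-model combination avoids committing to a single model or to fixed weights.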

  20. Bayesian modeling of flexible cognitive control

    PubMed Central

    Jiang, Jiefeng; Heller, Katherine; Egner, Tobias

    2014-01-01

    “Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218

  1. Bayesian statistics in medicine: a 25 year review.

    PubMed

    Ashby, Deborah

    2006-11-15

    This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.

  2. Bayesian least squares deconvolution

    NASA Astrophysics Data System (ADS)

    Asensio Ramos, A.; Petit, P.

    2015-11-01

    Aims: We develop a fully Bayesian least squares deconvolution (LSD) that can be applied to the reliable detection of magnetic signals in noise-limited stellar spectropolarimetric observations using multiline techniques. Methods: We consider LSD under the Bayesian framework and we introduce a flexible Gaussian process (GP) prior for the LSD profile. This prior allows the result to automatically adapt to the presence of signal. We exploit several linear algebra identities to accelerate the calculations. The final algorithm can deal with thousands of spectral lines in a few seconds. Results: We demonstrate the reliability of the method with synthetic experiments and we apply it to real spectropolarimetric observations of magnetic stars. We are able to recover the magnetic signals using a small number of spectral lines, together with the uncertainty at each velocity bin. This allows the user to consider if the detected signal is reliable. The code to compute the Bayesian LSD profile is freely available.

  3. Asteroid orbital error analysis: Theory and application

    NASA Technical Reports Server (NTRS)

    Muinonen, K.; Bowell, Edward

    1992-01-01

We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation, the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation gives the same results as Fisherian estimation when no a priori information is assumed (Lehtinen 1988, and references therein).
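
    In the linearized Gaussian setting, the law of error propagation referred to above is just a congruence transform of the covariance matrix by the Jacobian of the mapping. A minimal sketch using a generic range/bearing-to-Cartesian example rather than the orbital elements themselves (all numbers illustrative):

```python
import numpy as np

def propagate(J, Sigma):
    # Linearized law of error propagation: if y = f(x) and Cov(x) = Sigma,
    # then Cov(y) ≈ J Sigma J^T, with J the Jacobian of f at the estimate.
    return J @ Sigma @ J.T

# Toy example: map (range, bearing) uncertainty to Cartesian position
r, th = 10.0, 0.5
Sigma = np.diag([0.1**2, 0.01**2])       # variances of range and bearing
J = np.array([[np.cos(th), -r * np.sin(th)],
              [np.sin(th),  r * np.cos(th)]])
Cxy = propagate(J, Sigma)                # positional covariance (ellipse)
```

    The eigenvectors and eigenvalues of the propagated covariance give the axes of the uncertainty ellipse, the 2-D analogue of the positional uncertainty ellipsoids in the abstract.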

  4. Multiscale Bayesian neural networks for soil water content estimation

    NASA Astrophysics Data System (ADS)

    Jana, Raghavendra B.; Mohanty, Binayak P.; Springer, Everett P.

    2008-08-01

Artificial neural networks (ANNs) have been used for some time now to estimate soil hydraulic parameters from other available or more easily measurable soil properties. However, most such uses of ANNs as pedotransfer functions (PTFs) have been at matching spatial scales (1:1) of inputs and outputs. This approach assumes that the outputs are only required at the same scale as the input data. Unfortunately, this is rarely true. Different hydrologic, hydroclimatic, and contaminant transport models require soil hydraulic parameter data at different spatial scales, depending upon their grid sizes. While conventional (deterministic) ANNs have traditionally been used in these studies, the use of Bayesian training of ANNs is a more recent development. In this paper, we develop a Bayesian framework to derive the soil water retention function, including its uncertainty, at the point or local scale using PTFs trained with coarser-scale Soil Survey Geographic (SSURGO)-based soil data. The approach includes an ANN trained with Bayesian techniques as a PTF tool, with training and validation data collected across spatial extents (scales) in two different regions of the United States. The two study areas include the Las Cruces Trench site in the Rio Grande basin of New Mexico, and the Southern Great Plains 1997 (SGP97) hydrology experimental region in Oklahoma. Each region-specific Bayesian ANN is trained using soil texture and bulk density data from the SSURGO database (scale 1:24,000), and predictions of soil water contents at different pressure heads are made using point-scale (1:1) data as inputs. The resulting outputs are corrected for bias using both linear and nonlinear correction techniques. The results show good agreement between the soil water content values measured at the point scale and those predicted by the Bayesian ANN-based PTFs for both study sites. 
Overall, Bayesian ANNs coupled with nonlinear bias correction are found to be very suitable tools for deriving soil hydraulic parameters at the local/fine scale from soil physical properties available at coarser scales and across different spatial extents. This approach could potentially be used for estimating and downscaling soil hydraulic properties.

  5. Genetic parameters of body weight and ascites in broilers: effect of different incidence rates of ascites syndrome.

    PubMed

    Ahmadpanah, J; Ghavi Hossein-Zadeh, N; Shadparvar, A A; Pakdel, A

    2017-02-01

1. The objectives of the current study were to investigate the effect of the incidence rate (5%, 10%, 20%, 30% and 50%) of ascites syndrome (AS) on the expression of genetic characteristics for body weight at 5 weeks of age (BW5) and AS, and to compare different methods of genetic parameter estimation for these traits. 2. Based on stochastic simulation, a population with discrete generations was created in which random mating was used for 10 generations. Two methods, restricted maximum likelihood and a Bayesian approach via Gibbs sampling, were used for the estimation of genetic parameters. A bivariate model including maternal effects was used. The root mean square error (RMSE) for direct heritabilities was also calculated. 3. The results showed that when the incidence rate of ascites increased from 5% to 30%, the heritability of AS increased from 0.013 and 0.005 to 0.110 and 0.162 for linear and threshold models, respectively. 4. Maternal effects were significant for both BW5 and AS. Genetic correlations decreased with increasing incidence rates of ascites in the population, from 0.678 and 0.587 at a 5% level of ascites to 0.393 and -0.260 at 50% occurrence, for linear and threshold models, respectively. 5. The RMSE of direct heritability from true values for BW5 was greater based on a linear-threshold model than on the linear model of analysis (0.0092 vs. 0.0015). The RMSE of direct heritability from true values for AS was greater based on a linear-linear model (1.21 vs. 1.14). 6. In order to rank birds for ascites incidence, a threshold model is recommended because it resulted in higher heritability estimates than the linear model, and BW5 could be one of the main components of selection goals.

  6. Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology

    PubMed Central

    Murakami, Yohei

    2014-01-01

Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific parameter value with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that it can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with the non-identifiability of representative parameter values, we proposed running the simulations with a parameter ensemble sampled from the posterior distribution, named the “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm for generating a posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference, but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and to conduct model selection based on the Bayes factor. PMID:25089832
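
    For context, the approximate Bayesian computation framework that population annealing accelerates can be illustrated by its simplest member, plain rejection ABC. The toy Gaussian model and tolerance below are illustrative; population annealing replaces this naive loop with an annealed importance-resampling scheme:

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(3.0, 1.0, 100)         # observed data; mean unknown
s_obs = data.mean()                      # summary statistic

# Rejection ABC: draw theta from the prior, simulate data, and keep theta
# whenever the simulated summary lands within tolerance of the observed one.
accepted = []
while len(accepted) < 500:
    theta = rng.uniform(0.0, 10.0)       # prior on the mean
    sim = rng.normal(theta, 1.0, 100)
    if abs(sim.mean() - s_obs) < 0.1:    # tolerance epsilon
        accepted.append(theta)
posterior_sample = np.array(accepted)    # approximate posterior draws
```

    The accepted values concentrate around the true mean, but most prior draws are wasted; that inefficiency at small tolerances is the motivation for sequential schemes such as population annealing.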

  7. Diagnostics for generalized linear hierarchical models in network meta-analysis.

    PubMed

    Zhao, Hong; Hodges, James S; Carlin, Bradley P

    2017-09-01

    Network meta-analysis (NMA) combines direct and indirect evidence comparing more than 2 treatments. Inconsistency arises when these 2 information sources differ. Previous work focuses on inconsistency detection, but little has been done on how to proceed after identifying inconsistency. The key issue is whether inconsistency changes an NMA's substantive conclusions. In this paper, we examine such discrepancies from a diagnostic point of view. Our methods seek to detect influential and outlying observations in NMA at a trial-by-arm level. These observations may have a large effect on the parameter estimates in NMA, or they may deviate markedly from other observations. We develop formal diagnostics for a Bayesian hierarchical model to check the effect of deleting any observation. Diagnostics are specified for generalized linear hierarchical NMA models and investigated for both published and simulated datasets. Results from our example dataset using either contrast- or arm-based models and from the simulated datasets indicate that the sources of inconsistency in NMA tend not to be influential, though results from the example dataset suggest that they are likely to be outliers. This mimics a familiar result from linear model theory, in which outliers with low leverage are not influential. Future extensions include incorporating baseline covariates and individual-level patient data. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Predicting the geographical distribution of two invasive termite species from occurrence data.

    PubMed

    Tonini, Francesco; Divino, Fabio; Lasinio, Giovanna Jona; Hochmair, Hartwig H; Scheffrahn, Rudolf H

    2014-10-01

    Predicting the potential habitat of species under both current and future climate change scenarios is crucial for monitoring invasive species and understanding a species' response to different environmental conditions. Frequently, the only data available on a species is the location of its occurrence (presence-only data). Using occurrence records only, two models were used to predict the geographical distribution of two destructive invasive termite species, Coptotermes gestroi (Wasmann) and Coptotermes formosanus Shiraki. The first model uses a Bayesian linear logistic regression approach adjusted for presence-only data while the second one is the widely used maximum entropy approach (Maxent). Results show that the predicted distributions of both C. gestroi and C. formosanus are strongly linked to urban development. The impact of future scenarios such as climate warming and population growth on the biotic distribution of both termite species was also assessed. Future climate warming seems to affect their projected probability of presence to a lesser extent than population growth. The Bayesian logistic approach outperformed Maxent consistently in all models according to evaluation criteria such as model sensitivity and ecological realism. The importance of further studies for an explicit treatment of residual spatial autocorrelation and a more comprehensive comparison between both statistical approaches is suggested.

  9. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    DTIC Science & Technology

    2017-09-01

    efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to... Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction... scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components

  10. Bayesian Learning and the Psychology of Rule Induction

    ERIC Educational Resources Information Center

    Endress, Ansgar D.

    2013-01-01

    In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to…

  11. Properties of the Bayesian Knowledge Tracing Model

    ERIC Educational Resources Information Center

    van de Sande, Brett

    2013-01-01

    Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…
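
    The hidden Markov form of Bayesian Knowledge Tracing mentioned above has a standard recursive update. The sketch below uses the usual four parameters (initial mastery, slip, guess, learn) with illustrative values rather than fitted ones:

```python
# Bayesian Knowledge Tracing, hidden Markov form (illustrative parameters).
def bkt_predict(p_know, p_slip, p_guess):
    # probability of a correct response given the current mastery estimate
    return p_know * (1 - p_slip) + (1 - p_know) * p_guess

def bkt_update(p_know, correct, p_slip, p_guess, p_learn):
    # Bayes update of mastery given the observed response...
    if correct:
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    # ...then a chance to learn on this practice opportunity
    return posterior + (1 - posterior) * p_learn

p_know = 0.2  # assumed P(L0): initial probability of mastery
trace = []
for correct in [1, 1, 0, 1, 1]:
    trace.append(bkt_predict(p_know, p_slip=0.1, p_guess=0.25))
    p_know = bkt_update(p_know, correct, p_slip=0.1, p_guess=0.25, p_learn=0.3)
```

    Each element of `trace` is the model's predicted probability of a correct answer as a function of the opportunities seen so far, which is the quantity the abstract describes.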

  12. Bayesian Analysis of Longitudinal Data Using Growth Curve Models

    ERIC Educational Resources Information Center

    Zhang, Zhiyong; Hamagami, Fumiaki; Wang, Lijuan Lijuan; Nesselroade, John R.; Grimm, Kevin J.

    2007-01-01

    Bayesian methods for analyzing longitudinal data in social and behavioral research are recommended for their ability to incorporate prior information in estimating simple and complex models. We first summarize the basics of Bayesian methods before presenting an empirical example in which we fit a latent basis growth curve model to achievement data…

  13. Testing students' e-learning via Facebook through Bayesian structural equation modeling.

    PubMed

    Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad

    2017-01-01

    Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.

  14. Testing students’ e-learning via Facebook through Bayesian structural equation modeling

    PubMed Central

    Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad

    2017-01-01

    Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019

  15. Bayesian spatial transformation models with applications in neuroimaging data

    PubMed Central

    Miranda, Michelle F.; Zhu, Hongtu; Ibrahim, Joseph G.

    2013-01-01

    The aim of this paper is to develop a class of spatial transformation models (STM) to spatially model the varying association between imaging measures in a three-dimensional (3D) volume (or 2D surface) and a set of covariates. Our STMs include a varying Box-Cox transformation model for dealing with the issue of non-Gaussian distributed imaging data and a Gaussian Markov Random Field model for incorporating spatial smoothness of the imaging data. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations and real data analysis demonstrate that the STM significantly outperforms the voxel-wise linear model with Gaussian noise in recovering meaningful geometric patterns. Our STM is able to reveal important brain regions with morphological changes in children with attention deficit hyperactivity disorder. PMID:24128143

  16. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.

  17. A new software for deformation source optimization, the Bayesian Earthquake Analysis Tool (BEAT)

    NASA Astrophysics Data System (ADS)

    Vasyura-Bathke, H.; Dutta, R.; Jonsson, S.; Mai, P. M.

    2017-12-01

    Modern studies of crustal deformation and the related source estimation, including magmatic and tectonic sources, increasingly use non-linear optimization strategies to estimate geometric and/or kinematic source parameters, often considering geodetic and seismic data jointly. Bayesian inference is increasingly being used for estimating posterior distributions of deformation source model parameters, given measured/estimated/assumed data and model uncertainties. For instance, some studies consider uncertainties of a layered medium and propagate these into source parameter uncertainties, while others use informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed to efficiently explore the high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational burden of these methods is high and estimation codes are rarely made available along with the published results. Even if the codes are accessible, it is usually challenging to assemble them into a single optimization framework as they are typically coded in different programming languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in deformation source estimation, we undertook the effort of developing BEAT, a python package that comprises all the above-mentioned features in one single programming environment. The package builds on the pyrocko seismological toolbox (www.pyrocko.org), and uses the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat), and we encourage and solicit contributions to the project. Here, we present our strategy for developing BEAT and show application examples, in particular the effect of including the velocity-model prediction uncertainty in subsequent source optimizations: full moment tensor, Mogi source, and a moderate strike-slip earthquake.

  18. Bayesian naturalness, simplicity, and testability applied to the B ‑ L MSSM GUT

    NASA Astrophysics Data System (ADS)

    Fundira, Panashe; Purves, Austin

    2018-04-01

    Recent years have seen increased use of Bayesian model comparison to quantify notions such as naturalness, simplicity, and testability, especially in the area of supersymmetric model building. After demonstrating that Bayesian model comparison can resolve a paradox that has been raised in the literature concerning the naturalness of the proton mass, we apply Bayesian model comparison to GUTs, an area to which it has not been applied before. We find that the GUTs are substantially favored over the nonunifying puzzle model. Of the GUTs we consider, the B ‑ L MSSM GUT is the most favored, but the MSSM GUT is almost equally favored.

  19. Missing-value estimation using linear and non-linear regression with Bayesian gene selection.

    PubMed

    Zhou, Xiaobo; Wang, Xiaodong; Dougherty, Edward R

    2003-11-22

    Data from microarray experiments are usually in the form of large matrices of expression levels of genes under different experimental conditions. Owing to various reasons, there are frequently missing values. Estimating these missing values is important because they affect downstream analysis, such as clustering, classification and network design. Several methods of missing-value estimation are in use. The problem has two parts: (1) selection of genes for estimation and (2) design of an estimation rule. We propose Bayesian variable selection to obtain genes to be used for estimation, and employ both linear and nonlinear regression for the estimation rule itself. Fast implementation issues for these methods are discussed, including the use of QR decomposition for parameter estimation. The proposed methods are tested on data sets arising from hereditary breast cancer and small round blue-cell tumors. The results compare very favorably with currently used methods based on the normalized root-mean-square error. The appendix is available from http://gspsnap.tamu.edu/gspweb/zxb/missing_zxb/ (user: gspweb; passwd: gsplab).
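
    A minimal sketch of the estimation rule described above, using QR decomposition to fit the linear regression that imputes a missing expression value. The data here are synthetic, and Bayesian gene selection is assumed to have already picked the predictor genes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic expression data: the target gene is (nearly) a linear
# function of two predictor genes assumed chosen by gene selection.
X = rng.normal(size=(50, 2))                    # samples x predictor genes
beta_true = np.array([1.5, -2.0])
y = X @ beta_true + 0.01 * rng.normal(size=50)  # target gene expression

# Least-squares fit via QR decomposition: X = QR, then solve R beta = Q^T y.
Q, R = np.linalg.qr(X)
beta = np.linalg.solve(R, Q.T @ y)

# Impute the target gene in a new sample from the fitted linear rule.
x_new = np.array([0.3, -0.7])
imputed = float(x_new @ beta)
```

    Solving the triangular system from the QR factors avoids forming the normal equations, which is the fast-implementation point the abstract raises.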

  20. APPLICATION OF BAYESIAN MONTE CARLO ANALYSIS TO A LAGRANGIAN PHOTOCHEMICAL AIR QUALITY MODEL. (R824792)

    EPA Science Inventory

    Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...

  1. Predicting the multi-domain progression of Parkinson's disease: a Bayesian multivariate generalized linear mixed-effect model.

    PubMed

    Wang, Ming; Li, Zheng; Lee, Eun Young; Lewis, Mechelle M; Zhang, Lijun; Sterling, Nicholas W; Wagner, Daymond; Eslinger, Paul; Du, Guangwei; Huang, Xuemei

    2017-09-25

    It is challenging for current statistical models to predict clinical progression of Parkinson's disease (PD) because of the involvement of multiple domains and longitudinal data. Past univariate longitudinal or multivariate analyses from cross-sectional trials have limited power to predict individual outcomes or outcomes at a single time point. The multivariate generalized linear mixed-effect model (GLMM) under the Bayesian framework was proposed to study multi-domain longitudinal outcomes obtained at baseline, 18, and 36 months. The outcomes included motor, non-motor, and postural instability scores from the MDS-UPDRS, and demographic and standardized clinical data were utilized as covariates. The dynamic prediction was performed for both internal and external subjects using the samples from the posterior distributions of the parameter estimates and random effects, and the predictive accuracy was evaluated based on the root mean square error (RMSE), absolute bias (AB) and the area under the receiver operating characteristic (ROC) curve. First, our prediction model identified clinical data that were differentially associated with motor, non-motor, and postural stability scores. Second, the predictive accuracy of our model for the training data was assessed, and improved prediction was gained, particularly for non-motor scores (RMSE and AB: 2.89 and 2.20), compared to univariate analysis (RMSE and AB: 3.04 and 2.35). Third, the individual-level predictions of longitudinal trajectories for the testing data were performed, with ~80% of observed values falling within the 95% credible intervals. Multivariate generalized linear mixed models hold promise for predicting clinical progression of individual outcomes in PD. The data were obtained from Dr. Xuemei Huang's NIH grant R01 NS060722, part of the NINDS PD Biomarker Program (PDBP). All data were entered within 24 h of collection into the Data Management Repository (DMR), which is publicly available ( https://pdbp.ninds.nih.gov/data-management ).

  2. An hourly PM10 diagnosis model for the Bilbao metropolitan area using a linear regression methodology.

    PubMed

    González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O

    2013-07-01

    There is extensive evidence of the negative health impacts linked to the rise of regional background particulate matter (PM10) levels. These levels are often increased over urban areas, becoming one of the main air pollution concerns. This is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7 years of historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log[NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms selected following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is fitted using two different periods: a training dataset (6 years) and a testing dataset (1 year). The results show that the INT-BIC-based model (R(2) = 0.42) is the best. R values were 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to state-of-the-art methodology. The related error calculated for longer time intervals diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80-0.98 for seasonal means) with respect to shorter periods.
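
    The model-comparison step described above (interaction terms kept or dropped according to an information criterion) can be sketched as follows. This toy example uses the standard BIC rather than Sawa's variant, and the synthetic "PM10" data are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
temp = rng.normal(size=n)
wind = rng.normal(size=n)
# Synthetic "PM10" signal with a genuine temperature x wind interaction.
pm10 = (2.0 + 1.0 * temp - 0.5 * wind + 0.8 * temp * wind
        + rng.normal(scale=0.5, size=n))

def fit_bic(X, y):
    # Ordinary least squares, then BIC = n*ln(RSS/n) + k*ln(n)
    # (standard BIC here; the paper uses Sawa's variant).
    X = np.column_stack([np.ones(len(y)), X])
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(((y - X @ beta) ** 2).sum())
    k = X.shape[1]
    return len(y) * np.log(rss / len(y)) + k * np.log(len(y))

bic_simple = fit_bic(np.column_stack([temp, wind]), pm10)
bic_int = fit_bic(np.column_stack([temp, wind, temp * wind]), pm10)
# A lower BIC favors keeping the interaction term when it is real.
```

    The criterion trades goodness of fit against the extra parameters that interaction terms introduce, which is why the INT-BIC model can beat the full interaction model on held-out data.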

  3. A Bayesian alternative for multi-objective ecohydrological model specification

    NASA Astrophysics Data System (ADS)

    Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori

    2018-01-01

    Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. 
The methodology implemented here provides insight into the usefulness of multi-objective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior distributions in such approaches.

  4. Diagnostic accuracy of a bayesian latent group analysis for the detection of malingering-related poor effort.

    PubMed

    Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina

    2013-01-01

    In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite this, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(z_i | D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.
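
    The group-assignment logic can be illustrated with a much-simplified two-component Bayes classifier on forced-choice scores. The group-level accuracy rates and the prior below are assumed purely for illustration; the paper's latent group model infers such quantities from data rather than fixing them:

```python
import math

def binom_pmf(k, n, p):
    # binomial probability of k correct out of n items
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def p_malingering(k, n, p_honest=0.95, p_malinger=0.35, prior=0.5):
    # Posterior probability of the "poor effort" group given k of n
    # forced-choice items correct; the two group accuracies and the
    # prior here are illustrative assumptions, not fitted values.
    like_m = binom_pmf(k, n, p_malinger)
    like_h = binom_pmf(k, n, p_honest)
    return prior * like_m / (prior * like_m + (1 - prior) * like_h)
```

    A score far below the honest group's expected accuracy drives the posterior toward the poor-effort group, mirroring the individual estimates p(z_i | D) the abstract describes.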

  5. Probabilistic Inference: Task Dependency and Individual Differences of Probability Weighting Revealed by Hierarchical Bayesian Modeling

    PubMed Central

    Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno

    2016-01-01

    Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
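
    A common concrete choice for the (inverted) S-shaped weighting function discussed above is the one-parameter Tversky-Kahneman form, sketched here as an illustration; gamma < 1 overweights small probabilities and underweights large ones:

```python
def prob_weight(p, gamma):
    # Tversky-Kahneman (1992) one-parameter weighting function;
    # gamma = 1 recovers undistorted probabilities.
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# gamma < 1 gives the inverted S-shape: small probabilities are
# overweighted, large ones underweighted (gamma varies per individual).
w_low = prob_weight(0.1, 0.6)
w_high = prob_weight(0.9, 0.6)
```

    In a hierarchical setting, each participant's gamma would be drawn from a weakly informative group-level prior, capturing the individual differences the study reports.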

  6. Estimating Tree Height-Diameter Models with the Bayesian Method

    PubMed Central

    Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei

    2014-01-01

    Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has a distinct advantage over the classical method in that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were each used to estimate the six height-diameter models. Both the classical method and the Bayesian method showed that the Weibull model was the “best” model using data1. In addition, based on the Weibull model, data2 was used to compare the Bayesian method with informative priors against the Bayesian method with uninformative priors and the classical method. The results showed that the improved prediction accuracy of the Bayesian method led to narrower confidence bands of predicted values in comparison to those for the classical method, and the credible bands of parameters with informative priors were also narrower than those with uninformative priors and the classical method. The estimated posterior distributions for the parameters can be set as new priors when estimating the parameters using data2. PMID:24711733

  7. Estimating tree height-diameter models with the Bayesian method.

    PubMed

    Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei

    2014-01-01

    Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has a distinct advantage over the classical method in that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were each used to estimate the six height-diameter models. Both the classical method and the Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used to compare the Bayesian method with informative priors against the Bayesian method with uninformative priors and the classical method. The results showed that the improved prediction accuracy of the Bayesian method led to narrower confidence bands of predicted values in comparison to those for the classical method, and the credible bands of parameters with informative priors were also narrower than those with uninformative priors and the classical method. The estimated posterior distributions for the parameters can be set as new priors when estimating the parameters using data2.
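
    The qualitative findings above (informative priors yield narrower credible bands, and a posterior can serve as the next prior) can be illustrated with a toy conjugate normal model for a single parameter. This is an assumption-laden sketch, not the authors' nonlinear Weibull height-diameter fit:

```python
def normal_posterior(prior_mu, prior_var, data, noise_var):
    # Conjugate normal-normal update for a single mean parameter.
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mu = post_var * (prior_mu / prior_var + sum(data) / noise_var)
    return post_mu, post_var

# Toy "tree height" observations (metres); noise variance assumed known.
data1 = [19.2, 20.1, 18.7, 20.5, 19.8]
mu_u, var_u = normal_posterior(0.0, 1e6, data1, 1.0)   # uninformative prior
mu_i, var_i = normal_posterior(20.0, 0.5, data1, 1.0)  # informative prior

# Sequential use: the posterior from data1 becomes the prior for data2.
data2 = [20.3, 19.9, 20.0]
mu_seq, var_seq = normal_posterior(mu_i, var_i, data2, 1.0)
```

    The informative prior shrinks the posterior variance relative to the diffuse prior, and feeding the data1 posterior in as the prior for data2 narrows it further, matching the pattern the abstract reports.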

  8. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan

    In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic amplitude versus angle (AVA) and controlled source electromagnetic (CSEM) data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo (MCMC) sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis (DREAM) and Adaptive Metropolis (AM) samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and CSEM data. The multi-chain MCMC is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic AVA and CSEM joint inversion provides better estimation of reservoir saturations than the seismic AVA-only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated – reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.

  9. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    NASA Astrophysics Data System (ADS)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Bao, Jie; Swiler, Laura

    2017-12-01

    In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated - reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
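
    The multi-chain idea can be sketched with a minimal vanilla Metropolis sampler (not the DREAM/AM hybrid of the study): several chains are started from dispersed points and their post-burn-in samples are pooled. The 1-D "porosity" posterior below is an assumed toy target:

```python
import math
import random

random.seed(3)

def log_post(x):
    # Assumed toy 1-D "porosity" posterior: Normal(0.25, 0.05**2).
    return -0.5 * ((x - 0.25) / 0.05) ** 2

def metropolis_chain(start, n_steps, step=0.05):
    x, samples = start, []
    for _ in range(n_steps):
        proposal = x + random.gauss(0.0, step)
        # accept with probability min(1, post(proposal) / post(x))
        if math.log(random.random()) < log_post(proposal) - log_post(x):
            x = proposal
        samples.append(x)
    return samples

# Independent chains from dispersed starts; discard burn-in and pool.
chains = [metropolis_chain(s, 4000)[1000:] for s in (0.0, 0.25, 0.5)]
pooled = [x for chain in chains for x in chain]
post_mean = sum(pooled) / len(pooled)
```

    Because the chains are independent they can run in parallel, which is where the near-linear scalability in the number of chains comes from; agreement between chains started far apart is also a basic convergence check.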

  10. Accurate Biomass Estimation via Bayesian Adaptive Sampling

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay

    2005-01-01

    The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) characterization of MISR Rahman model parameters conditioned upon MODIS landcover; c) a rigorous non-parametric Bayesian approach to analytic mixture model determination; d) a unique U.S. asset for science product validation and verification.

  11. Bayesian inversion of marine CSEM data from the Scarborough gas field using a transdimensional 2-D parametrization

    NASA Astrophysics Data System (ADS)

    Ray, Anandaroop; Key, Kerry; Bodin, Thomas; Myer, David; Constable, Steven

    2014-12-01

    We apply a reversible-jump Markov chain Monte Carlo method to sample the Bayesian posterior model probability density function of 2-D seafloor resistivity as constrained by marine controlled source electromagnetic data. This density function of earth models conveys information on which parts of the model space are illuminated by the data. Whereas conventional gradient-based inversion approaches require subjective regularization choices to stabilize this highly non-linear and non-unique inverse problem and provide only a single solution with no model uncertainty information, the method we use entirely avoids model regularization. The result of our approach is an ensemble of models that can be visualized and queried to provide meaningful information about the sensitivity of the data to the subsurface, and the level of resolution of model parameters. We represent models in 2-D using a Voronoi cell parametrization. To make the 2-D problem practical, we use a source-receiver common midpoint approximation with 1-D forward modelling. Our algorithm is transdimensional and self-parametrizing where the number of resistivity cells within a 2-D depth section is variable, as are their positions and geometries. Two synthetic studies demonstrate the algorithm's use in the appraisal of a thin, segmented, resistive reservoir which makes for a challenging exploration target. As a demonstration example, we apply our method to survey data collected over the Scarborough gas field on the Northwest Australian shelf.

  12. Enhancements of Bayesian Blocks; Application to Large Light Curve Databases

    NASA Technical Reports Server (NTRS)

    Scargle, Jeff

    2015-01-01

    Bayesian Blocks are optimal piecewise constant representations (step-function fits) of light curves. The simple algorithm implementing this idea, using dynamic programming, has been extended to include more data modes and fitness metrics, multivariate analysis, and data on the circle (Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations, Scargle, Norris, Jackson and Chiang 2013, ApJ, 764, 167), as well as new results on background subtraction and refinement of the procedure for precise timing of transient events in sparse data. Example demonstrations will include exploratory analysis of the Kepler light curve archive in a search for "star-tickling" signals from extraterrestrial civilizations. (The Cepheid Galactic Internet, Learned, Kudritzki, Pakvasa, and Zee, 2008, arXiv:0809.0339; Walkowicz et al., in progress).
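
    The dynamic-programming optimizer at the heart of Bayesian Blocks can be sketched compactly. Below is a minimal, illustrative Python version for binned event data, using the standard N·log(N/T) block fitness and a constant penalty per change point; the function name and default penalty are assumptions, not the reference implementation.

```python
import numpy as np

def bayesian_blocks(counts, widths, ncp_prior=4.0):
    """Optimal change points for binned event data by dynamic programming.

    counts[i], widths[i]: event count and duration of data cell i.
    Block fitness is the event-data log-likelihood N*log(N/T); ncp_prior
    is a constant penalty per additional change point.
    Returns the start index of each block.
    """
    n = len(counts)
    best = np.zeros(n)            # best[r]: optimal fitness of cells 0..r
    last = np.zeros(n, dtype=int)
    for r in range(n):
        # c[s], w[s]: events and width of a candidate block spanning cells s..r
        c = np.cumsum(counts[r::-1])[::-1].astype(float)
        w = np.cumsum(widths[r::-1])[::-1].astype(float)
        fit = np.where(c > 0, c * np.log(np.maximum(c, 1.0) / w), 0.0) - ncp_prior
        total = fit + np.concatenate(([0.0], best[:r]))
        last[r] = int(np.argmax(total))
        best[r] = total[last[r]]
    cps, r = [], n - 1            # backtrack the optimal partition
    while True:
        cps.append(int(last[r]))
        if last[r] == 0:
            break
        r = last[r] - 1
    return cps[::-1]
```

    For a cell sequence whose rate jumps, the optimizer recovers the change point; the `ncp_prior` penalty trades off sensitivity against spurious blocks.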

  13. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    DTIC Science & Technology

    2016-01-05

    The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on … several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of … which adds specificity to the model and can make nonlinear data more manageable. Early results show that the …

  14. Non-linear Parameter Estimates from Non-stationary MEG Data

    PubMed Central

    Martínez-Vargas, Juan D.; López, Jose D.; Baker, Adam; Castellanos-Dominguez, German; Woolrich, Mark W.; Barnes, Gareth

    2016-01-01

    We demonstrate a method to estimate key electrophysiological parameters from resting state data. In this paper, we focus on the estimation of head-position parameters. The recovery of these parameters is especially challenging, as they are non-linearly related to the measured field. In order to do this, we use an empirical Bayesian scheme to estimate the cortical current distribution due to a range of laterally shifted head-models. We compare different ways of approaching this problem, from dividing the M/EEG data into stationary sections and performing separate source inversions, to explaining all of the M/EEG data with a single inversion. We demonstrate this through estimation of head position in both simulated and empirical resting state MEG data collected using a head-cast. PMID:27597815

  15. A Fast Surrogate-facilitated Data-driven Bayesian Approach to Uncertainty Quantification of a Regional Groundwater Flow Model with Structural Error

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Ye, M.; Liang, F.

    2016-12-01

    Due to simplification and/or misrepresentation of the real aquifer system, numerical groundwater flow and solute transport models are usually subject to model structural error. During model calibration, the hydrogeological parameters may be overly adjusted to compensate for unknown structural error, which may result in biased predictions when models are used to forecast aquifer response to new forcing. In this study, we extend a fully Bayesian method [Xu and Valocchi, 2015] to calibrate a real-world, regional groundwater flow model. The method uses a data-driven error model to describe model structural error and jointly infers model parameters and structural error. Bayesian inference is facilitated using high performance computing and fast surrogate models. The surrogate models are constructed using machine learning techniques to emulate the response simulated by the computationally expensive groundwater model. We demonstrate in this real-world case study that explicitly accounting for model structural error yields parameter posterior distributions that are substantially different from those derived by classical Bayesian calibration that does not account for structural error. In addition, the Bayesian method with the error model gives significantly more accurate predictions along with reasonable credible intervals.

  16. The Bayesian Revolution Approaches Psychological Development

    ERIC Educational Resources Information Center

    Shultz, Thomas R.

    2007-01-01

    This commentary reviews five articles that apply Bayesian ideas to psychological development, some with psychology experiments, some with computational modeling, and some with both experiments and modeling. The reviewed work extends the current Bayesian revolution into tasks often studied in children, such as causal learning and word learning, and…

  17. Random regression models using Legendre polynomials or linear splines for test-day milk yield of dairy Gyr (Bos indicus) cattle.

    PubMed

    Pereira, R J; Bignardi, A B; El Faro, L; Verneque, R S; Vercesi Filho, A E; Albuquerque, L G

    2013-01-01

    Studies investigating the use of random regression models for genetic evaluation of milk production in Zebu cattle are scarce. In this study, 59,744 test-day milk yield records from 7,810 first lactations of purebred dairy Gyr (Bos indicus) and crossbred (dairy Gyr × Holstein) cows were used to compare random regression models in which additive genetic and permanent environmental effects were modeled using orthogonal Legendre polynomials or linear spline functions. Residual variances were modeled considering 1, 5, or 10 classes of days in milk. Five classes fitted the changes in residual variances over the lactation adequately and were used for model comparison. The model that fitted linear spline functions with 6 knots provided the lowest sum of residual variances across lactation. On the other hand, according to the deviance information criterion (DIC) and Bayesian information criterion (BIC), a model using third-order and fourth-order Legendre polynomials for additive genetic and permanent environmental effects, respectively, provided the best fit. However, the high rank correlation (0.998) between this model and that applying third-order Legendre polynomials for both additive genetic and permanent environmental effects indicates that, in practice, the same bulls would be selected by both models. The latter model, which is less parameterized, is a parsimonious option for fitting test-day milk yield records of the dairy Gyr breed. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
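
    For readers unfamiliar with the Legendre parametrization, the covariates entering such a random regression model are straightforward to construct. The sketch below (hypothetical function name; the standard mapping of days in milk to [-1, 1] and the usual sqrt((2k+1)/2) normalization are assumed) builds the third-order normalized Legendre covariables:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min=5, dim_max=305, order=3):
    """Normalized Legendre covariates for days in milk (DIM).

    DIM is mapped to t in [-1, 1]; column k holds the normalized
    Legendre polynomial sqrt((2k+1)/2) * P_k(t).
    """
    t = -1.0 + 2.0 * (np.asarray(dim, float) - dim_min) / (dim_max - dim_min)
    cols = []
    for k in range(order + 1):
        coefs = np.zeros(k + 1)
        coefs[k] = 1.0                      # select P_k
        cols.append(np.sqrt((2 * k + 1) / 2.0) * legendre.legval(t, coefs))
    return np.column_stack(cols)

# covariate rows for three test days across the lactation
Phi = legendre_covariates([5, 155, 305])
```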

  18. Role of Statistical Random-Effects Linear Models in Personalized Medicine

    PubMed Central

    Diaz, Francisco J; Yeh, Hung-Wen; de Leon, Jose

    2012-01-01

    Some empirical studies and recent developments in pharmacokinetic theory suggest that statistical random-effects linear models are valuable tools that allow simultaneous description of patient populations as a whole and of patients as individuals. This remarkable characteristic indicates that these models may be useful in the development of personalized medicine, which aims at finding treatment regimes that are appropriate for particular patients, not just appropriate for the average patient. In fact, published developments show that random-effects linear models may provide a solid theoretical framework for drug dosage individualization in chronic diseases. In particular, individualized dosages computed with these models by means of an empirical Bayesian approach may produce better results than dosages computed with some methods routinely used in therapeutic drug monitoring. This is further supported by published empirical and theoretical findings that show that random-effects linear models may provide accurate representations of phase III and IV steady-state pharmacokinetic data, and may be useful for dosage computations. These models have applications in the design of clinical algorithms for drug dosage individualization in chronic diseases; in the computation of dose correction factors; in the computation of the minimum number of blood samples from a patient needed to calculate an optimal individualized drug dosage in therapeutic drug monitoring; in measuring the clinical importance of clinical, demographic, environmental, or genetic covariates; in the study of drug-drug interactions in clinical settings; in the implementation of computational tools for web-site-based evidence farming; in the design of pharmacogenomic studies; and in the development of a pharmacological theory of dosage individualization. PMID:23467392

  19. Incorporating approximation error in surrogate based Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Zeng, L.; Li, W.; Wu, L.

    2015-12-01

    There is increasing interest in applying surrogates in inverse Bayesian modeling to reduce repetitive evaluations of the original model and thereby save computational cost. However, the approximation error of the surrogate model is usually overlooked, partly because it is difficult to evaluate for many surrogates. Previous studies have shown that the direct combination of surrogates and Bayesian methods (e.g., Markov chain Monte Carlo, MCMC) may lead to biased estimates when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner, but the computational cost remains high because a relatively large number of original model simulations are still required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. A Gaussian process (GP) is chosen to construct the surrogate because its approximation error is convenient to evaluate. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is properly incorporated into the Bayesian framework, promising results can be obtained even when the surrogate is used directly, and no further original model simulations are required.
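
    The central idea, folding the surrogate's predictive variance into the likelihood, can be stated in a few lines. The sketch below uses hypothetical names and assumes a Gaussian likelihood: parameter values where the GP surrogate is uncertain are automatically down-weighted because its predictive variance inflates the total variance.

```python
import numpy as np

def log_likelihood_with_surrogate_error(y_obs, mu_surr, var_surr, var_noise):
    """Gaussian log-likelihood with surrogate approximation error folded in.

    mu_surr, var_surr: predictive mean and variance of a GP surrogate for
    the model output at the candidate parameters; var_noise: measurement-
    error variance. Adding var_surr to var_noise discounts parameter values
    where the surrogate is unreliable.
    """
    var = var_noise + var_surr
    resid = y_obs - mu_surr
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + resid**2 / var)
```

    In an MCMC run, this corrected log-likelihood simply replaces the one that treats the surrogate prediction as exact.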

  20. Moving in Parallel Toward a Modern Modeling Epistemology: Bayes Factors and Frequentist Modeling Methods.

    PubMed

    Rodgers, Joseph Lee

    2016-01-01

    The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.
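
    As a concrete illustration of a Bayes factor (a standard textbook normal-mean example, not taken from the articles under discussion), the marginal likelihoods of two competing models can sometimes be computed analytically:

```python
import math

def bayes_factor_01(ybar, n, sigma=1.0, sigma0=1.0):
    """BF_01 for H0: mu = 0 versus H1: mu ~ N(0, sigma0^2), data ~ N(mu, sigma^2).

    Integrating mu out analytically gives ybar | H1 ~ N(0, sigma^2/n + sigma0^2),
    so both marginal likelihoods are normal densities evaluated at ybar.
    """
    v0 = sigma**2 / n                 # sampling variance of ybar under H0
    v1 = v0 + sigma0**2               # marginal variance of ybar under H1
    log_m0 = -0.5 * (math.log(2 * math.pi * v0) + ybar**2 / v0)
    log_m1 = -0.5 * (math.log(2 * math.pi * v1) + ybar**2 / v1)
    return math.exp(log_m0 - log_m1)  # > 1 favours H0, < 1 favours H1
```

    Unlike a p-value, the ratio quantifies evidence for either model: a sample mean near zero favours H0, while a large sample mean drives BF_01 toward zero.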

  1. A Bayesian Model of the Memory Colour Effect.

    PubMed

    Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R

    2018-01-01

    According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.
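
    The kind of cue integration such a model builds on can be illustrated with a generic Gaussian prior-likelihood combination; the numbers and parametrization below are hypothetical, not the authors'.

```python
def combine_gaussian(prior_mean, prior_var, obs_mean, obs_var):
    """Posterior of a Gaussian prior times a Gaussian likelihood.

    The posterior mean is the precision-weighted average of prior and
    observation; the posterior variance is smaller than either input.
    """
    w = (1.0 / prior_var) / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = w * prior_mean + (1.0 - w) * obs_mean
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    return post_mean, post_var

# A colourimetrically grey stimulus (b* = 0) combined with a broad
# "typical banana yellow" prior (b* = 40; hypothetical numbers) is pulled
# slightly toward yellow, so an achromatic adjustment must be pushed
# slightly toward blue to appear grey.
mean, var = combine_gaussian(40.0, 400.0, 0.0, 25.0)
```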

  2. A Bayesian Model of the Memory Colour Effect

    PubMed Central

    Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R.

    2018-01-01

    According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects. PMID:29760874

  3. Using Bayesian Networks to Improve Knowledge Assessment

    ERIC Educational Resources Information Center

    Millan, Eva; Descalco, Luis; Castillo, Gladys; Oliveira, Paula; Diogo, Sandra

    2013-01-01

    In this paper, we describe the integration and evaluation of an existing generic Bayesian student model (GBSM) into an existing computerized testing system within the Mathematics Education Project (PmatE--Projecto Matematica Ensino) of the University of Aveiro. This generic Bayesian student model had been previously evaluated with simulated…

  4. Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations

    ERIC Educational Resources Information Center

    Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon

    2018-01-01

    To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…

  5. Morphodynamic data assimilation used to understand changing coasts

    USGS Publications Warehouse

    Plant, Nathaniel G.; Long, Joseph W.

    2015-01-01

    Morphodynamic data assimilation blends observations with model predictions and comes in many forms, including linear regression, Kalman filter, brute-force parameter estimation, variational assimilation, and Bayesian analysis. Importantly, data assimilation can be used to identify sources of prediction errors that lead to improved fundamental understanding. Overall, models incorporating data assimilation yield better information to the people who must make decisions impacting safety and wellbeing in coastal regions that experience hazards due to storms, sea-level rise, and erosion. We present examples of data assimilation associated with morphologic change. We conclude that enough morphodynamic predictive capability is available now to be useful to people, and that we will increase our understanding and the level of detail of our predictions through assimilation of observations and numerical-statistical models.
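
    Of the assimilation forms listed, the Kalman filter update is the simplest to sketch. A scalar version (illustrative only, e.g. for a single shoreline-position state) blends a model forecast with an observation in proportion to their uncertainties:

```python
def kalman_update(x_pred, p_pred, y_obs, r_obs):
    """Scalar Kalman filter measurement update: blend forecast and observation."""
    k = p_pred / (p_pred + r_obs)           # Kalman gain
    x_post = x_pred + k * (y_obs - x_pred)  # corrected state
    p_post = (1.0 - k) * p_pred             # reduced uncertainty
    return x_post, p_post

# forecast 10 +/- 2, observation 12 +/- 1: the update lands nearer the observation
x, p = kalman_update(10.0, 4.0, 12.0, 1.0)
```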

  6. Bayesian estimation of a source term of radiation release with approximately known nuclide ratios

    NASA Astrophysics Data System (ADS)

    Tichý, Ondřej; Šmídl, Václav; Hofman, Radek

    2016-04-01

    We are concerned with estimation of a source term in the case of an accidental release from a known location, e.g. a power plant. Usually, the source term of an accidental release of radiation comprises a mixture of nuclides. The gamma dose rate measurements do not provide direct information on the source term composition. However, physical properties of the respective nuclides (deposition properties, decay half-life) can be used when uncertain information on nuclide ratios is available, e.g. from the known reactor inventory. The proposed method is based on a linear inverse model in which the observation vector y arises as a linear combination y = Mx of a source-receptor-sensitivity (SRS) matrix M and the source term x. The task is to estimate the unknown source term x. The problem is ill-conditioned, and further regularization is needed to obtain a reasonable solution. In this contribution, we assume that the nuclide ratios of the release are known with some degree of uncertainty. This knowledge is used to form the prior covariance matrix of the source term x. Due to uncertainty in the ratios, the diagonal elements of the covariance matrix are considered to be unknown. Positivity of the source term estimate is guaranteed by using a multivariate truncated Gaussian distribution. Following the Bayesian approach, we estimate all parameters of the model from the data, so that y, M, and the known ratios are the only inputs of the method. Since exact inference in the model is intractable, we follow the Variational Bayes method, yielding an iterative algorithm for estimation of all model parameters. Performance of the method is studied on a simulated 6-hour power plant release in which 3 nuclides are released and 2 nuclide ratios are approximately known. A comparison with a method with unknown nuclide ratios is given to demonstrate the usefulness of the proposed approach.
This research is supported by EEA/Norwegian Financial Mechanism under project MSMT-28477/2014 Source-Term Determination of Radionuclide Releases by Inverse Atmospheric Dispersion Modelling (STRADI).
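
    A simplified, point-estimate analogue of this inversion is easy to sketch: with a zero-mean Gaussian prior truncated at zero, the MAP estimate of the source term reduces to non-negative least squares on an augmented system. The function name is hypothetical, and this omits the Variational Bayes estimation of the prior variances described above.

```python
import numpy as np
from scipy.optimize import nnls

def map_source_term(M, y, prior_sd):
    """MAP estimate of a non-negative source term x for y = Mx.

    A N(0, diag(prior_sd^2)) prior truncated to x >= 0 turns the MAP
    problem into non-negative least squares on [M; diag(1/prior_sd)].
    """
    L = np.diag(1.0 / np.asarray(prior_sd, float))
    A = np.vstack([M, L])                      # stack data and prior terms
    b = np.concatenate([y, np.zeros(M.shape[1])])
    x, _ = nnls(A, b)                          # enforces x >= 0 exactly
    return x
```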

  7. Mapping, Bayesian Geostatistical Analysis and Spatial Prediction of Lymphatic Filariasis Prevalence in Africa

    PubMed Central

    Slater, Hannah; Michael, Edwin

    2013-01-01

    There is increasing interest in controlling or eradicating the major neglected tropical diseases. Accurate modelling of the geographic distributions of parasitic infections will be crucial to this endeavour. We used 664 community-level infection prevalence data points collated from the published literature in conjunction with eight environmental variables, altitude and population density, and a multivariate Bayesian generalized linear spatial model that allows explicit accounting for spatial autocorrelation and incorporation of uncertainty in input data and model parameters, to construct the first spatially explicit map describing LF prevalence distribution in Africa. We also ran the best-fit model against predictions made by the HADCM3 and CCCMA climate models for 2050 to predict the likely distributions of LF under future climate and population changes. We show that LF prevalence is strongly influenced by spatial autocorrelation between locations but is only weakly associated with environmental covariates. Infection prevalence, however, is found to be related to variations in population density. All associations with key environmental/demographic variables appear to be complex and non-linear. LF prevalence is predicted to be highly heterogeneous across Africa, with high prevalences (>20%) estimated to occur primarily along coastal West and East Africa, and the lowest prevalences predicted for the central part of the continent. Error maps, however, indicate a need for further surveys to overcome problems with data scarcity in the latter and other regions. Analysis of future changes in prevalence indicates that population growth, rather than climate change per se, will represent the dominant factor in the predicted increase/decrease and spread of LF on the continent.
We indicate that these results could play an important role in aiding the development of strategies that are best able to achieve the goals of parasite elimination locally and globally in a manner that may also account for the effects of future climate change on parasitic infection. PMID:23951194

  8. Sub-seasonal-to-seasonal Reservoir Inflow Forecast using Bayesian Hierarchical Hidden Markov Model

    NASA Astrophysics Data System (ADS)

    Mukhopadhyay, S.; Arumugam, S.

    2017-12-01

    Sub-seasonal-to-seasonal (S2S) (15-90 days) streamflow forecasting is an emerging area of research that provides seamless information for reservoir operation from weather time scales to seasonal time scales. From an operational perspective, sub-seasonal inflow forecasts are highly valuable, as they enable water managers to decide short-term releases (15-30 days) while holding water for seasonal needs (e.g., irrigation and municipal supply) and meeting end-of-season target storage at a desired level. We propose a Bayesian Hierarchical Hidden Markov Model (BHHMM) to develop S2S inflow forecasts for the Tennessee Valley Authority (TVA) reservoir system. Here, the hidden states are predicted by relevant indices that influence the inflows at the S2S time scale. The hidden Markov model also captures both the spatial and temporal hierarchy in predictors that operate at the S2S time scale, with model parameters estimated as a posterior distribution using a Bayesian framework. We present our work in two steps, namely a single-site model and a multi-site model. As proof of concept, we consider inflows to Douglas Dam, Tennessee, in the single-site model. For the multi-site model, we consider reservoirs in the upper Tennessee valley. Streamflow forecasts are issued and updated continuously every day at the S2S time scale. We consider precipitation forecasts obtained from the NOAA Climate Forecast System (CFSv2) GCM as predictors for developing S2S streamflow forecasts, along with relevant indices for predicting the hidden states. Spatial dependence of the inflow series of the reservoirs is also preserved in the multi-site model. To circumvent the non-normality of the data, we consider the HMM in a generalized linear model setting. Skill of the proposed approach is tested using split-sample validation against a traditional multi-site canonical correlation model developed using the same set of predictors. 
From the posterior distribution of the inflow forecasts, we also highlight different system behavior under varied global- and local-scale climatic influences from the developed BHHMM.
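
    The likelihood computation underlying any such hidden Markov model is the forward recursion, sketched below in a generic, scaled form (illustrative only; the model above additionally places hierarchical priors on these quantities):

```python
import numpy as np

def hmm_forward(pi0, A, B, obs):
    """Forward algorithm: log-likelihood of a discrete observation sequence.

    pi0: initial state probabilities; A[i, j]: transition i -> j;
    B[i, k]: probability of emitting symbol k in state i.
    """
    loglik = 0.0
    alpha = pi0 * B[:, obs[0]]
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]  # propagate, then weight by emission
        c = alpha.sum()
        loglik += np.log(c)
        alpha = alpha / c                       # rescale to avoid underflow
    return loglik
```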

  9. BCM: toolkit for Bayesian analysis of Computational Models using samplers.

    PubMed

    Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

    2016-10-21

    Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop shop for computational modelers wishing to use sampler-based Bayesian statistics.

  10. Two Approaches to Calibration in Metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campanelli, Mark

    2014-04-01

    Inferring mathematical relationships with quantified uncertainty from measurement data is common to computational science and metrology. Sufficient knowledge of measurement process noise enables Bayesian inference. Otherwise, an alternative approach is required, here termed compartmentalized inference, because collection of uncertain data and model inference occur independently. Bayesian parameterized model inference is compared to a Bayesian-compatible compartmentalized approach for ISO-GUM compliant calibration problems in renewable energy metrology. In either approach, model evidence can help reduce model discrepancy.

  11. Impact of relationships between test and training animals and among training animals on reliability of genomic prediction.

    PubMed

    Wu, X; Lund, M S; Sun, D; Zhang, Q; Su, G

    2015-10-01

    One of the factors affecting the reliability of genomic prediction is the relationship among the animals of interest. This study investigated the reliability of genomic prediction in various scenarios with regard to the relationship between test and training animals, and among animals within the training data set. Different training data sets were generated from EuroGenomics data and a group of Nordic Holstein bulls (born in 2005 and afterwards) as a common test data set. Genomic breeding values were predicted using a genomic best linear unbiased prediction model and a Bayesian mixture model. The results showed that a closer relationship between test and training animals led to a higher reliability of genomic predictions for the test animals, while a closer relationship among training animals resulted in a lower reliability. In addition, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction, especially for the scenario of distant relationships between training and test animals. Therefore, to prevent a decrease in reliability, constant updates of the training population with animals from more recent generations are required. Moreover, a training population consisting of less-related animals is favourable for reliability of genomic prediction. © 2015 Blackwell Verlag GmbH.
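
    Genomic BLUP is equivalent to ridge regression of phenotypes on markers. A minimal SNP-BLUP sketch follows, with a simplified shrinkage parameter (the Bayesian mixture model of the study instead places a mixture prior on the marker effects):

```python
import numpy as np

def snp_blup(Z, y, h2):
    """Ridge-regression (SNP-BLUP) estimates of marker effects.

    Z: centred marker matrix (n animals x m markers); y: phenotypes;
    h2: heritability. lambda = m * (1 - h2) / h2 is a common shrinkage
    choice (simplified here by assuming unit marker variances).
    """
    n, m = Z.shape
    lam = m * (1.0 - h2) / h2
    beta = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)
    return beta   # genomic value of a new animal: z_new @ beta
```

    The dependence of reliability on relationships falls out of this formulation: a test animal whose marker genotypes resemble the training rows of Z is predicted with less shrinkage-induced error.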

  12. Uncovering a latent multinomial: Analysis of mark-recapture data with misidentification

    USGS Publications Warehouse

    Link, W.A.; Yoshizaki, J.; Bailey, L.L.; Pollock, K.H.

    2010-01-01

    Natural tags based on DNA fingerprints or natural features of animals are now becoming very widely used in wildlife population biology. However, classic capture-recapture models do not allow for misidentification of animals which is a potentially very serious problem with natural tags. Statistical analysis of misidentification processes is extremely difficult using traditional likelihood methods but is easily handled using Bayesian methods. We present a general framework for Bayesian analysis of categorical data arising from a latent multinomial distribution. Although our work is motivated by a specific model for misidentification in closed population capture-recapture analyses, with crucial assumptions which may not always be appropriate, the methods we develop extend naturally to a variety of other models with similar structure. Suppose that observed frequencies f are a known linear transformation f = A′x of a latent multinomial variable x with cell probability vector π = π(θ). Given that full conditional distributions [θ | x] can be sampled, implementation of Gibbs sampling requires only that we can sample from the full conditional distribution [x | f, θ], which is made possible by knowledge of the null space of A′. We illustrate the approach using two data sets with individual misidentification, one simulated, the other summarizing recapture data for salamanders based on natural marks. © 2009, The International Biometric Society.

  13. Uncovering a Latent Multinomial: Analysis of Mark-Recapture Data with Misidentification

    USGS Publications Warehouse

    Link, W.A.; Yoshizaki, J.; Bailey, L.L.; Pollock, K.H.

    2009-01-01

    Natural tags based on DNA fingerprints or natural features of animals are now becoming very widely used in wildlife population biology. However, classic capture-recapture models do not allow for misidentification of animals which is a potentially very serious problem with natural tags. Statistical analysis of misidentification processes is extremely difficult using traditional likelihood methods but is easily handled using Bayesian methods. We present a general framework for Bayesian analysis of categorical data arising from a latent multinomial distribution. Although our work is motivated by a specific model for misidentification in closed population capture-recapture analyses, with crucial assumptions which may not always be appropriate, the methods we develop extend naturally to a variety of other models with similar structure. Suppose that observed frequencies f are a known linear transformation f = A′x of a latent multinomial variable x with cell probability vector π = π(θ). Given that full conditional distributions [θ | x] can be sampled, implementation of Gibbs sampling requires only that we can sample from the full conditional distribution [x | f, θ], which is made possible by knowledge of the null space of A′. We illustrate the approach using two data sets with individual misidentification, one simulated, the other summarizing recapture data for salamanders based on natural marks.
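
    The null-space construction can be made concrete with a toy example: a proposal that moves the latent counts x along an integer null vector of A′ leaves f = A′x unchanged, so a Metropolis step needs only the multinomial likelihood ratio. This is a simplified sketch of that Gibbs component, with hypothetical names; the null-basis rows are assumed to satisfy A′v = 0.

```python
import numpy as np
from math import lgamma

def log_multinomial(x, p):
    """Multinomial log-pmf of counts x with cell probabilities p."""
    return (lgamma(x.sum() + 1)
            - sum(lgamma(xi + 1) for xi in x)
            + float(np.sum(x * np.log(p))))

def sample_latent_counts(x0, null_basis, p, n_iter=1000, seed=0):
    """Metropolis sampler for [x | f, theta].

    Each proposal adds +/- one integer null vector of A' to x, so the
    observed frequencies f = A'x are preserved exactly; proposals with
    negative counts are rejected.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=int)
    for _ in range(n_iter):
        v = null_basis[rng.integers(len(null_basis))] * rng.choice([-1, 1])
        xp = x + v
        if (xp >= 0).all() and np.log(rng.random()) < log_multinomial(xp, p) - log_multinomial(x, p):
            x = xp
    return x
```

    For A′ = [1, 1] (only the total count observed), the single null vector is [1, -1], and the sampler redistributes the fixed total across the two latent cells.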

  14. A Bayesian approach to truncated data sets: An application to Malmquist bias in Supernova Cosmology

    NASA Astrophysics Data System (ADS)

    March, Marisa Cristina

    2018-01-01

    A problem commonly encountered in statistical analysis of data is that of truncated data sets. A truncated data set is one in which a number of data points are completely missing from a sample; this is in contrast to a censored sample, in which partial information is missing from some data points. In astrophysics this problem is commonly seen in a magnitude-limited survey, where the survey is incomplete at fainter magnitudes, that is, certain faint objects are simply not observed. The effect of this 'missing data' is manifested as Malmquist bias and can result in biased parameter inference if it is not accounted for. In frequentist methodologies the Malmquist bias is often corrected for by analysing many simulations and computing the appropriate correction factors. One problem with this methodology is that the corrections are model dependent. In this poster we derive a Bayesian methodology for accounting for truncated data sets in problems of parameter inference and model selection. We first show the methodology for a simple Gaussian linear model and then go on to show how to account for a truncated data set in the case of cosmological parameter inference with a magnitude-limited type Ia supernova survey.
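
    The essential likelihood correction for a truncated sample is renormalization by the selection probability. A sketch for a normal model with an upper cut (standing in for a magnitude limit; illustrative only, not the poster's cosmological likelihood):

```python
import numpy as np
from scipy.stats import norm

def truncated_loglik(mu, sigma, data, cut):
    """Log-likelihood of N(mu, sigma^2) data observed only when x < cut.

    Each density is renormalized by P(x < cut | mu, sigma); dropping the
    logcdf term is what produces Malmquist-biased estimates.
    """
    return np.sum(norm.logpdf(data, mu, sigma) - norm.logcdf(cut, mu, sigma))
```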

  15. Bayesian data analysis in population ecology: motivations, methods, and benefits

    USGS Publications Warehouse

    Dorazio, Robert

    2016-01-01

    During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.

  16. [Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].

    PubMed

    Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

    2017-03-10

    To evaluate the estimation of prevalence ratio (PR) by using a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence relative to caregivers' recognition of risk signs of diarrhea in their infants by using a Bayesian log-binomial regression model in OpenBUGS software. The results showed that caregivers' recognition of infants' risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. Meanwhile, we compared the point and interval estimates of the PR of medical care-seeking prevalence relative to caregivers' recognition of risk signs of diarrhea, and the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for distance between village and township and child month-age based on model 2), between the Bayesian log-binomial regression model and the conventional log-binomial regression model. The results showed that all three Bayesian log-binomial regression models converged and the estimated PRs were 1.130 (95%CI: 1.005-1.265), 1.128 (95%CI: 1.001-1.264) and 1.132 (95%CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged and their PRs were 1.130 (95%CI: 1.055-1.206) and 1.126 (95%CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95%CI: 1.051-1.200). In addition, the point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed only slightly from those of the conventional log-binomial regression model, with good consistency in estimating the PR. Therefore, the Bayesian log-binomial regression model can effectively estimate the PR with less risk of non-convergence and has more advantages in application compared with the conventional log-binomial regression model.
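
    What the PR measures can be sketched with a hypothetical 2x2 table (the counts below are invented for illustration, not the paper's data). A Bayesian log-binomial model estimates the same quantity with priors on the log-scale coefficients; here we show only the frequentist point estimate and a Wald interval on the log scale.

```python
import math

# Hypothetical counts (not the paper's data): exposure = caregiver recognised
# the risk signs or not; outcome = sought medical care.
a, n1 = 180, 200   # events / total among the exposed
b, n0 = 320, 400   # events / total among the unexposed

p1, p0 = a / n1, b / n0
pr = p1 / p0       # prevalence ratio, the quantity a log-binomial model targets

# Wald 95% CI on the log scale (frequentist analogue of the paper's intervals)
se = math.sqrt((1 - p1) / a + (1 - p0) / b)
lo = math.exp(math.log(pr) - 1.96 * se)
hi = math.exp(math.log(pr) + 1.96 * se)
print(round(pr, 3), round(lo, 3), round(hi, 3))
```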

  17. A Preliminary Bayesian Analysis of Incomplete Longitudinal Data from a Small Sample: Methodological Advances in an International Comparative Study of Educational Inequality

    ERIC Educational Resources Information Center

    Hsieh, Chueh-An; Maier, Kimberly S.

    2009-01-01

    The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…

  18. Tutorial: Asteroseismic Stellar Modelling with AIMS

    NASA Astrophysics Data System (ADS)

    Lund, Mikkel N.; Reese, Daniel R.

    The goal of aims (Asteroseismic Inference on a Massive Scale) is to estimate stellar parameters and credible intervals/error bars in a Bayesian manner from a set of asteroseismic frequency data and so-called classical constraints. To achieve reliable parameter estimates and computational efficiency, it searches through a grid of pre-computed models using an MCMC algorithm—interpolation within the grid of models is performed by first tessellating the grid using a Delaunay triangulation and then doing a linear barycentric interpolation on matching simplexes. Inputs for the modelling consist of individual frequencies from peak-bagging, which can be complemented with classical spectroscopic constraints. aims is mostly written in Python with a modular structure to facilitate contributions from the community. Only a few computationally intensive parts have been rewritten in Fortran in order to speed up calculations.
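
    The interpolation step can be sketched in miniature: within one simplex, barycentric interpolation blends the vertex values and is exact for linear functions. The triangle and values below are illustrative; building the tessellation itself (e.g. with scipy.spatial.Delaunay, an assumption about tooling) is omitted.

```python
# For a 2-D simplex (triangle), the barycentric weights of a query point come
# from a small linear solve; the interpolant is the weight-blend of the vertex
# values. AIMS applies the same idea inside each Delaunay simplex of its grid.

def barycentric_weights(tri, p):
    (x1, y1), (x2, y2), (x3, y3) = tri
    px, py = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (px - x3) + (x3 - x2) * (py - y3)) / det
    w2 = ((y3 - y1) * (px - x3) + (x1 - x3) * (py - y3)) / det
    return w1, w2, 1.0 - w1 - w2

def interpolate(tri, values, p):
    return sum(w * v for w, v in zip(barycentric_weights(tri, p), values))

# Barycentric interpolation is exact for linear functions: f(x, y) = 2x + 3y + 1
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vals = [1.0, 3.0, 4.0]          # f evaluated at the three vertices
print(interpolate(tri, vals, (0.25, 0.25)))   # f(0.25, 0.25) = 2.25
```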

  19. Sample size and classification error for Bayesian change-point models with unlabelled sub-groups and incomplete follow-up.

    PubMed

    White, Simon R; Muniz-Terrera, Graciela; Matthews, Fiona E

    2018-05-01

    Many medical (and ecological) processes involve a change of shape, whereby one trajectory changes into another at a specific time point. There has been little investigation into the study design needed to investigate these models. We consider the class of fixed-effect change-point models with an underlying shape comprising two joined linear segments, also known as broken-stick models. We extend this model to include two sub-groups with different trajectories at the change-point, a change class and a no-change class, and also include a missingness model to account for individuals with incomplete follow-up. Through a simulation study, we consider the relationship of sample size to the estimates of the underlying shape, the existence of a change-point, and the classification error of sub-group labels. We use a Bayesian framework to account for the missing labels, and the analysis of each simulation is performed using standard Markov chain Monte Carlo techniques. Our simulation study is inspired by cognitive decline as measured by the Mini-Mental State Examination, where our extended model is appropriate due to the commonly observed mixture of individuals within studies who do or do not exhibit accelerated decline. We find that even for studies of modest size (n = 500, with 50 individuals observed past the change-point) in the fixed-effect setting, a change-point can be detected and reliably estimated across a range of observation errors.
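
    The broken-stick shape can be sketched with a toy, noiseless profile fit (illustrative parameters, not the paper's MMSE model): represent the trajectory with a hinge basis and grid-search the change-point that minimises the least-squares error.

```python
# A noiseless broken-stick trajectory: intercept 28, slope -0.2 before the
# change-point at t = 6, slope -1.0 after (illustrative values only).
def traj(t, cp=6.0, b0=28.0, s1=-0.2, s2=-1.0):
    return b0 + s1 * min(t, cp) + s2 * max(0.0, t - cp)

ts = [0.5 * i for i in range(25)]           # observations at t = 0, 0.5, ..., 12
ys = [traj(t) for t in ts]

def solve3(A, b):
    # Gaussian elimination with partial pivoting for a 3x3 linear system
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def sse(cp):
    # least-squares fit of y = b0 + b1*t + b2*max(0, t - cp): a hinge basis
    X = [[1.0, t, max(0.0, t - cp)] for t in ts]
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    rhs = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(3)]
    beta = solve3(A, rhs)
    return sum((y - sum(b * xi for b, xi in zip(beta, x))) ** 2
               for x, y in zip(X, ys))

# Profile the change-point over a grid; the true kink gives a perfect fit.
best_cp = min((sse(c), c) for c in [4.0 + 0.5 * k for k in range(9)])[1]
print(best_cp)
```

    The full Bayesian version replaces the grid search with a prior over the change-point and adds sub-group labels and a missingness model, but the profiled least-squares fit is the shape being estimated.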

  20. Identifiability of PBPK Models with Applications to ...

    EPA Pesticide Factsheets

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss different types of identifiability that occur in PBPK models and give reasons why they occur. We particularly focus on how the mathematical structure of a PBPK model and lack of appropriate data can lead to statistical models in which it is impossible to estimate at least some parameters precisely. Methods are reviewed which can determine whether a purely linear PBPK model is globally identifiable. We propose a theorem which determines when identifiability at a set of finite and specific values of the mathematical PBPK model (global discrete identifiability) implies identifiability of the statistical model. However, we are unable to establish conditions that imply global discrete identifiability, and conclude that the only safe approach to analysis of PBPK models involves Bayesian analysis with truncated priors. Finally, computational issues regarding posterior simulations of PBPK models are discussed. The methodology is very general and can be applied to numerous PBPK models which can be expressed as linear time-invariant systems. A real data set of a PBPK model for exposure to dimethyl arsinic acid (DMA(V)) is presented to illustrate the proposed methodology.

  1. Bayesian spatial transformation models with applications in neuroimaging data.

    PubMed

    Miranda, Michelle F; Zhu, Hongtu; Ibrahim, Joseph G

    2013-12-01

    The aim of this article is to develop a class of spatial transformation models (STM) to spatially model the varying association between imaging measures in a three-dimensional (3D) volume (or 2D surface) and a set of covariates. The proposed STM include a varying Box-Cox transformation model for dealing with the issue of non-Gaussian distributed imaging data and a Gaussian Markov random field model for incorporating spatial smoothness of the imaging data. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations and real data analysis demonstrate that the STM significantly outperforms the voxel-wise linear model with Gaussian noise in recovering meaningful geometric patterns. Our STM is able to reveal important brain regions with morphological changes in children with attention deficit hyperactivity disorder. © 2013, The International Biometric Society.

  2. Inferring phenomenological models of Markov processes from data

    NASA Astrophysics Data System (ADS)

    Rivera, Catalina; Nemenman, Ilya

    Microscopically accurate modeling of stochastic dynamics of biochemical networks is hard due to the extremely high dimensionality of the state space of such networks. Here we propose an algorithm for inference of phenomenological, coarse-grained models of Markov processes describing the network dynamics directly from data, without the intermediate step of microscopically accurate modeling. The approach relies on the linear nature of the Chemical Master Equation and uses Bayesian model selection for identification of parsimonious models that fit the data. When applied to synthetic data from the kinetic proofreading (KPR) process, a common mechanism used by cells for increasing the specificity of molecular assembly, the algorithm successfully uncovers the known coarse-grained description of the process. This phenomenological description had been noticed previously, but here it is derived in an automated manner by the algorithm. James S. McDonnell Foundation Grant No. 220020321.

  3. Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-04-01

    Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), which is the normalizing constant in the denominator of Bayes' theorem; while it is fundamental for model selection, the evidence is not required for Bayesian parameter inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support it receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones selected by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). 
Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
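
    The role of the evidence can be illustrated on a conjugate Gaussian toy model, where the marginal likelihood has a closed form and a naive Monte Carlo average over prior draws can be checked against it. This is a sketch of the quantity GMIS estimates, not of the GMIS algorithm itself.

```python
import math, random

random.seed(1)
S2, T2, M0 = 1.0, 1.0, 0.0          # known noise var, prior var, prior mean
ys = [0.3, -0.2, 0.9, 0.1, 0.5]     # a tiny synthetic data set
n = len(ys)

def log_lik(mu):
    return sum(-0.5 * math.log(2 * math.pi * S2) - (y - mu) ** 2 / (2 * S2)
               for y in ys)

# Closed-form evidence for the Normal-Normal conjugate pair (ground truth)
A = n / S2 + 1.0 / T2               # posterior precision
b = sum(ys) / S2 + M0 / T2
log_Z = (-0.5 * n * math.log(2 * math.pi * S2)
         - 0.5 * math.log(T2 * A)
         - 0.5 * (sum(y * y for y in ys) / S2 + M0 ** 2 / T2 - b ** 2 / A))

# Brute-force Monte Carlo estimate: average the likelihood over prior draws.
# (GMIS replaces this naive average with bridge sampling from a fitted mixture,
# which is far more efficient in higher dimensions.)
draws = 100_000
Z_mc = sum(math.exp(log_lik(random.gauss(M0, math.sqrt(T2))))
           for _ in range(draws)) / draws
print(round(log_Z, 4), round(math.log(Z_mc), 4))
```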

  4. Model verification of large structural systems. [space shuttle model response

    NASA Technical Reports Server (NTRS)

    Lee, L. T.; Hasselman, T. K.

    1978-01-01

    A computer program for the application of parameter identification on the structural dynamic models of space shuttle and other large models with hundreds of degrees of freedom is described. Finite element, dynamic, analytic, and modal models are used to represent the structural system. The interface with math models is such that output from any structural analysis program applied to any structural configuration can be used directly. Processed data from either sine-sweep tests or resonant dwell tests are directly usable. The program uses measured modal data to condition the prior analytic model so as to improve the frequency match between model and test. A Bayesian estimator generates an improved analytical model, and a linear estimator is used in an iterative fashion on highly nonlinear equations. Mass and stiffness scaling parameters are generated for an improved finite element model, and the optimum set of parameters is obtained in one step.

  5. A Bayesian model averaging method for improving SMT phrase table

    NASA Astrophysics Data System (ADS)

    Duan, Nan

    2013-03-01

    Previous methods for improving translation quality by employing multiple SMT models are usually carried out as a second-pass decision procedure on hypotheses from multiple systems, using extra features instead of exploiting the features of existing models in more depth. In this paper, we propose translation model generalization (TMG), an approach that updates probability feature values for the translation model being used based on the model itself and a set of auxiliary models, aiming to alleviate the over-estimation problem and enhance translation quality in the first-pass decoding phase. We validate our approach for translation models based on auxiliary models built in two different ways. We also introduce novel probability variance features into the log-linear models for further improvements. We conclude that our approach can be developed independently and integrated into the current SMT pipeline directly. We demonstrate BLEU improvements on the NIST Chinese-to-English MT tasks for single-system decodings.

  6. Dynamic Bayesian network modeling for longitudinal brain morphometry

    PubMed Central

    Chen, Rong; Resnick, Susan M; Davatzikos, Christos; Herskovits, Edward H

    2011-01-01

    Identifying interactions among brain regions from structural magnetic-resonance images presents one of the major challenges in computational neuroanatomy. We propose a Bayesian data-mining approach to the detection of longitudinal morphological changes in the human brain. Our method uses a dynamic Bayesian network to represent evolving inter-regional dependencies. The major advantage of dynamic Bayesian network modeling is that it can represent complicated interactions among temporal processes. We validated our approach by analyzing a simulated atrophy study, and found that this approach requires only a small number of samples to detect the ground-truth temporal model. We further applied dynamic Bayesian network modeling to a longitudinal study of normal aging and mild cognitive impairment — the Baltimore Longitudinal Study of Aging. We found that interactions among regional volume-change rates for the mild cognitive impairment group are different from those for the normal-aging group. PMID:21963916

  7. Variational learning and bits-back coding: an information-theoretic view to Bayesian learning.

    PubMed

    Honkela, Antti; Valpola, Harri

    2004-07-01

    The bits-back coding first introduced by Wallace in 1990 and later by Hinton and van Camp in 1993 provides an interesting link between Bayesian learning and information-theoretic minimum-description-length (MDL) learning approaches. The bits-back coding allows interpreting the cost function used in the variational Bayesian method called ensemble learning as a code length in addition to the Bayesian view of misfit of the posterior approximation and a lower bound of model evidence. Combining these two viewpoints provides interesting insights to the learning process and the functions of different parts of the model. In this paper, the problem of variational Bayesian learning of hierarchical latent variable models is used to demonstrate the benefits of the two views. The code-length interpretation provides new views to many parts of the problem such as model comparison and pruning and helps explain many phenomena occurring in learning.

  8. One dimensional models of temperature and composition in the transition zone from a bayesian inversion of surface waves

    NASA Astrophysics Data System (ADS)

    Drilleau, M.; Beucler, E.; Mocquet, A.; Verhoeven, O.; Burgos, G.; Capdeville, Y.; Montagner, J.

    2011-12-01

    The transition zone plays a key role in the dynamics of the Earth's mantle, especially for the exchanges between the upper and the lower mantle. Phase transitions, convective motions, hot upwelling and/or cold downwelling materials may make the 400 to 1000 km depth range very anisotropic and heterogeneous, both thermally and chemically. A classical procedure to infer the thermal state and the composition is to interpret 3D velocity perturbation models in terms of temperature and mineralogical composition, with respect to a global 1D model. However, the strength of heterogeneity and anisotropy can be so high that the concept of a one-dimensional reference seismic model might be called into question for this depth range. Some recent studies prefer to directly invert seismic travel times and normal-mode catalogues in terms of temperature and composition. A Bayesian approach makes it possible to go beyond the classical computation of the best-fit model by providing a quantitative measure of model uncertainty. We implement a non-linear inverse approach (Markov chain Monte Carlo) to interpret seismic data in terms of temperature, anisotropy and composition. Two different data sets are used and compared: surface-wave waveforms and phase velocities (fundamental mode and the first overtones). A guideline of this method is to let the resolution power of the data govern the spatial resolution of the model. Up to now, the model parameters are the temperature field and the mineralogical composition; other important effects, such as macroscopic anisotropy, will be taken into account in the near future. In order to reduce the computing time of the Monte Carlo procedure, polynomial Bézier curves are used for the parameterization. This choice allows for smoothly varying models and first-order discontinuities. Our Bayesian algorithm is tested with standard circular synthetic experiments and with more realistic simulations including 3D wave propagation effects (SEM). 
The test results demonstrate the ability of this approach to match three-component waveforms and address the question of the mean radial interpretation of a 3D model. The method is also tested using real datasets, such as along the Vanuatu-California path.
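
    The Bézier parameterization rests on de Casteljau's algorithm, which evaluates the curve by repeated linear interpolation of the control values; a minimal sketch with illustrative control values follows.

```python
def bezier(control, u):
    """Evaluate a polynomial Bezier curve at u in [0, 1] (de Casteljau)."""
    pts = list(control)
    while len(pts) > 1:
        pts = [(1 - u) * a + u * b for a, b in zip(pts, pts[1:])]
    return pts[0]

# A smoothly varying 1-D profile (e.g. temperature vs. depth) from 4 control
# values; moving one control value deforms the curve smoothly, and stacking two
# independent curves at a shared depth yields a first-order discontinuity.
ctrl = [300.0, 900.0, 1500.0, 1700.0]
print(bezier(ctrl, 0.0), bezier(ctrl, 0.5), bezier(ctrl, 1.0))
```

    The curve interpolates its end control values exactly, which makes first-order discontinuities straightforward to place at segment boundaries.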

  9. Efficient Bayesian parameter estimation with implicit sampling and surrogate modeling for a vadose zone hydrological problem

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Pau, G. S. H.; Finsterle, S.

    2015-12-01

    Parameter inversion involves inferring the model parameter values based on sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources limit the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), of which the coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR) for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model by a ROM directly in a MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. 
We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure for the hydrological problem considered. This work was supported, in part, by the U.S. Dept. of Energy under Contract No. DE-AC02-05CH11231

  10. Bayesian Calibration of Thermodynamic Databases and the Role of Kinetics

    NASA Astrophysics Data System (ADS)

    Wolf, A. S.; Ghiorso, M. S.

    2017-12-01

    Self-consistent thermodynamic databases of geologically relevant materials (like Berman, 1988; Holland and Powell, 1998, Stixrude & Lithgow-Bertelloni 2011) are crucial for simulating geological processes as well as interpreting rock samples from the field. These databases form the backbone of our understanding of how fluids and rocks interact at extreme planetary conditions. Considerable work is involved in their construction from experimental phase reaction data, as they must self-consistently describe the free energy surfaces (including relative offsets) of potentially hundreds of interacting phases. Standard database calibration methods typically utilize either linear programming or least squares regression. While both produce a viable model, they suffer from strong limitations on the training data (which must be filtered by hand), along with general ignorance of many of the sources of experimental uncertainty. We develop a new method for calibrating high P-T thermodynamic databases for use in geologic applications. The model is designed to handle pure solid endmember and free fluid phases and can be extended to include mixed solid solutions and melt phases. This new calibration effort utilizes Bayesian techniques to obtain optimal parameter values together with a full family of statistically acceptable models, summarized by the posterior. Unlike previous efforts, the Bayesian Logistic Uncertain Reaction (BLUR) model directly accounts for both measurement uncertainties and disequilibrium effects, by employing a kinetic reaction model whose parameters are empirically determined from the experiments themselves. Thus, along with the equilibrium free energy surfaces, we also provide rough estimates of the activation energies, entropies, and volumes for each reaction. 
As a first application, we demonstrate this new method on the three-phase aluminosilicate system, illustrating how it can produce superior estimates of the phase boundaries by incorporating constraints from all available data, while automatically handling variable data quality due to a combination of measurement errors and kinetic effects.

  11. Augmented switching linear dynamical system model for gas concentration estimation with MOX sensors in an open sampling system.

    PubMed

    Di Lello, Enrico; Trincavelli, Marco; Bruyninckx, Herman; De Laet, Tinne

    2014-07-11

    In this paper, we introduce a Bayesian time series model approach for gas concentration estimation using Metal Oxide (MOX) sensors in an Open Sampling System (OSS). Our approach focuses on the compensation of the slow response of MOX sensors, while concurrently solving the problem of estimating the gas concentration in an OSS. The proposed Augmented Switching Linear System model allows all the sources of uncertainty arising at each step of the problem to be included in a single coherent probabilistic formulation. In particular, the problem of detecting on-line the current sensor dynamical regime and estimating the underlying gas concentration under environmental disturbances and noisy measurements is formulated and solved as a statistical inference problem. Our model improves on the state of the art, where system modeling approaches had already been introduced but provided only an indirect relative measure proportional to the gas concentration and ignored the problem of modeling uncertainty. Our approach is validated experimentally, and its performance, in terms of the speed and quality of the gas concentration estimation, is compared with that obtained using a photo-ionization detector.

  12. Augmented Switching Linear Dynamical System Model for Gas Concentration Estimation with MOX Sensors in an Open Sampling System

    PubMed Central

    Di Lello, Enrico; Trincavelli, Marco; Bruyninckx, Herman; De Laet, Tinne

    2014-01-01

    In this paper, we introduce a Bayesian time series model approach for gas concentration estimation using Metal Oxide (MOX) sensors in an Open Sampling System (OSS). Our approach focuses on the compensation of the slow response of MOX sensors, while concurrently solving the problem of estimating the gas concentration in an OSS. The proposed Augmented Switching Linear System model allows all the sources of uncertainty arising at each step of the problem to be included in a single coherent probabilistic formulation. In particular, the problem of detecting on-line the current sensor dynamical regime and estimating the underlying gas concentration under environmental disturbances and noisy measurements is formulated and solved as a statistical inference problem. Our model improves on the state of the art, where system modeling approaches had already been introduced but provided only an indirect relative measure proportional to the gas concentration and ignored the problem of modeling uncertainty. Our approach is validated experimentally, and its performance, in terms of the speed and quality of the gas concentration estimation, is compared with that obtained using a photo-ionization detector. PMID:25019637
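
    The slow-response problem can be sketched with a much simpler stand-in for the paper's model: if the sensor is idealised as a known first-order lag (an assumption made here purely for illustration; the paper's switching model is far richer), a step change in concentration appears smeared in the raw reading, and in the noise-free case the lag model can be inverted exactly.

```python
ALPHA = 0.9                                 # lag coefficient of the toy sensor

def sensor_response(conc):
    # first-order low-pass filter: the reading creeps toward the true value
    y, out = 0.0, []
    for c in conc:
        y = ALPHA * y + (1 - ALPHA) * c
        out.append(y)
    return out

def invert(readings):
    # exact inversion of the lag model (noise-free case)
    prev, out = 0.0, []
    for y in readings:
        out.append((y - ALPHA * prev) / (1 - ALPHA))
        prev = y
    return out

true_conc = [0.0] * 5 + [1.0] * 10          # step change in gas concentration
raw = sensor_response(true_conc)
recovered = invert(raw)
print(round(raw[9], 3), round(recovered[9], 3))  # lagged reading vs. recovered step
```

    With measurement noise, the naive inversion amplifies errors, which is why the paper casts compensation as statistical inference rather than direct deconvolution.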

  13. Classifying emotion in Twitter using Bayesian network

    NASA Astrophysics Data System (ADS)

    Surya Asriadie, Muhammad; Syahrul Mubarok, Mohamad; Adiwijaya

    2018-03-01

    Language is used to express not only facts, but also emotions. Emotions are noticeable in behavior and even in the social media statuses written by a person. Analysis of emotions in text is done in a variety of media, such as Twitter. This paper studies classification of emotions on Twitter using a Bayesian network, because of its ability to model uncertainty and relationships between features. The result is two models based on Bayesian networks: the Full Bayesian Network (FBN) and the Bayesian Network with Mood Indicator (BNM). FBN is a massive Bayesian network where each word is treated as a node. The study shows the method used to train FBN is not very effective at producing the best model, and it performs worse than Naive Bayes: the F1-score for FBN is 53.71%, while for Naive Bayes it is 54.07%. BNM is proposed as an alternative method based on an improvement of Multinomial Naive Bayes, with much lower computational complexity than FBN. Even though it is not better than FBN, the resulting model successfully improves the performance of Multinomial Naive Bayes: the F1-score for the Multinomial Naive Bayes model is 51.49%, while for BNM it is 52.14%.
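
    The Multinomial Naive Bayes baseline that both models are compared against can be sketched in a few lines (toy training sentences, not the paper's Twitter data).

```python
import math
from collections import Counter

# A minimal multinomial Naive Bayes classifier with Laplace smoothing -- the
# baseline both FBN and BNM are compared against (toy data for illustration).
train = [
    ("i am so happy and glad today", "joy"),
    ("what a wonderful happy day", "joy"),
    ("i am angry and furious", "anger"),
    ("this makes me so angry", "anger"),
]

counts = {}          # label -> Counter of word frequencies
docs = Counter()     # label -> number of training documents
for text, label in train:
    docs[label] += 1
    counts.setdefault(label, Counter()).update(text.split())

vocab = {w for c in counts.values() for w in c}

def predict(text):
    def score(label):
        c, total = counts[label], sum(counts[label].values())
        prior = math.log(docs[label] / sum(docs.values()))
        return prior + sum(
            math.log((c[w] + 1) / (total + len(vocab)))   # Laplace smoothing
            for w in text.split() if w in vocab
        )
    return max(counts, key=score)

print(predict("happy wonderful day"), predict("so furious"))
```

    FBN replaces the conditional-independence assumption made here with explicit word-to-word dependencies, which is what drives up its computational cost.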

  14. Additive Genetic Variability and the Bayesian Alphabet

    PubMed Central

    Gianola, Daniel; de los Campos, Gustavo; Hill, William G.; Manfredi, Eduardo; Fernando, Rohan

    2009-01-01

    The use of all available molecular markers in statistical models for prediction of quantitative traits has led to what could be termed a genomic-assisted selection paradigm in animal and plant breeding. This article provides a critical review of some theoretical and statistical concepts in the context of genomic-assisted genetic evaluation of animals and crops. First, relationships between the (Bayesian) variance of marker effects in some regression models and additive genetic variance are examined under standard assumptions. Second, the connection between marker genotypes and resemblance between relatives is explored, and linkages between a marker-based model and the infinitesimal model are reviewed. Third, issues associated with the use of Bayesian models for marker-assisted selection, with a focus on the role of the priors, are examined from a theoretical angle. The sensitivity of a Bayesian specification that has been proposed (called “Bayes A”) with respect to priors is illustrated with a simulation. Methods that can solve potential shortcomings of some of these Bayesian regression procedures are discussed briefly. PMID:19620397

  15. Bayesian network modeling applied to coastal geomorphology: lessons learned from a decade of experimentation and application

    NASA Astrophysics Data System (ADS)

    Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.

    2016-12-01

    We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We will provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multi-variate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian networks make predictions that propagate errors, conduct inference via Bayes rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gauge the strength of tests, while applications to coastal resource management are aimed at decision-support, where the probabilities of desired ecosystem outcomes are evaluated. The range of Bayesian-network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We will discuss the progression of our experiments, which has included testing whether the Bayesian-network approach can be implemented and is appropriate for addressing basic and applied scientific problems and evaluating the hindcast and forecast skill of these implementations. We will present and discuss calibration/validation tests that are used to assess the robustness of Bayesian-network models and we will compare these results to tests of other models. 
This will demonstrate how Bayesian networks are used to extract new insights about coastal morphologic behavior, assess impacts to societal and ecological systems, and communicate probabilistic predictions to decision makers.
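
    The inference via Bayes rule that such networks perform reduces, for a single discrete node, to multiplying a prior by a conditional probability table and normalising; a minimal sketch with invented coastal-hazard numbers follows.

```python
# Minimal discrete inference of the kind a coastal Bayesian network performs:
# a hidden state (e.g. "dune survives" vs "dune overwashed") is updated from an
# observed variable. The CPT numbers here are illustrative only.
prior = {"survives": 0.7, "overwash": 0.3}
# P(observed high-water line | state): a discrete conditional probability table
likelihood = {"survives": {"low": 0.8, "high": 0.2},
              "overwash": {"low": 0.1, "high": 0.9}}

def posterior(obs):
    unnorm = {s: prior[s] * likelihood[s][obs] for s in prior}
    z = sum(unnorm.values())           # the evidence (normalising constant)
    return {s: p / z for s, p in unnorm.items()}

post = posterior("high")
print({s: round(p, 3) for s, p in post.items()})
```

    A full network chains many such tables together, which is how it propagates observational error into probabilistic predictions for decision makers.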

  16. A comment on priors for Bayesian occupancy models.

    PubMed

    Northrup, Joseph M; Gerber, Brian D

    2018-01-01

    Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are "uninformative" or "vague", such priors can easily be unintentionally highly informative. Here we report on how the specification of a "vague" normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts.
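
    The effect the authors warn about is easy to reproduce: pushing a "vague" wide normal prior on the logit scale through the inverse-logit transform piles prior mass at occupancy probabilities near 0 and 1. The sd = 10 choice below is illustrative, not a value from the paper.

```python
import math, random

# How a Normal(0, sd=10) prior on a logit-scale intercept looks on the
# probability scale: far from flat, it concentrates near 0 and 1 -- the
# unintentionally informative behaviour described for occupancy models.
random.seed(2)
N = 100_000
probs = [1.0 / (1.0 + math.exp(-random.gauss(0.0, 10.0))) for _ in range(N)]

extreme = sum(1 for p in probs if p < 0.01 or p > 0.99) / N
print(round(extreme, 3))   # a majority of the "uninformative" prior mass is extreme
```

    Plotting a histogram of `probs` makes the U-shape obvious; the sensitivity analyses the authors recommend amount to repeating this check for each candidate prior.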

  17. Tmax Determined Using a Bayesian Estimation Deconvolution Algorithm Applied to Bolus Tracking Perfusion Imaging: A Digital Phantom Validation Study.

    PubMed

    Uwano, Ikuko; Sasaki, Makoto; Kudo, Kohsuke; Boutelier, Timothé; Kameda, Hiroyuki; Mori, Futoshi; Yamashita, Fumio

    2017-01-10

    The Bayesian estimation algorithm improves the precision of bolus tracking perfusion imaging. However, this algorithm cannot directly calculate Tmax, the time scale widely used to identify ischemic penumbra, because Tmax is a non-physiological, artificial index that reflects the tracer arrival delay (TD) and other parameters. We calculated Tmax from the TD and mean transit time (MTT) obtained by the Bayesian algorithm and determined its accuracy in comparison with Tmax obtained by singular value decomposition (SVD) algorithms. The TD and MTT maps were generated by the Bayesian algorithm applied to digital phantoms with time-concentration curves that reflected a range of values for various perfusion metrics using a global arterial input function. Tmax was calculated from the TD and MTT using constants obtained by a linear least-squares fit to Tmax obtained from the two SVD algorithms that showed the best benchmarks in a previous study. Correlations between the Tmax values obtained by the Bayesian and SVD methods were examined. The Bayesian algorithm yielded accurate TD and MTT values relative to the true values of the digital phantom. Tmax calculated from the TD and MTT values with the least-squares fit constants showed excellent correlation (Pearson's correlation coefficient = 0.99) and agreement (intraclass correlation coefficient = 0.99) with Tmax obtained from SVD algorithms. Quantitative analyses of Tmax values calculated from Bayesian-estimation algorithm-derived TD and MTT from a digital phantom correlated and agreed well with Tmax values determined using SVD algorithms.
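The abstract's central computation, expressing Tmax as a linear combination of TD and MTT with constants obtained by least squares against SVD-derived Tmax, can be sketched as follows. All values are synthetic; the constants fitted in the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-voxel perfusion metrics, in seconds (purely illustrative).
td = rng.uniform(0.0, 6.0, size=200)    # tracer arrival delay (TD)
mtt = rng.uniform(2.0, 12.0, size=200)  # mean transit time (MTT)

# Stand-in for SVD-derived Tmax (generated here as TD + 0.5*MTT plus noise).
tmax_svd = td + 0.5 * mtt + rng.normal(0.0, 0.1, size=200)

# Linear least-squares fit of Tmax ~ a*TD + b*MTT, as the abstract describes.
A = np.column_stack([td, mtt])
(a, b), *_ = np.linalg.lstsq(A, tmax_svd, rcond=None)

tmax_bayes = a * td + b * mtt
r = np.corrcoef(tmax_bayes, tmax_svd)[0, 1]
print(f"a = {a:.2f}, b = {b:.2f}, Pearson r = {r:.3f}")
```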

  18. Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef

Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computational expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale sequential data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.

  19. Uncertainty Estimation in Tsunami Initial Condition From Rapid Bayesian Finite Fault Modeling

    NASA Astrophysics Data System (ADS)

    Benavente, R. F.; Dettmer, J.; Cummins, P. R.; Urrutia, A.; Cienfuegos, R.

    2017-12-01

It is well known that kinematic rupture models for a given earthquake can present discrepancies even when similar datasets are employed in the inversion process. While quantifying this variability can be critical when making early estimates of the earthquake and triggered tsunami impact, "most likely models" are normally used for this purpose. In this work, we quantify the uncertainty of the tsunami initial condition for the great Illapel earthquake (Mw = 8.3, 2015, Chile). We focus on utilizing data and inversion methods that are suitable for rapid source characterization yet provide meaningful and robust results. Rupture models from teleseismic body and surface waves as well as W-phase are derived and accompanied by Bayesian uncertainty estimates from linearized inversion under positivity constraints. We show that robust and consistent features of the rupture kinematics appear when working within this probabilistic framework. Moreover, by using static dislocation theory, we translate the probabilistic slip distributions into seafloor deformation, which we interpret as a tsunami initial condition. After considering uncertainty, our probabilistic seafloor deformation models obtained from different data types appear consistent with each other, providing meaningful results. We also show that selecting just a single "representative" solution from the ensemble of initial conditions for tsunami propagation may lead to overestimating information content in the data. Our results suggest that rapid, probabilistic rupture models can play a significant role during emergency response by providing robust information about the extent of the disaster.

  20. Linear models for airborne-laser-scanning-based operational forest inventory with small field sample size and highly correlated LiDAR data

    USGS Publications Warehouse

    Junttila, Virpi; Kauranne, Tuomo; Finley, Andrew O.; Bradford, John B.

    2015-01-01

Modern operational forest inventory often uses remotely sensed data that cover the whole inventory area to produce spatially explicit estimates of forest properties through statistical models. The data obtained by airborne light detection and ranging (LiDAR) correlate well with many forest inventory variables, such as tree height, timber volume, and biomass. To construct an accurate model over thousands of hectares, LiDAR data must be supplemented with several hundred field sample measurements of forest inventory variables. This can be costly and time consuming. Different LiDAR-data-based and spatial-data-based sampling designs can reduce the number of field sample plots needed. However, problems arising from the features of the LiDAR data, such as a large number of predictors compared with the sample size (overfitting) or a strong correlation among predictors (multicollinearity), may decrease the accuracy and precision of the estimates and predictions. To overcome these problems, a Bayesian linear model with the singular value decomposition of predictors, combined with regularization, is proposed. The model performance in predicting different forest inventory variables is verified in ten inventory areas from two continents, where the number of field sample plots is reduced using different sampling designs. The results show that, with an appropriate field plot selection strategy and the proposed linear model, the total relative error of the predicted forest inventory variables is only 5%–15% larger using 50 field sample plots than the error of a linear model estimated with several hundred field sample plots when we sum up the error due to both the model noise variance and the model’s lack of fit.
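As a rough illustration of the modeling idea (not the authors' exact Bayesian formulation), a regularized linear fit computed through the SVD of the predictor matrix copes with many highly correlated LiDAR predictors and few field plots; all data and the regularization strength below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Few field plots, many highly correlated LiDAR-derived predictors (synthetic).
n_plots, n_pred = 50, 200
latent = rng.normal(size=(n_plots, 5))
X = latent @ rng.normal(size=(5, n_pred)) + 0.01 * rng.normal(size=(n_plots, n_pred))
beta = rng.normal(size=n_pred)
y = X @ beta + 0.1 * rng.normal(size=n_plots)   # e.g. plot-level timber volume

# Ridge-type solution via the SVD of X: the small singular values that drive
# multicollinearity are damped by the s / (s^2 + lam) filter.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
lam = 1.0                                        # regularization strength (assumed)
beta_hat = Vt.T @ ((s / (s**2 + lam)) * (U.T @ y))

r = np.corrcoef(X @ beta_hat, y)[0, 1]
print(f"in-sample correlation: {r:.3f}")
```

Working in the SVD basis makes the shrinkage explicit: directions with large singular values pass nearly unchanged, while near-collinear directions are strongly suppressed.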

  1. Detecting temporal change in freshwater fisheries surveys: statistical power and the important linkages between management questions and monitoring objectives

    USGS Publications Warehouse

    Wagner, Tyler; Irwin, Brian J.; James R. Bence,; Daniel B. Hayes,

    2016-01-01

    Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.

  2. Development of an Integrated Team Training Design and Assessment Architecture to Support Adaptability in Healthcare Teams

    DTIC Science & Technology

    2016-10-01

… and implementation of embedded, adaptive feedback and performance assessment. The investigators also initiated work designing a Bayesian Belief … Keywords: trauma teams; team training; teamwork; adaptability; adaptive performance; leadership; simulation; modeling; Bayesian belief networks (BBN).

  3. A Bayesian Network Approach to Modeling Learning Progressions and Task Performance. CRESST Report 776

    ERIC Educational Resources Information Center

    West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.

    2010-01-01

    A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…

  4. Stochastic static fault slip inversion from geodetic data with non-negativity and bounds constraints

    NASA Astrophysics Data System (ADS)

    Nocquet, J.-M.

    2018-04-01

Although the surface displacements observed by geodesy are linear combinations of slip on faults in an elastic medium, determining the spatial distribution of fault slip remains an ill-posed inverse problem. A widely used approach to circumvent the ill-posedness of the inversion is to add regularization constraints in terms of smoothing and/or damping so that the linear system becomes invertible. However, the choice of regularization parameters is often arbitrary, and sometimes leads to significantly different results. Furthermore, the resolution analysis is usually empirical and cannot be made independently of the regularization. The stochastic approach to inverse problems (Tarantola & Valette 1982; Tarantola 2005) provides a rigorous framework where the a priori information about the searched parameters is combined with the observations in order to derive posterior probabilities of the unknown parameters. Here, I investigate an approach where the prior probability density function (pdf) is a multivariate Gaussian function, with single truncation to impose positivity of slip or double truncation to impose positivity and upper bounds on slip for interseismic modeling. I show that the joint posterior pdf is similar to the linear untruncated Gaussian case and can be expressed as a Truncated Multi-Variate Normal (TMVN) distribution. The TMVN form can then be used to obtain semi-analytical formulas for the single, two-dimensional, or n-dimensional marginal pdf. The semi-analytical formula involves the product of a Gaussian by an integral term that can be evaluated using recent developments in TMVN probability calculations (e.g. Genz & Bretz 2009). Posterior mean and covariance can also be efficiently derived. I show that the Maximum Posterior (MAP) can be obtained using a Non-Negative Least-Squares algorithm (Lawson & Hanson 1974) for the single-truncated case or using the Bounded-Variable Least-Squares algorithm (Stark & Parker 1995) for the double-truncated case.
I show that the case of independent uniform priors can be approximated using TMVN. The numerical equivalence to Bayesian inversions using Markov chain Monte Carlo (MCMC) sampling is shown for a synthetic example and a real case of interseismic modeling in Central Peru. The TMVN method overcomes several limitations of the Bayesian approach using MCMC sampling. First, the need for computing power is largely reduced. Second, unlike Bayesian MCMC-based approaches, marginal pdfs, means, variances, or covariances are obtained independently of one another. Third, the probability and cumulative density functions can be obtained with any density of points. Finally, determining the Maximum Posterior (MAP) is extremely fast.
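The MAP step mentioned above (non-negative least squares for the single-truncated case) can be illustrated with a small synthetic slip inversion. A simple cyclic coordinate-descent NNLS stands in here for the Lawson & Hanson active-set algorithm; the Green's functions and data are random stand-ins.

```python
import numpy as np

def nnls_cd(A, b, n_iter=500):
    """Non-negative least squares by cyclic coordinate descent
    (illustrative stand-in for the Lawson & Hanson algorithm)."""
    m, n = A.shape
    x = np.zeros(n)
    col_sq = (A**2).sum(axis=0)
    r = b - A @ x                      # current residual
    for _ in range(n_iter):
        for j in range(n):
            aj = A[:, j]
            # Unconstrained coordinate update, clipped at zero.
            xj_new = max(0.0, x[j] + aj @ r / col_sq[j])
            r += aj * (x[j] - xj_new)  # keep the residual consistent
            x[j] = xj_new
    return x

rng = np.random.default_rng(3)
G = rng.normal(size=(40, 10))                     # synthetic Green's functions
slip_true = np.clip(rng.normal(1.0, 0.5, 10), 0.0, None)
d = G @ slip_true + rng.normal(0.0, 0.01, 40)     # synthetic geodetic data

slip_map = nnls_cd(G, d)
print(f"max abs error: {np.abs(slip_map - slip_true).max():.3f}")
```

The positivity constraint is exactly the single truncation of the prior: the MAP of the truncated-Gaussian posterior coincides with the non-negative least-squares solution.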

  5. Hybrid ICA-Bayesian network approach reveals distinct effective connectivity differences in schizophrenia.

    PubMed

    Kim, D; Burge, J; Lane, T; Pearlson, G D; Kiehl, K A; Calhoun, V D

    2008-10-01

    We utilized a discrete dynamic Bayesian network (dDBN) approach (Burge, J., Lane, T., Link, H., Qiu, S., Clark, V.P., 2007. Discrete dynamic Bayesian network analysis of fMRI data. Hum Brain Mapp.) to determine differences in brain regions between patients with schizophrenia and healthy controls on a measure of effective connectivity, termed the approximate conditional likelihood score (ACL) (Burge, J., Lane, T., 2005. Learning Class-Discriminative Dynamic Bayesian Networks. Proceedings of the International Conference on Machine Learning, Bonn, Germany, pp. 97-104.). The ACL score represents a class-discriminative measure of effective connectivity by measuring the relative likelihood of the correlation between brain regions in one group versus another. The algorithm is capable of finding non-linear relationships between brain regions because it uses discrete rather than continuous values and attempts to model temporal relationships with a first-order Markov and stationary assumption constraint (Papoulis, A., 1991. Probability, random variables, and stochastic processes. McGraw-Hill, New York.). Since Bayesian networks are overly sensitive to noisy data, we introduced an independent component analysis (ICA) filtering approach that attempted to reduce the noise found in fMRI data by unmixing the raw datasets into a set of independent spatial component maps. Components that represented noise were removed and the remaining components reconstructed into the dimensions of the original fMRI datasets. We applied the dDBN algorithm to a group of 35 patients with schizophrenia and 35 matched healthy controls using an ICA filtered and unfiltered approach. We determined that filtering the data significantly improved the magnitude of the ACL score. Patients showed the greatest ACL scores in several regions, most markedly the cerebellar vermis and hemispheres. 
Our findings suggest that schizophrenia patients exhibit weaker connectivity than healthy controls in multiple regions, including bilateral temporal, frontal, and cerebellar regions during an auditory paradigm.

  6. Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model.

    PubMed

    Dong, Linsong; Wang, Zhiyong

    2018-06-11

    Genomic prediction is feasible for estimating genomic breeding values because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods propose more flexible assumptions for the distributions of SNP effects. However, most Bayesian methods are performed based on Markov chain Monte Carlo (MCMC) algorithms, leading to computational efficiency challenges. Hence, some fast Bayesian approaches, such as fast BayesB (fBayesB), were proposed to speed up the calculation. This study proposed another fast Bayesian method termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that a SNP with probability γ has a non-zero effect which comes from a normal density with a common variance. The simulated data from QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. The results showed that when γ was set as a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC could yield very similar predictive abilities as BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB could also yield similar results as fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, but a similar situation was not observed for fBayesC. Moreover, the computational speed of fBayesC was significantly faster than that of BayesC, making fBayesC a promising method for genomic prediction.
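The fBayesC prior described above (a SNP has a non-zero effect with probability γ, drawn from a normal density with common variance) is easy to state in code; the γ and variance values below are illustrative, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(4)

gamma, sigma = 0.01, 0.2     # mixing probability and slab s.d. (illustrative)
n_snp = 100_000

# BayesC-type spike-and-slab prior: with probability gamma a SNP carries a
# N(0, sigma^2) effect, otherwise its effect is exactly zero.
has_effect = rng.random(n_snp) < gamma
effects = np.where(has_effect, rng.normal(0.0, sigma, size=n_snp), 0.0)

print(f"fraction non-zero: {has_effect.mean():.4f}")
```

Small γ encodes the assumption that only a few markers tag causal variants, which is why the abstract's results are sensitive to how small γ is set.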

  7. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.

  8. BN-FLEMOps pluvial - A probabilistic multi-variable loss estimation model for pluvial floods

    NASA Astrophysics Data System (ADS)

    Roezer, V.; Kreibich, H.; Schroeter, K.; Doss-Gollin, J.; Lall, U.; Merz, B.

    2017-12-01

Pluvial flood events, such as in Copenhagen (Denmark) in 2011, Beijing (China) in 2012 or Houston (USA) in 2016, have caused severe losses to urban dwellings in recent years. These floods are caused by storm events with high rainfall rates well above the design levels of urban drainage systems, which lead to inundation of streets and buildings. A projected increase in frequency and intensity of heavy rainfall events in many areas and ongoing urbanization may increase pluvial flood losses in the future. For an efficient risk assessment and adaptation to pluvial floods, a quantification of the flood risk is needed. Few loss models have been developed particularly for pluvial floods. These models usually use simple water-level- or rainfall-loss functions and come with very high uncertainties. To account for these uncertainties and improve the loss estimation, we present a probabilistic multi-variable loss estimation model for pluvial floods based on empirical data. The model was developed in a two-step process using a machine learning approach and a comprehensive database comprising 783 records of direct building and content damage of private households. The data was gathered through surveys after four different pluvial flood events in Germany between 2005 and 2014. In a first step, linear and non-linear machine learning algorithms, such as tree-based and penalized regression models, were used to identify the most important loss influencing factors among a set of 55 candidate variables. These variables comprise hydrological and hydraulic aspects, early warning, precaution, building characteristics and the socio-economic status of the household. In a second step, the most important loss influencing variables were used to derive a probabilistic multi-variable pluvial flood loss estimation model based on Bayesian Networks. Two different networks were tested: a score-based network learned from the data and a network based on expert knowledge.
Loss predictions are made through Bayesian inference using Markov chain Monte Carlo (MCMC) sampling. With the ability to cope with incomplete information and use expert knowledge, as well as inherently providing quantitative uncertainty information, it is shown that loss models based on BNs are superior to deterministic approaches for pluvial flood risk assessment.

  9. Development of dynamic Bayesian models for web application test management

    NASA Astrophysics Data System (ADS)

    Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.

    2018-03-01

    The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.

  10. Trans-dimensional and hierarchical Bayesian approaches toward rigorous estimation of seismic sources and structures in the Northeast Asia

    NASA Astrophysics Data System (ADS)

    Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean

    2016-04-01

    A framework is presented within which we provide rigorous estimations for seismic sources and structures in the Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.

  11. The Type Ia Supernova Color-Magnitude Relation and Host Galaxy Dust: A Simple Hierarchical Bayesian Model

    NASA Astrophysics Data System (ADS)

    Mandel, Kaisey; Scolnic, Daniel; Shariff, Hikmatali; Foley, Ryan; Kirshner, Robert

    2017-01-01

    Inferring peak optical absolute magnitudes of Type Ia supernovae (SN Ia) from distance-independent measures such as their light curve shapes and colors underpins the evidence for cosmic acceleration. SN Ia with broader, slower declining optical light curves are more luminous (“broader-brighter”) and those with redder colors are dimmer. But the “redder-dimmer” color-luminosity relation widely used in cosmological SN Ia analyses confounds its two separate physical origins. An intrinsic correlation arises from the physics of exploding white dwarfs, while interstellar dust in the host galaxy also makes SN Ia appear dimmer and redder. Conventional SN Ia cosmology analyses currently use a simplistic linear regression of magnitude versus color and light curve shape, which does not model intrinsic SN Ia variations and host galaxy dust as physically distinct effects, resulting in low color-magnitude slopes. We construct a probabilistic generative model for the dusty distribution of extinguished absolute magnitudes and apparent colors as the convolution of an intrinsic SN Ia color-magnitude distribution and a host galaxy dust reddening-extinction distribution. If the intrinsic color-magnitude (MB vs. B-V) slope βint differs from the host galaxy dust law RB, this convolution results in a specific curve of mean extinguished absolute magnitude vs. apparent color. The derivative of this curve smoothly transitions from βint in the blue tail to RB in the red tail of the apparent color distribution. The conventional linear fit approximates this effective curve near the average apparent color, resulting in an apparent slope βapp between βint and RB. We incorporate these effects into a hierarchical Bayesian statistical model for SN Ia light curve measurements, and analyze a dataset of SALT2 optical light curve fits of 277 nearby SN Ia at z < 0.10. The conventional linear fit obtains βapp ≈ 3. 
Our model finds βint = 2.2 ± 0.3 and a distinct dust law RB = 3.7 ± 0.3, consistent with the average for Milky Way dust, while correcting a systematic distance bias of ~0.10 mag in the tails of the apparent color distribution. This research is supported by NSF grants AST-156854, AST-1211196, and NASA grant NNX15AJ55G.

  12. Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery

    PubMed Central

    Ekins, Sean; Reynolds, Robert C.; Kim, Hiyun; Koo, Mi-Sun; Ekonomidis, Marilyn; Talaue, Meliza; Paget, Steve D.; Woolhiser, Lisa K.; Lenaerts, Anne J.; Bunin, Barry A.; Connell, Nancy; Freundlich, Joel S.

    2013-01-01

Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data to experimentally validate a virtual screening approach employing Bayesian models built with bioactivity information (single-event model) as well as bioactivity and cytotoxicity information (dual-event model). We virtually screen a commercial library and experimentally confirm actives with hit rates exceeding typical HTS results by 1-2 orders of magnitude. The first dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery. PMID:23521795

  13. Bayesian data analysis for newcomers.

    PubMed

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.

  14. Evaluation of calibration efficacy under different levels of uncertainty

    DOE PAGES

    Heo, Yeonsook; Graziano, Diane J.; Guzowski, Leah; ...

    2014-06-10

This study examines how calibration performs under different levels of uncertainty in model input data. It specifically assesses the efficacy of Bayesian calibration to enhance the reliability of EnergyPlus model predictions. A Bayesian approach can be used to update uncertain values of parameters, given measured energy-use data, and to quantify the associated uncertainty. We assess the efficacy of Bayesian calibration under a controlled virtual-reality setup, which enables rigorous validation of the accuracy of calibration results in terms of both calibrated parameter values and model predictions. Case studies demonstrate the performance of Bayesian calibration of base models developed from audit data with differing levels of detail in building design, usage, and operation.

  15. Quantifying and Reducing Curve-Fitting Uncertainty in Isc

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campanelli, Mark; Duck, Benjamin; Emery, Keith

    2015-06-14

Current-voltage (I-V) curve measurements of photovoltaic (PV) devices are used to determine performance parameters and to establish traceable calibration chains. Measurement standards specify localized curve-fitting methods, e.g., straight-line interpolation/extrapolation of the I-V curve points near the short-circuit current, Isc. By considering such fits as statistical linear regressions, uncertainties in the performance parameters are readily quantified. However, the legitimacy of such a computed uncertainty requires that the model be a valid (local) representation of the I-V curve and that the noise be sufficiently well characterized. Using more data points often has the advantage of lowering the uncertainty. However, more data points can make the uncertainty in the fit arbitrarily small, and this fit uncertainty misses the dominant residual uncertainty due to so-called model discrepancy. Using objective Bayesian linear regression for straight-line fits for Isc, we investigate an evidence-based method to automatically choose data windows of I-V points with reduced model discrepancy. We also investigate noise effects. Uncertainties, aligned with the Guide to the Expression of Uncertainty in Measurement (GUM), are quantified throughout.
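A minimal sketch of the localized fitting this abstract starts from: a straight-line least-squares fit of synthetic I-V points near short circuit, extrapolated to V = 0 to estimate Isc. The data window selection via Bayesian evidence, which is the paper's contribution, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic I-V points near short circuit: nearly linear, small measurement noise.
v = np.linspace(0.0, 0.1, 20)                             # voltage (V)
i_meas = 9.0 - 2.0 * v + rng.normal(0.0, 0.002, v.size)   # current (A)

# Straight-line regression I = c0 + c1*V; the intercept estimates Isc.
A = np.column_stack([np.ones_like(v), v])
(c0, c1), *_ = np.linalg.lstsq(A, i_meas, rcond=None)
isc_est = c0

print(f"estimated Isc = {isc_est:.4f} A, slope = {c1:.3f} A/V")
```

Treating the fit as a statistical regression, as the abstract notes, lets the intercept's standard error serve as a quantified Isc uncertainty, provided the linear model is locally valid.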

  16. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    NASA Astrophysics Data System (ADS)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach to plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction, capturing nonlinear effects in the data. This study implemented a DBN to develop a GP model utilizing whole-genome Single Nucleotide Polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquired from CIMMYT’s (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, DBN outperformed the other methods, reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), in the case of allegedly non-additive traits, achieving a correlation of 0.579 on the -1 to 1 scale.

  17. Quantifying and Reducing Curve-Fitting Uncertainty in Isc: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campanelli, Mark; Duck, Benjamin; Emery, Keith

    Current-voltage (I-V) curve measurements of photovoltaic (PV) devices are used to determine performance parameters and to establish traceable calibration chains. Measurement standards specify localized curve fitting methods, e.g., straight-line interpolation/extrapolation of the I-V curve points near short-circuit current, Isc. By considering such fits as statistical linear regressions, uncertainties in the performance parameters are readily quantified. However, the legitimacy of such a computed uncertainty requires that the model be a valid (local) representation of the I-V curve and that the noise be sufficiently well characterized. Using more data points often has the advantage of lowering the uncertainty. However, more data points can make the uncertainty in the fit arbitrarily small, and this fit uncertainty misses the dominant residual uncertainty due to so-called model discrepancy. Using objective Bayesian linear regression for straight-line fits for Isc, we investigate an evidence-based method to automatically choose data windows of I-V points with reduced model discrepancy. We also investigate noise effects. Uncertainties, aligned with the Guide to the Expression of Uncertainty in Measurement (GUM), are quantified throughout.
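The window-selection idea, choosing the straight-line fit window by Bayesian model evidence, can be sketched in closed form for a linear model with a Gaussian prior and known noise. This is an illustrative sketch with synthetic I-V points and hypothetical numbers, not the authors' implementation (which uses objective priors):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic I-V points near short circuit: linear near V=0, curving away from it.
V = np.linspace(0.0, 0.5, 26)
I_true = 8.0 - 0.5 * V - 2.0 * V**3          # the cubic term is "model discrepancy" for a line
I_meas = I_true + rng.normal(0.0, 0.005, V.size)

def log_evidence(x, y, sigma=0.005, tau=10.0):
    """Log marginal likelihood of a straight-line model y = a + b*x,
    with Gaussian prior N(0, tau^2) on (a, b) and known noise sigma."""
    X = np.column_stack([np.ones_like(x), x])
    C = sigma**2 * np.eye(len(x)) + tau**2 * (X @ X.T)   # marginal covariance of y
    sign, logdet = np.linalg.slogdet(2 * np.pi * C)
    return -0.5 * (logdet + y @ np.linalg.solve(C, y))

# Score nested windows [V_0, V_k]; average per point so window sizes are comparable.
scores = {k: log_evidence(V[:k], I_meas[:k]) / k for k in range(5, len(V) + 1)}
best_k = max(scores, key=scores.get)
```

In this synthetic example the per-point evidence penalizes windows that reach into the curved region, so the widest window is not selected.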

  18. EFFICIENT MODEL-FITTING AND MODEL-COMPARISON FOR HIGH-DIMENSIONAL BAYESIAN GEOSTATISTICAL MODELS. (R826887)

    EPA Science Inventory

    Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...

  19. Bivariate Left-Censored Bayesian Model for Predicting Exposure: Preliminary Analysis of Worker Exposure during the Deepwater Horizon Oil Spill.

    PubMed

    Groth, Caroline; Banerjee, Sudipto; Ramachandran, Gurumurthy; Stenzel, Mark R; Sandler, Dale P; Blair, Aaron; Engel, Lawrence S; Kwok, Richard K; Stewart, Patricia A

    2017-01-01

    In April 2010, the Deepwater Horizon oil rig caught fire and exploded, releasing almost 5 million barrels of oil into the Gulf of Mexico over the ensuing 3 months. Thousands of oil spill workers participated in the spill response and clean-up efforts. The GuLF STUDY being conducted by the National Institute of Environmental Health Sciences is an epidemiological study to investigate potential adverse health effects among these oil spill clean-up workers. Many volatile chemicals were released from the oil into the air, including total hydrocarbons (THC), which is a composite of the volatile components of oil including benzene, toluene, ethylbenzene, xylene, and hexane (BTEXH). Our goal is to estimate exposure levels to these toxic chemicals for groups of oil spill workers in the study (hereafter called exposure groups, EGs) with likely comparable exposure distributions. A large number of air measurements were collected, but many EGs are characterized by datasets with a large percentage of censored measurements (below the analytic methods' limits of detection) and/or a limited number of measurements. We use THC, for which there was less censoring, to develop predictive linear models for specific BTEXH air exposures with higher degrees of censoring. We present a novel Bayesian hierarchical linear model that allows us to predict, for different EGs simultaneously, exposure levels of a second chemical while accounting for censoring in both THC and the chemical of interest. We illustrate the methodology by estimating exposure levels for several EGs on the Development Driller III, a rig vessel charged with drilling one of the relief wells. The model provided credible estimates in this example for geometric means, arithmetic means, variances, correlations, and regression coefficients for each group. This approach should be considered when estimating exposures in situations when multiple chemicals are correlated and have varying degrees of censoring. © The Author 2017. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
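The censored-likelihood idea can be illustrated in a much simpler single-chemical form: a linear regression fitted by maximum likelihood, where detected values contribute a density term and values below the limit of detection contribute a cumulative-probability term. This sketch uses simulated data and is not the paper's Bayesian hierarchical bivariate model:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)

# Hypothetical setup: log-THC (x) predicts a log-scale co-pollutant (y),
# and y is left-censored at a limit of detection (LOD).
n = 200
x = rng.normal(1.0, 0.8, n)
y_full = 0.5 + 0.9 * x + rng.normal(0.0, 0.4, n)
lod = 0.8
censored = y_full < lod
y_obs = np.where(censored, lod, y_full)      # censored values only known to be below LOD

def neg_loglik(theta):
    a, b, log_s = theta
    s = np.exp(log_s)                        # log-parameterized so s stays positive
    mu = a + b * x
    ll = np.where(censored,
                  stats.norm.logcdf(y_obs, mu, s),   # P(Y < LOD) for censored points
                  stats.norm.logpdf(y_obs, mu, s))   # density for detected points
    return -ll.sum()

fit = optimize.minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead")
a_hat, b_hat = fit.x[0], fit.x[1]
```

Treating censored values as if observed at the LOD (a common shortcut) would bias both coefficients; the censored likelihood recovers them consistently.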

  20. Universal Darwinism As a Process of Bayesian Inference.

    PubMed

    Campbell, John O

    2016-01-01

    Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
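The core equivalence, selection as a Bayesian update, can be written in a few lines: treating population frequencies as a prior and relative fitness as a likelihood, one generation of selection is posterior ∝ prior × likelihood. The numbers below are arbitrary:

```python
import numpy as np

# Three competing variants with frequencies ("prior") and fitnesses ("likelihood").
freqs = np.array([0.5, 0.3, 0.2])     # prior: current population frequencies
fitness = np.array([1.0, 1.5, 3.0])   # likelihood: expected offspring per individual

# One generation of selection is exactly a Bayesian update:
# posterior ∝ prior × likelihood (the discrete replicator equation).
posterior = freqs * fitness / (freqs * fitness).sum()
```

The fittest variant gains frequency in proportion to how strongly the "evidence" (differential reproduction) favors it, which is the sense in which natural selection accumulates evidence-based knowledge.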

  1. Universal Darwinism As a Process of Bayesian Inference

    PubMed Central

    Campbell, John O.

    2016-01-01

    Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an “experiment” in the external world environment, and the results of that “experiment” or the “surprise” entailed by predicted and actual outcomes of the “experiment.” Minimization of free energy implies that the implicit measure of “surprise” experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438

  2. A Bayesian hierarchical diffusion model decomposition of performance in Approach–Avoidance Tasks

    PubMed Central

    Krypotos, Angelos-Miltiadis; Beckers, Tom; Kindt, Merel; Wagenmakers, Eric-Jan

    2015-01-01

    Common methods for analysing response time (RT) tasks, frequently used across different disciplines of psychology, suffer from a number of limitations such as the failure to directly measure the underlying latent processes of interest and the inability to take into account the uncertainty associated with each individual's point estimate of performance. Here, we discuss a Bayesian hierarchical diffusion model and apply it to RT data. This model allows researchers to decompose performance into meaningful psychological processes and to account optimally for individual differences and commonalities, even with relatively sparse data. We highlight the advantages of the Bayesian hierarchical diffusion model decomposition by applying it to performance on Approach–Avoidance Tasks, widely used in the emotion and psychopathology literature. Model fits for two experimental data-sets demonstrate that the model performs well. The Bayesian hierarchical diffusion model overcomes important limitations of current analysis procedures and provides deeper insight in latent psychological processes of interest. PMID:25491372

  3. Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

    PubMed

    Shen, Yanna; Cooper, Gregory F

    2012-09-01

    This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
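The known-versus-unknown-cause idea can be caricatured with a single evidential feature: known causes get informative priors and sharp likelihoods, while the "unknown" cause gets a small prior and a deliberately broad (relatively non-informative) likelihood. All numbers below are invented for illustration:

```python
# Hypothetical posterior over outbreak causes given one observed feature
# (say, an elevated respiratory-symptom rate in the population).
priors = {"influenza": 0.05, "anthrax": 0.0001, "unknown": 0.001, "no_outbreak": 0.9489}

# Likelihood of the observed feature under each cause; the "unknown" cause is
# assigned a broad, weakly informative likelihood rather than a sharp one.
likelihood = {"influenza": 0.30, "anthrax": 0.02, "unknown": 0.10, "no_outbreak": 0.001}

joint = {c: priors[c] * likelihood[c] for c in priors}
z = sum(joint.values())
posterior = {c: joint[c] / z for c in joint}
```

With evidence no known cause explains well, the "unknown" hypothesis can dominate despite its small prior; here the feature fits influenza, so influenza wins.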

  4. A Bayesian Approach to Surrogacy Assessment Using Principal Stratification in Clinical Trials

    PubMed Central

    Li, Yun; Taylor, Jeremy M.G.; Elliott, Michael R.

    2011-01-01

    Summary A surrogate marker (S) is a variable that can be measured earlier and often easier than the true endpoint (T) in a clinical trial. Most previous research has been devoted to developing surrogacy measures to quantify how well S can replace T or examining the use of S in predicting the effect of a treatment (Z). However, the research often requires one to fit models for the distribution of T given S and Z. It is well known that such models do not have causal interpretations because the models condition on a post-randomization variable S. In this paper, we directly model the relationship among T, S and Z using a potential outcomes framework introduced by Frangakis and Rubin (2002). We propose a Bayesian estimation method to evaluate the causal probabilities associated with the cross-classification of the potential outcomes of S and T when S and T are both binary. We use a log-linear model to directly model the association between the potential outcomes of S and T through the odds ratios. The quantities derived from this approach always have causal interpretations. However, this causal model is not identifiable from the data without additional assumptions. To reduce the non-identifiability problem and increase the precision of statistical inferences, we assume monotonicity and incorporate prior belief that is plausible in the surrogate context by using prior distributions. We also explore the relationship among the surrogacy measures based on traditional models and this counterfactual model. The method is applied to the data from a glaucoma treatment study. PMID:19673864

  5. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE PAGES

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; ...

    2017-10-17

    In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated: reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.

  6. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan

    In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated: reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
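A multi-chain MCMC workflow of the kind described, several chains launched from over-dispersed starting points plus a convergence check, can be sketched with a plain random-walk Metropolis sampler on a stand-in target; this is not the DREAM/Adaptive Metropolis hybrid the paper uses:

```python
import numpy as np

rng = np.random.default_rng(3)

def log_post(theta):
    return -0.5 * theta**2            # standard-normal target, a stand-in posterior

def metropolis_chain(start, n_steps=4000, step=1.0):
    chain = np.empty(n_steps)
    theta, lp = start, log_post(start)
    for i in range(n_steps):
        prop = theta + step * rng.normal()
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis accept/reject
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

# Several chains from over-dispersed starts, as in multi-chain MCMC; drop burn-in.
chains = np.array([metropolis_chain(s)[2000:] for s in (-5.0, 0.0, 5.0)])

# Gelman-Rubin convergence diagnostic: compare between- and within-chain variance.
m, n = chains.shape
W = chains.var(axis=1, ddof=1).mean()        # within-chain variance
B = n * chains.mean(axis=1).var(ddof=1)      # between-chain variance
r_hat = np.sqrt(((n - 1) / n * W + B / n) / W)
```

An r_hat near 1 indicates the chains have mixed; independent chains are also what makes this family of samplers embarrassingly parallel, hence the near-linear scalability noted in the abstract.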

  7. On combination of strict Bayesian principles with model reduction technique or how stochastic model calibration can become feasible for large-scale applications

    NASA Astrophysics Data System (ADS)

    Oladyshkin, S.; Schroeder, P.; Class, H.; Nowak, W.

    2013-12-01

    Predicting underground carbon dioxide (CO2) storage represents a challenging problem in a complex dynamic system. Due to lacking information about reservoir parameters, quantification of uncertainties may become the dominant question in risk assessment. Calibration on past observed data from pilot-scale test injection can improve the predictive power of the involved geological, flow, and transport models. The current work performs history matching to pressure time series from a pilot storage site operated in Europe, maintained during an injection period. Simulation of compressible two-phase flow and transport (CO2/brine) in the considered site is computationally very demanding, requiring about 12 days of CPU time for an individual model run. For that reason, brute-force approaches for calibration are not feasible. In the current work, we explore an advanced framework for history matching based on the arbitrary polynomial chaos expansion (aPC) and strict Bayesian principles. The aPC [1] offers a drastic but accurate stochastic model reduction. Unlike many previous chaos expansions, it can handle arbitrary probability distribution shapes of uncertain parameters, and can therefore directly handle the statistical information appearing during the matching procedure. In our study we keep the spatial heterogeneity suggested by geophysical methods, but consider uncertainty in the magnitude of permeability through zone-wise permeability multipliers. We capture the dependence of model output on these multipliers with the expansion-based reduced model. Next, we combined the aPC with Bootstrap filtering (a brute-force but fully accurate Bayesian updating mechanism) in order to perform the matching. In comparison to (Ensemble) Kalman Filters, our method accounts for higher-order statistical moments and for the non-linearity of both the forward model and the inversion, and thus allows a rigorous quantification of calibrated model uncertainty. The usually high computational costs of accurate filtering become very feasible for our suggested aPC-based calibration framework. However, the power of aPC-based Bayesian updating strongly depends on the accuracy of prior information. In the current study, the prior assumptions on the model parameters were not satisfactory and strongly underestimated the reservoir pressure. Thus, the aPC-based response surface used in Bootstrap filtering is fitted to a distant and poorly chosen region within the parameter space. Thanks to the iterative procedure suggested in [2], we overcome this drawback at small computational cost. The iteration successively improves the accuracy of the expansion around the current estimate of the posterior distribution. The final result is a calibrated model of the site that can be used for further studies, with an excellent match to the data. References [1] Oladyshkin S. and Nowak W. Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliability Engineering and System Safety, 106:179-190, 2012. [2] Oladyshkin S., Class H., Nowak W. Bayesian updating via Bootstrap filtering combined with data-driven polynomial chaos expansions: methodology and application to history matching for carbon dioxide storage in geological formations. Computational Geosciences, 17 (4), 671-687, 2013.
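Bootstrap filtering in its simplest form, draw from the prior, weight by the data likelihood, resample, can be sketched on a toy one-parameter calibration; the forward model, prior range, and noise level below are invented stand-ins for the expensive reservoir simulator (or its aPC surrogate):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical calibration: one permeability multiplier k, with an "observed"
# pressure d = g(k) + noise. In practice g would be the reduced (aPC) model.
def forward(k):
    return 10.0 / k                       # toy stand-in: pressure falls as permeability rises

k_true, sigma = 2.0, 0.2
d_obs = forward(k_true) + 0.05            # one noisy pressure observation

# Bootstrap filtering: sample the prior, weight by the data likelihood, resample.
prior_samples = rng.uniform(0.5, 5.0, 20000)
log_w = -0.5 * ((d_obs - forward(prior_samples)) / sigma) ** 2   # Gaussian likelihood
w = np.exp(log_w - log_w.max())
w /= w.sum()
posterior_samples = rng.choice(prior_samples, size=5000, replace=True, p=w)
```

The resampled particles concentrate near the data-consistent parameter value, and their spread is the calibrated model uncertainty; the cost is entirely in the forward-model evaluations, which is why a surrogate is essential.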

  8. Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models

    ERIC Educational Resources Information Center

    Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum

    2011-01-01

    Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…

  9. Scale Mixture Models with Applications to Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

    2003-11-01

    Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.

  10. Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis

    ERIC Educational Resources Information Center

    Ansari, Asim; Iyengar, Raghuram

    2006-01-01

    We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…

  11. Next Steps in Bayesian Structural Equation Models: Comments on, Variations of, and Extensions to Muthen and Asparouhov (2012)

    ERIC Educational Resources Information Center

    Rindskopf, David

    2012-01-01

    Muthen and Asparouhov (2012) made a strong case for the advantages of Bayesian methodology in factor analysis and structural equation models. I show additional extensions and adaptations of their methods and show how non-Bayesians can take advantage of many (though not all) of these advantages by using interval restrictions on parameters. By…

  12. Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

    ERIC Educational Resources Information Center

    Marcoulides, Katerina M.

    2018-01-01

    This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…

  13. A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.

    ERIC Educational Resources Information Center

    Glas, Cees A. W.; Meijer, Rob R.

    A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…

  14. A Tutorial Introduction to Bayesian Models of Cognitive Development

    ERIC Educational Resources Information Center

    Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei

    2011-01-01

    We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…

  15. Modeling of Academic Achievement of Primary School Students in Ethiopia Using Bayesian Multilevel Approach

    ERIC Educational Resources Information Center

    Sebro, Negusse Yohannes; Goshu, Ayele Taye

    2017-01-01

    This study aims to explore Bayesian multilevel modeling to investigate variations of average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. Bayesian method is used to estimate the fixed and random effects. Input and…

  16. A Simulation Study Comparison of Bayesian Estimation with Conventional Methods for Estimating Unknown Change Points

    ERIC Educational Resources Information Center

    Wang, Lijuan; McArdle, John J.

    2008-01-01

    The main purpose of this research is to evaluate the performance of a Bayesian approach for estimating unknown change points using Monte Carlo simulations. The univariate and bivariate unknown change point mixed models were presented and the basic idea of the Bayesian approach for estimating the models was discussed. The performance of Bayesian…

  17. Estimation of parameter uncertainty for an activated sludge model using Bayesian inference: a comparison with the frequentist method.

    PubMed

    Zonta, Zivko J; Flotats, Xavier; Magrí, Albert

    2014-08-01

    The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
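The comparison the abstract draws can be reproduced on a toy one-parameter kinetic model: a frequentist curvature-based standard error versus the standard deviation of a gridded Bayesian posterior under a flat prior. The model, data, and noise level below are hypothetical, not an actual ASM:

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(7)

# Toy kinetic data with one rate parameter k (a stand-in for a respirometric fit).
t = np.linspace(0.0, 10.0, 40)
sigma = 0.1
y = 5.0 * np.exp(-0.3 * t) + rng.normal(0.0, sigma, t.size)

def model(k):
    return 5.0 * np.exp(-k * t)

# Frequentist: optimal value plus covariance-style uncertainty from local curvature.
sse = lambda k: ((y - model(k)) ** 2).sum()
k_hat = optimize.minimize_scalar(sse, bounds=(0.01, 1.0), method="bounded").x
J = -5.0 * t * np.exp(-k_hat * t)             # sensitivity dmodel/dk at the optimum
se_freq = sigma / np.sqrt((J ** 2).sum())

# Bayesian: treat k as a probability distribution; flat prior on a grid.
k_grid = np.linspace(0.2, 0.4, 2001)
log_post = np.array([-0.5 * sse(k) / sigma**2 for k in k_grid])
post = np.exp(log_post - log_post.max())
post /= post.sum()
k_mean = (k_grid * post).sum()
sd_bayes = np.sqrt(((k_grid - k_mean) ** 2 * post).sum())
```

With abundant data and a nearly Gaussian posterior the two uncertainty estimates agree, which mirrors the abstract's point that the frequentist result is recoverable as a special case of the Bayesian one.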

  18. Bayesian estimation of differential transcript usage from RNA-seq data.

    PubMed

    Papastamoulis, Panagiotis; Rattray, Magnus

    2017-11-27

    Next-generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU), which targets changes in the relative within-gene expression of a transcript. The contribution of this paper is to: (a) extend cjBitSeq, a previously introduced Bayesian model originally designed for identifying changes in overall expression levels, to the DTU context and (b) propose a Bayesian version of DRIMSeq, a frequentist model for inferring DTU. cjBitSeq is a read-based model and performs fully Bayesian inference by MCMC sampling on the space of latent states of each transcript per gene. BayesDRIMSeq is a count-based model and estimates the Bayes Factor of a DTU model against a null model using Laplace's approximation. The proposed models are benchmarked against the existing ones using a recent independent simulation study as well as a real RNA-seq dataset. Our results suggest that the Bayesian methods exhibit similar performance to DRIMSeq in terms of precision/recall but offer better calibration of the False Discovery Rate.
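Laplace's approximation to a marginal likelihood, the device BayesDRIMSeq uses for its Bayes factors, can be sketched on a conjugate Gaussian example where the approximation happens to be exact. The data and priors are illustrative, not the RNA-seq model:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(5)
y = rng.normal(1.0, 1.0, 30)           # data with a real mean shift
n, s = y.size, 1.0                     # known unit noise for simplicity

# M0: mu = 0, no free parameters, so the marginal likelihood is just the likelihood.
log_ml0 = stats.norm.logpdf(y, 0.0, s).sum()

# M1: mu free with prior N(0, 1). Laplace approximation to the marginal likelihood:
#   log p(y) ~ log p(y|mu*) + log p(mu*) + 0.5*log(2*pi) - 0.5*log(h),
# where mu* maximizes the unnormalized posterior and h is its negative Hessian.
def neg_log_joint(mu):
    return -(stats.norm.logpdf(y, mu, s).sum() + stats.norm.logpdf(mu, 0.0, 1.0))

res = optimize.minimize_scalar(neg_log_joint)
mu_star = res.x
h = n / s**2 + 1.0                     # exact negative Hessian for this Gaussian joint
log_ml1 = -res.fun + 0.5 * np.log(2 * np.pi) - 0.5 * np.log(h)

log_bf10 = log_ml1 - log_ml0           # log Bayes factor of M1 over M0
```

Because the joint here is Gaussian in mu, the Laplace value matches the closed-form marginal likelihood exactly; for non-Gaussian posteriors it is only a second-order approximation.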

  19. Development of uncertainty-based work injury model using Bayesian structural equation modelling.

    PubMed

    Chatterjee, Snehamoy

    2014-01-01

    This paper proposed a Bayesian method-based structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distribution for factor loading parameters and structural parameters of SEM. In the first approach, the prior distributions were considered as a fixed distribution function with specific parameter values, whereas, in the second approach, prior distributions of the parameters were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayesian rule. The Markov Chain Monte Carlo sampling in the form Gibbs sampling was applied for sampling from the posterior distribution. The results revealed that all coefficients of structural and measurement model parameters are statistically significant in experts' opinion-based priors, whereas, two coefficients are not statistically significant when fixed prior-based distributions are applied. The error statistics reveals that Bayesian structural model provides reasonably good fit of work injury with high coefficient of determination (0.91) and less mean squared error as compared to traditional SEM.

  20. BUMPER: the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction

    NASA Astrophysics Data System (ADS)

    Holden, Phil; Birks, John; Brooks, Steve; Bush, Mark; Hwang, Grace; Matthews-Bird, Frazer; Valencia, Bryan; van Woesik, Robert

    2017-04-01

    We describe the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. The principal motivation for a Bayesian approach is that the palaeoenvironment is treated probabilistically, and can be updated as additional data become available. Bayesian approaches therefore provide a reconstruction-specific quantification of the uncertainty in the data and in the model parameters. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring 2 seconds to build a 100-taxon model from a 100-site training-set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training-sets under ideal assumptions. We then use these to demonstrate both the general applicability of the model and the sensitivity of reconstructions to the characteristics of the training-set, considering assemblage richness, taxon tolerances, and the number of training sites. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. In all of these applications an identically configured model is used, the only change being the input files that provide the training-set environment and taxon-count data.

  1. Application of bayesian networks to real-time flood risk estimation

    NASA Astrophysics Data System (ADS)

    Garrote, L.; Molina, M.; Blasco, G.

    2003-04-01

    This paper presents the application of a computational paradigm taken from the field of artificial intelligence, the Bayesian network, to model the behaviour of hydrologic basins during floods. The final goal of this research is to develop representation techniques for hydrologic simulation models in order to define, develop and validate a mechanism, supported by a software environment, oriented to build decision models for the prediction and management of river floods in real time. The emphasis is placed on providing decision makers with tools to incorporate their knowledge of basin behaviour, usually formulated in terms of rainfall-runoff models, in the process of real-time decision making during floods. A rainfall-runoff model is only a step in the process of decision making. If a reliable rainfall forecast is available and the rainfall-runoff model is well calibrated, decisions can be based mainly on model results. However, in most practical situations, uncertainties in rainfall forecasts or model performance have to be incorporated in the decision process. The computational paradigm adopted for the simulation of hydrologic processes is the Bayesian network. A Bayesian network is a directed acyclic graph that represents causal influences between linked variables. Under this representation, uncertain qualitative variables are related through causal relations quantified with conditional probabilities. The solution algorithm allows the computation of the expected probability distribution of unknown variables conditioned on the observations. An approach to represent hydrologic processes by Bayesian networks with temporal and spatial extensions is presented in this paper, together with a methodology for the development of Bayesian models using results produced by deterministic hydrologic simulation models.
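Exact inference on a small Bayesian network of the kind described, causal links quantified by conditional probabilities, can be done by direct enumeration. The two-node rain/river-level network and its probability tables below are invented for illustration:

```python
# Tiny Bayesian network for flood risk (hypothetical structure and probabilities):
# Rain -> RiverLevel, with the causal link quantified by conditional probabilities.
p_rain = {"heavy": 0.2, "light": 0.8}                  # prior on rainfall
p_level_given_rain = {                                 # P(river level | rain)
    "heavy": {"high": 0.9, "normal": 0.1},
    "light": {"high": 0.15, "normal": 0.85},
}

def posterior_rain(observed_level):
    """P(rain | observed river level) by direct enumeration (exact on this small net)."""
    joint = {r: p_rain[r] * p_level_given_rain[r][observed_level] for r in p_rain}
    z = sum(joint.values())
    return {r: joint[r] / z for r in joint}

posterior = posterior_rain("high")    # observing a high river level raises P(heavy rain)
```

Real flood networks need many more nodes plus the temporal and spatial extensions the abstract mentions, where exact enumeration gives way to specialized solution algorithms.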

  2. Inference of reaction rate parameters based on summary statistics from experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khalil, Mohammad; Chowdhary, Kamaljit Singh; Safta, Cosmin

    Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain branching reaction H + O2 → OH + O. Available published data is in the form of summary statistics in terms of nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters, and are computationally efficient enough to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation, resulting in orders of magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty.
    The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.
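
    The generate-and-accept idea behind Approximate Bayesian Computation can be illustrated with a minimal rejection sampler for Arrhenius parameters. The "true" values, priors, temperatures, and tolerance below are all illustrative assumptions, not the paper's numbers, and the paper's actual approach (surrogates, MCMC, pooling) is far more sophisticated.

```python
import math
import random

random.seed(0)

# Hypothetical "true" Arrhenius parameters (illustration only, not the paper's)
A_true, Ea_true = 1.0e3, 50.0   # pre-exponential factor; activation energy, kJ/mol
R = 8.314e-3                    # gas constant, kJ/(mol K)
temps = [900.0, 1000.0, 1100.0, 1200.0]

def rate(A, Ea, T):
    """Arrhenius rate coefficient k(T) = A * exp(-Ea / (R T))."""
    return A * math.exp(-Ea / (R * T))

# Stand-in for published summary statistics: nominal rate values per temperature
obs = [rate(A_true, Ea_true, T) for T in temps]

def abc_rejection(n_draws=20000, tol=0.1):
    """Keep prior draws whose simulated rates lie within tol (relative) of obs."""
    accepted = []
    for _ in range(n_draws):
        A = random.uniform(5.0e2, 2.0e3)   # flat priors, assumed for this sketch
        Ea = random.uniform(30.0, 70.0)
        if max(abs(rate(A, Ea, T) - o) / o for T, o in zip(temps, obs)) < tol:
            accepted.append((A, Ea))
    return accepted

posterior = abc_rejection()
Ea_mean = sum(ea for _, ea in posterior) / len(posterior)
```

    The accepted draws trace the strong A-Ea correlation ridge that the abstract's auto-ignition study emphasizes.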

  3. Inference of reaction rate parameters based on summary statistics from experiments

    DOE PAGES

    Khalil, Mohammad; Chowdhary, Kamaljit Singh; Safta, Cosmin; ...

    2016-10-15

    Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain branching reaction H + O2 → OH + O. Available published data is in the form of summary statistics in terms of nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters, and are computationally efficient enough to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation, resulting in orders of magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty.
    The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.

  4. Prospects and Potential Uses of Genomic Prediction of Key Performance Traits in Tetraploid Potato.

    PubMed

    Stich, Benjamin; Van Inghelandt, Delphine

    2018-01-01

    Genomic prediction is a routine tool in breeding programs of most major animal and plant species. However, its usefulness for potato breeding has not yet been evaluated in detail. The objectives of this study were to (i) examine the prospects of genomic prediction of key performance traits in a diversity panel of tetraploid potato modeling additive, dominance, and epistatic effects, (ii) investigate the effects of the size and make-up of the training set and the number of test environments and molecular markers on prediction accuracy, and (iii) assess the effect of including markers from candidate genes on the prediction accuracy. Four different prediction methods, genomic best linear unbiased prediction (GBLUP), BayesA, BayesCπ, and the Bayesian LASSO, were used for genomic prediction of the relative area under the disease progress curve after Phytophthora infestans infection, plant maturity, maturity-corrected resistance, tuber starch content, tuber starch yield (TSY), and tuber yield (TY) of 184 tetraploid potato clones, or subsets thereof, genotyped with the SolCAP 8.3k SNP array. The cross-validated prediction accuracies with GBLUP and the three Bayesian approaches for the six evaluated traits ranged from about 0.5 to about 0.8. For traits with a high expected genetic complexity, such as TSY and TY, we observed an 8% higher prediction accuracy using a model with additive and dominance effects compared with a model with additive effects only. Our results suggest that for oligogenic traits in general, and when diagnostic markers are available in particular, the use of Bayesian methods for genomic prediction is highly recommended, and that the diagnostic markers should be modeled as fixed effects. The evaluation of the relative performance of genomic prediction vs. phenotypic selection indicated that the former is superior, assuming cycle lengths and selection intensities that are possible to realize in commercial potato breeding programs.
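
    GBLUP is equivalent to ridge regression on marker effects (RR-BLUP), which can be sketched on toy data. The marker matrix, true effect sizes, noise level, and shrinkage parameter below are all hypothetical; a real analysis would use thousands of SNPs and estimate the variance components.

```python
import random

random.seed(1)

# Toy marker matrix Z (individuals x markers, coded -1/0/1) and phenotypes y;
# illustration only. Ridge on markers (RR-BLUP) is equivalent to GBLUP.
n, m, lam = 30, 5, 2.0
true_u = [1.5, -0.8, 0.0, 0.6, 0.0]
Z = [[random.choice((-1, 0, 1)) for _ in range(m)] for _ in range(n)]
y = [sum(z * u for z, u in zip(row, true_u)) + random.gauss(0.0, 0.3) for row in Z]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    A = [row[:] + [bi] for row, bi in zip(A, b)]
    k = len(A)
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k + 1):
                A[r][c] -= f * A[i][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (A[i][k] - sum(A[i][c] * x[c] for c in range(i + 1, k))) / A[i][i]
    return x

# Ridge (RR-BLUP) marker-effect estimates: (Z'Z + lam*I) u_hat = Z'y
ZtZ = [[sum(Z[r][i] * Z[r][j] for r in range(n)) + (lam if i == j else 0.0)
        for j in range(m)] for i in range(m)]
Zty = [sum(Z[r][i] * y[r] for r in range(n)) for i in range(m)]
u_hat = solve(ZtZ, Zty)
```

    BayesA, BayesCπ, and the Bayesian LASSO differ from this sketch only in replacing the common ridge penalty with marker-specific shrinkage priors.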

  5. Prospects and Potential Uses of Genomic Prediction of Key Performance Traits in Tetraploid Potato

    PubMed Central

    Stich, Benjamin; Van Inghelandt, Delphine

    2018-01-01

    Genomic prediction is a routine tool in breeding programs of most major animal and plant species. However, its usefulness for potato breeding has not yet been evaluated in detail. The objectives of this study were to (i) examine the prospects of genomic prediction of key performance traits in a diversity panel of tetraploid potato modeling additive, dominance, and epistatic effects, (ii) investigate the effects of the size and make-up of the training set and the number of test environments and molecular markers on prediction accuracy, and (iii) assess the effect of including markers from candidate genes on the prediction accuracy. Four different prediction methods, genomic best linear unbiased prediction (GBLUP), BayesA, BayesCπ, and the Bayesian LASSO, were used for genomic prediction of the relative area under the disease progress curve after Phytophthora infestans infection, plant maturity, maturity-corrected resistance, tuber starch content, tuber starch yield (TSY), and tuber yield (TY) of 184 tetraploid potato clones, or subsets thereof, genotyped with the SolCAP 8.3k SNP array. The cross-validated prediction accuracies with GBLUP and the three Bayesian approaches for the six evaluated traits ranged from about 0.5 to about 0.8. For traits with a high expected genetic complexity, such as TSY and TY, we observed an 8% higher prediction accuracy using a model with additive and dominance effects compared with a model with additive effects only. Our results suggest that for oligogenic traits in general, and when diagnostic markers are available in particular, the use of Bayesian methods for genomic prediction is highly recommended, and that the diagnostic markers should be modeled as fixed effects. The evaluation of the relative performance of genomic prediction vs. phenotypic selection indicated that the former is superior, assuming cycle lengths and selection intensities that are possible to realize in commercial potato breeding programs. PMID:29563919

  6. Bayesian Modeling of a Human MMORPG Player

    NASA Astrophysics Data System (ADS)

    Synnaeve, Gabriel; Bessière, Pierre

    2011-03-01

    This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and which target to select in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.

  7. A comment on priors for Bayesian occupancy models

    PubMed Central

    Gerber, Brian D.

    2018-01-01

    Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are “uninformative” or “vague”, such priors can easily be unintentionally highly informative. Here we report on how the specification of a “vague” normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts. PMID:29481554
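
    The central point, that a wide normal prior on the logit scale is far from uninformative on the probability scale, can be checked directly by simulation. The sd = 10 value below is a hypothetical example of a "vague" choice, not one taken from the paper.

```python
import math
import random

random.seed(0)

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Occupancy probabilities implied by a "vague" Normal(0, sd=10) prior on a
# logit-scale intercept (sd=10 is a hypothetical wide-prior choice).
draws = [inv_logit(random.gauss(0.0, 10.0)) for _ in range(100000)]
near_boundary = sum(1 for p in draws if p < 0.01 or p > 0.99) / len(draws)
# A prior flat on the probability scale would put only 2% of its mass in those
# two boundary bins; this "vague" normal prior concentrates most of it there.
```

    Roughly two thirds of the implied prior mass sits within 0.01 of the boundaries, i.e. the prior strongly favours occupancy probabilities near 0 or 1.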

  8. Application of linear mixed-effects model with LASSO to identify metal components associated with cardiac autonomic responses among welders: a repeated measures study

    PubMed Central

    Zhang, Jinming; Cavallari, Jennifer M; Fang, Shona C; Weisskopf, Marc G; Lin, Xihong; Mittleman, Murray A; Christiani, David C

    2017-01-01

    Background: Environmental and occupational exposure to metals is ubiquitous worldwide, and understanding the hazardous metal components in this complex mixture is essential for environmental and occupational regulations. Objective: To identify hazardous components from metal mixtures that are associated with alterations in cardiac autonomic responses. Methods: Urinary concentrations of 16 types of metals were examined and ‘acceleration capacity’ (AC) and ‘deceleration capacity’ (DC), indicators of cardiac autonomic effects, were quantified from ECG recordings among 54 welders. We fitted linear mixed-effects models with least absolute shrinkage and selection operator (LASSO) to identify metal components that are associated with AC and DC. The Bayesian Information Criterion was used as the criterion for model selection procedures. Results: Mercury and chromium were selected for DC analysis, whereas mercury, chromium and manganese were selected for AC analysis through the LASSO approach. When we fitted the linear mixed-effects models with ‘selected’ metal components only, the effect of mercury remained significant. Every 1 µg/L increase in urinary mercury was associated with −0.58 ms (−1.03, −0.13) changes in DC and 0.67 ms (0.25, 1.10) changes in AC. Conclusion: Our study suggests that exposure to several metals is associated with impaired cardiac autonomic functions. Our findings should be replicated in future studies with larger sample sizes. PMID:28663305

  9. The Bayesian boom: good thing or bad?

    PubMed Central

    Hahn, Ulrike

    2014-01-01

    A series of high-profile critiques of Bayesian models of cognition have recently sparked controversy. These critiques question the contribution of rational, normative considerations in the study of cognition. The present article takes central claims from these critiques and evaluates them in light of specific models. Closer consideration of actual examples of Bayesian treatments of different cognitive phenomena allows one to defuse these critiques showing that they cannot be sustained across the diversity of applications of the Bayesian framework for cognitive modeling. More generally, there is nothing in the Bayesian framework that would inherently give rise to the deficits that these critiques perceive, suggesting they have been framed at the wrong level of generality. At the same time, the examples are used to demonstrate the different ways in which consideration of rationality uniquely benefits both theory and practice in the study of cognition. PMID:25152738

  10. Calibrating Bayesian Network Representations of Social-Behavioral Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whitney, Paul D.; Walsh, Stephen J.

    2010-04-08

    While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research, and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empirical comparison with data taken from the Minorities at Risk Organizational Behaviors database.

  11. Bayesian analysis of anisotropic cosmologies: Bianchi VIIh and WMAP

    NASA Astrophysics Data System (ADS)

    McEwen, J. D.; Josset, T.; Feeney, S. M.; Peiris, H. V.; Lasenby, A. N.

    2013-12-01

    We perform a definitive analysis of Bianchi VIIh cosmologies with Wilkinson Microwave Anisotropy Probe (WMAP) observations of the cosmic microwave background (CMB) temperature anisotropies. Bayesian analysis techniques are developed to study anisotropic cosmologies using full-sky and partial-sky masked CMB temperature data. We apply these techniques to analyse the full-sky internal linear combination (ILC) map and a partial-sky masked W-band map of WMAP 9 yr observations. In addition to the physically motivated Bianchi VIIh model, we examine phenomenological models considered in previous studies, in which the Bianchi VIIh parameters are decoupled from the standard cosmological parameters. In the two phenomenological models considered, Bayes factors of 1.7 and 1.1 units of log-evidence favouring a Bianchi component are found in full-sky ILC data. The corresponding best-fitting Bianchi maps recovered are similar for both phenomenological models and are very close to those found in previous studies using earlier WMAP data releases. However, no evidence for a phenomenological Bianchi component is found in the partial-sky W-band data. In the physical Bianchi VIIh model, we find no evidence for a Bianchi component: WMAP data thus do not favour Bianchi VIIh cosmologies over the standard Λ cold dark matter (ΛCDM) cosmology. It is not possible to discount Bianchi VIIh cosmologies in favour of ΛCDM completely, but we are able to constrain the vorticity of physical Bianchi VIIh cosmologies at (ω/H)0 < 8.6 × 10-10 with 95 per cent confidence.

  12. A comprehensive probabilistic analysis model of oil pipelines network based on Bayesian network

    NASA Astrophysics Data System (ADS)

    Zhang, C.; Qin, T. X.; Jiang, B.; Huang, C.

    2018-02-01

    An oil pipeline network is one of the most important facilities for energy transportation, but an accident in such a network may result in serious disasters. Analysis models for these accidents have been established mainly based on three methods: event trees, accident simulation and Bayesian networks. Among these, the Bayesian network is well suited to probabilistic analysis, but existing models do not consider all the important influencing factors, and no deployment rule for the factors has been established. This paper proposes a probabilistic analysis model of oil pipeline networks based on a Bayesian network. Most of the important influencing factors, including the key environmental conditions and emergency response, are considered in this model. Moreover, the paper also introduces a deployment rule for these factors. The model can be used in probabilistic analysis and sensitivity analysis of oil pipeline network accidents.

  13. An evaluation of Bayesian techniques for controlling model complexity and selecting inputs in a neural network for short-term load forecasting.

    PubMed

    Hippert, Henrique S; Taylor, James W

    2010-04-01

    Artificial neural networks have frequently been proposed for electricity load forecasting because of their capabilities for the nonlinear modelling of large multivariate data sets. Modelling with neural networks is not an easy task though; two of the main challenges are defining the appropriate level of model complexity, and choosing the input variables. This paper evaluates techniques for automatic neural network modelling within a Bayesian framework, as applied to six samples containing daily load and weather data for four different countries. We analyse input selection as carried out by the Bayesian 'automatic relevance determination', and the usefulness of the Bayesian 'evidence' for the selection of the best structure (in terms of number of neurones), as compared to methods based on cross-validation. Copyright 2009 Elsevier Ltd. All rights reserved.

  14. Bayesian dynamic modeling of time series of dengue disease case counts.

    PubMed

    Martínez-Bello, Daniel Adyro; López-Quílez, Antonio; Torres-Prieto, Alexander

    2017-07-01

    The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the models' short-term performance for predicting dengue cases. The methodology uses dynamic Poisson log-link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and the deviance information criterion (DIC) for model selection. We assessed the short-term predictive performance of the selected final model at several time points within the study period using the mean absolute percentage error. The best model included first-order random walk time-varying coefficients for the calendar trend and for the meteorological variables. Beyond the computational challenges, interpreting the results requires a complete analysis of the dengue time series with respect to the parameter estimates of the meteorological effects. We found small mean absolute percentage errors for one- and two-week out-of-sample predictions at most prediction points, associated with low-volatility periods in the dengue counts. We discuss the advantages and limitations of dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables.
    The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in modeling time series of dengue disease, producing useful models for decision-making in public health.
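
    The model class can be illustrated by simulating counts from a Poisson log-link model whose covariate coefficient follows a first-order random walk, the generative structure the abstract describes. Every number below (weeks, temperatures, intercept, walk scale) is a hypothetical illustration, not dengue data.

```python
import math
import random

random.seed(3)

# Weekly counts from a dynamic Poisson log-link model whose temperature
# coefficient beta_t evolves as a first-order random walk (illustration only).
weeks = 80
temp = [25.0 + 3.0 * math.sin(2.0 * math.pi * t / 52.0) for t in range(weeks)]

alpha = 1.0               # constant calendar-trend intercept (log scale)
beta = 0.05               # initial temperature coefficient
counts, betas = [], []
for t in range(weeks):
    beta += random.gauss(0.0, 0.01)   # random-walk evolution of the coefficient
    betas.append(beta)
    mu = math.exp(alpha + beta * (temp[t] - 25.0))   # log link
    # Poisson draw by CDF inversion (stdlib random has no Poisson sampler)
    u, k, p = random.random(), 0, math.exp(-mu)
    cdf = p
    while u > cdf and k < 1000:
        k += 1
        p *= mu / k
        cdf += p
    counts.append(k)
```

    Fitting such a model (recovering the latent beta path from the counts) is what the paper's MCMC machinery does; simulation like this is also how its short-term predictive checks can be exercised.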

  15. Bayesian Inference for the Stereotype Regression Model: Application to a Case-control Study of Prostate Cancer

    PubMed Central

    Ahn, Jaeil; Mukherjee, Bhramar; Banerjee, Mousumi; Cooney, Kathleen A.

    2011-01-01

    Summary The stereotype regression model for categorical outcomes, proposed by Anderson (1984), is nested between the baseline-category logits and adjacent-category logits model with proportional odds structure. The stereotype model is more parsimonious than the ordinary baseline-category (or multinomial logistic) model due to a product representation of the log odds-ratios in terms of a common parameter corresponding to each predictor and category-specific scores. The model can be used for both ordered and unordered outcomes. For ordered outcomes, the stereotype model allows more flexibility than the popular proportional odds model in capturing highly subjective ordinal scaling which does not result from categorization of a single latent variable but is inherently multidimensional in nature. As pointed out by Greenland (1994), an additional advantage of the stereotype model is that it provides unbiased and valid inference under outcome-stratified sampling, as in case-control studies. In addition, for matched case-control studies, the stereotype model is amenable to the classical conditional likelihood principle, whereas there is no reduction due to sufficiency under the proportional odds model. In spite of these attractive features, the model has seen limited use, as there are issues with maximum likelihood estimation and likelihood-based testing approaches due to non-linearity and lack of identifiability of the parameters. We present a comprehensive Bayesian inference and model comparison procedure for this class of models as an alternative to the classical frequentist approach. We illustrate our methodology by analyzing data from the Flint Men’s Health Study, a case-control study of prostate cancer in African-American men aged 40 to 79 years. We use clinical staging of prostate cancer in terms of Tumors, Nodes and Metastasis (TNM) as the categorical response of interest. PMID:19731262

  16. An evaluation of the Bayesian approach to fitting the N-mixture model for use with pseudo-replicated count data

    USGS Publications Warehouse

    Toribo, S.G.; Gray, B.R.; Liang, S.

    2011-01-01

    The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
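
    A minimal sketch of the N-mixture likelihood for one site, marginalizing the latent abundance N, shows the structure being fitted. The counts and parameter values below are hypothetical; a full analysis would add covariates, many sites, and (in the Bayesian version studied here) priors and MCMC.

```python
import math

def n_mixture_loglik(counts, lam, p, n_max=100):
    """Log-likelihood of replicated counts at one site under the N-mixture
    model: N ~ Poisson(lam), each count ~ Binomial(N, p), N marginalized out."""
    def pois(n):
        return math.exp(-lam) * lam ** n / math.factorial(n)
    def binom(y, n):
        return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)
    site_lik = sum(pois(n) * math.prod(binom(y, n) for y in counts)
                   for n in range(max(counts), n_max + 1))
    return math.log(site_lik)

# Hypothetical replicated counts from three visits to one closed site
counts = [3, 2, 4]
ll_plausible = n_mixture_loglik(counts, lam=5.0, p=0.6)
ll_implausible = n_mixture_loglik(counts, lam=50.0, p=0.9)
```

    The closure assumption enters through the single shared N across the replicated counts; pseudo-replication, the paper's focus, breaks exactly that sharing.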

  17. Bayesian analysis of CCDM models

    NASA Astrophysics Data System (ADS)

    Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.

    2017-09-01

    Creation of Cold Dark Matter (CCDM), in the context of the Einstein field equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow models to be compared considering goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH0 model, can be discarded from the current analysis. Three other scenarios are discarded either because of poor fit or because of an excess of free parameters. A method of increasing the Bayesian evidence through reparameterization in order to reduce parameter degeneracy is also developed.
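
    AIC and BIC themselves are one-line computations from a fitted model's maximized log-likelihood, which is why they are convenient first-pass criteria before a full Bayesian-evidence calculation. The log-likelihoods and parameter counts below are hypothetical, not the paper's SNe Ia fits.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L; penalty grows with n."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits to the same n = 580 supernovae: model B gains a little
# log-likelihood but carries two extra free parameters.
n = 580
aic_a, bic_a = aic(-512.3, 1), bic(-512.3, 1, n)
aic_b, bic_b = aic(-511.9, 3), bic(-511.9, 3, n)
```

    With ln(580) ≈ 6.4 per parameter versus AIC's flat 2, BIC penalizes the extra-parameter model more heavily, the "penalizing excess complexity" behaviour the abstract relies on.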

  18. Refining value-at-risk estimates using a Bayesian Markov-switching GJR-GARCH copula-EVT model.

    PubMed

    Sampid, Marius Galabe; Hasim, Haslifah M; Dai, Hongsheng

    2018-01-01

    In this paper, we propose a model for forecasting Value-at-Risk (VaR) using a Bayesian Markov-switching GJR-GARCH(1,1) model with skewed Student's-t innovation, copula functions and extreme value theory. A Bayesian Markov-switching GJR-GARCH(1,1) model that identifies non-constant volatility over time and allows the GARCH parameters to vary over time following a Markov process, is combined with copula functions and EVT to formulate the Bayesian Markov-switching GJR-GARCH(1,1) copula-EVT VaR model, which is then used to forecast the level of risk on financial asset returns. We further propose a new method for threshold selection in EVT analysis, which we term the hybrid method. Empirical and back-testing results show that the proposed VaR models capture VaR reasonably well in periods of calm and in periods of crisis.
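
    The volatility core of the proposed model is the GJR-GARCH(1,1) recursion, sketched below as a single-regime filter with a normal VaR quantile; the paper's full model layers Markov regime switching, skewed Student's-t innovations, copulas, and EVT on top of this. The returns and parameter values are made-up illustrations.

```python
import math

def gjr_garch_var(returns, omega, alpha, gamma, beta, z=-1.645):
    """One-step-ahead 95% VaR from a single-regime GJR-GARCH(1,1) recursion:
    sigma2_t = omega + (alpha + gamma*1[r<0]) * r_{t-1}^2 + beta * sigma2_{t-1},
    with a normal quantile z as a simplifying stand-in for skewed-t/EVT tails."""
    sigma2 = sum(r * r for r in returns) / len(returns)  # start at sample variance
    for r in returns:
        leverage = gamma if r < 0 else 0.0               # extra weight on bad news
        sigma2 = omega + (alpha + leverage) * r * r + beta * sigma2
    return z * math.sqrt(sigma2)

# Hypothetical daily returns and parameter values (illustration only)
returns = [0.01, -0.02, 0.015, -0.03, 0.005, -0.01]
var95 = gjr_garch_var(returns, omega=1e-5, alpha=0.05, gamma=0.1, beta=0.9)
```

    The asymmetry term gamma is what distinguishes GJR-GARCH from plain GARCH: negative returns raise next-period volatility more than positive returns of the same size.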

  19. Comparing interval estimates for small sample ordinal CFA models

    PubMed Central

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis (CFA) models for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively than negatively biased; that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected, with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates.
The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002

  20. Comparing interval estimates for small sample ordinal CFA models.

    PubMed

    Natesan, Prathiba

    2015-01-01

    Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis (CFA) models for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively than negatively biased; that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected, with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates.
The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.

  1. Iterative Assessment of Statistically-Oriented and Standard Algorithms for Determining Muscle Onset with Intramuscular Electromyography.

    PubMed

    Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

    2017-12-01

    The onset of muscle activity, as measured by electromyography (EMG), is a commonly applied metric in biomechanics. Intramuscular EMG is often used to examine deep musculature, yet there are currently no studies examining the effectiveness of algorithms for intramuscular EMG onset. The present study examines standard surface EMG onset algorithms (linear envelope, Teager-Kaiser Energy Operator, and sample entropy) and novel algorithms (time series mean-variance analysis, sequential/batch processing with parametric and nonparametric methods, and Bayesian changepoint analysis). Thirteen male and five female subjects had intramuscular EMG collected during isolated biceps brachii and vastus lateralis contractions, resulting in 103 trials. EMG onset was visually determined twice by 3 blinded reviewers. Since the reliability of visual onset determination was high (ICC(1,1): 0.92), the mean of the 6 visual assessments was contrasted with the algorithmic approaches. Poorly performing algorithms were eliminated stepwise via (1) root mean square error analysis, (2) algorithm failure to identify onset/premature onset, (3) linear regression analysis, and (4) Bland-Altman plots. The top performing algorithms were all based on Bayesian changepoint analysis of rectified EMG and were statistically indistinguishable from visual analysis. Bayesian changepoint analysis has the potential to produce more reliable, accurate, and objective intramuscular EMG onset results than standard methodologies.
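
    A minimal Bayesian changepoint sketch on synthetic rectified EMG illustrates the approach: with a flat prior over the onset location and, as a simplifying assumption, known pre/post noise scales, the posterior over the changepoint follows by direct enumeration. The signal and noise parameters are made-up, and the paper's algorithms are more general than this single-changepoint model.

```python
import math
import random

random.seed(2)

# Synthetic rectified EMG: baseline noise, then larger activity after onset.
true_onset, n = 60, 100
signal = ([abs(random.gauss(0.0, 1.0)) for _ in range(true_onset)]
          + [abs(random.gauss(0.0, 3.0)) for _ in range(n - true_onset)])

def log_norm(x, sd):
    return -0.5 * (x / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)

def onset_posterior(sig, sd0=1.0, sd1=3.0):
    """Posterior over the changepoint index under a flat prior, assuming the
    pre/post noise scales sd0, sd1 are known (a simplification of full
    Bayesian changepoint analysis)."""
    logp = []
    for tau in range(1, len(sig)):
        ll = (sum(log_norm(x, sd0) for x in sig[:tau])
              + sum(log_norm(x, sd1) for x in sig[tau:]))
        logp.append(ll)
    m = max(logp)                       # subtract max for numerical stability
    w = [math.exp(l - m) for l in logp]
    z = sum(w)
    return [wi / z for wi in w]         # posterior P(tau = 1 .. len(sig)-1)

post = onset_posterior(signal)
map_onset = 1 + max(range(len(post)), key=post.__getitem__)
```

    Unlike threshold-crossing detectors, the output is a full posterior over onset location, so onset uncertainty can be reported alongside the point estimate.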

  2. a Novel Discrete Optimal Transport Method for Bayesian Inverse Problems

    NASA Astrophysics Data System (ADS)

    Bui-Thanh, T.; Myers, A.; Wang, K.; Thiery, A.

    2017-12-01

We present the Augmented Ensemble Transform (AET) method for generating approximate samples from a high-dimensional posterior distribution as a solution to Bayesian inverse problems. Solving large-scale inverse problems is critical for some of the most relevant and impactful scientific endeavors of our time. Therefore, constructing novel methods for solving the Bayesian inverse problem in more computationally efficient ways can have a profound impact on the science community. This research derives the novel AET method for exploring a posterior by solving a sequence of linear programming problems, resulting in a series of transport maps which map prior samples to posterior samples, allowing for the computation of moments of the posterior. We show both theoretical and numerical results, indicating this method can offer superior computational efficiency when compared to other sequential Monte Carlo (SMC) methods. Most of this efficiency is derived from matrix scaling methods to solve the linear programming problem and derivative-free optimization for particle movement. We use this method to determine inter-well connectivity in a reservoir and the associated uncertainty related to certain parameters. The attached file shows the difference between the true parameter and the AET parameter in an example 3D reservoir problem. The error is within the Morozov discrepancy allowance with lower computational cost than other particle methods.
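The core idea, prior samples mapped to an equal-weight posterior ensemble by a transport plan obtained from a linear program, can be sketched in one dimension. This is a generic ensemble-transform toy, not the AET method itself; the Gaussian prior, likelihood, and ensemble size are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)

def ensemble_transform(x, w):
    """Map N equal-weight prior samples x to an equal-weight posterior
    ensemble, given importance weights w, by solving the discrete
    optimal-transport linear program."""
    n = len(x)
    cost = (x[:, None] - x[None, :]) ** 2      # squared-distance cost C_ij
    A_eq = np.zeros((2 * n, n * n))            # coupling T flattened row-major
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0       # row i of T sums to w[i]
        A_eq[n + i, i::n] = 1.0                # column i of T sums to 1/n
    b_eq = np.concatenate([w, np.full(n, 1.0 / n)])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, method="highs")
    T = res.x.reshape(n, n)
    return n * T.T @ x                         # equal-weight posterior samples

# Prior N(0,1); a Gaussian likelihood centred at 1 supplies the weights
x = rng.normal(0.0, 1.0, 40)
w = np.exp(-0.5 * (x - 1.0) ** 2)
w /= w.sum()
post = ensemble_transform(x, w)
print(post.mean(), (w * x).sum())   # the transform preserves the weighted mean
```

By construction the transformed ensemble mean equals the importance-weighted mean, while the transport objective keeps each posterior sample close to the prior samples it is built from.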

  3. Bayesian demography 250 years after Bayes

    PubMed Central

    Bijak, Jakub; Bryant, John

    2016-01-01

Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow additional (prior) information to be incorporated alongside the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889

  4. Estimating neural response functions from fMRI

    PubMed Central

    Kumar, Sukhbinder; Penny, William

    2014-01-01

    This paper proposes a methodology for estimating Neural Response Functions (NRFs) from fMRI data. These NRFs describe non-linear relationships between experimental stimuli and neuronal population responses. The method is based on a two-stage model comprising an NRF and a Hemodynamic Response Function (HRF) that are simultaneously fitted to fMRI data using a Bayesian optimization algorithm. This algorithm also produces a model evidence score, providing a formal model comparison method for evaluating alternative NRFs. The HRF is characterized using previously established “Balloon” and BOLD signal models. We illustrate the method with two example applications based on fMRI studies of the auditory system. In the first, we estimate the time constants of repetition suppression and facilitation, and in the second we estimate the parameters of population receptive fields in a tonotopic mapping study. PMID:24847246

  5. A two-stage model in a Bayesian framework to estimate a survival endpoint in the presence of confounding by indication.

    PubMed

    Bellera, Carine; Proust-Lima, Cécile; Joseph, Lawrence; Richaud, Pierre; Taylor, Jeremy; Sandler, Howard; Hanley, James; Mathoulin-Pélissier, Simone

    2018-04-01

Background: Biomarker series can indicate disease progression and predict clinical endpoints. When a treatment is prescribed depending on the biomarker, confounding by indication may be introduced if the treatment modifies both the marker profile and the risk of failure. Objective: Our aim was to highlight the flexibility of a two-stage model fitted within a Bayesian Markov chain Monte Carlo framework. For this purpose, we monitored prostate-specific antigen (PSA) in prostate cancer patients treated with external beam radiation therapy. In the presence of rising PSA after external beam radiation therapy, salvage hormone therapy can be prescribed to reduce both the PSA concentration and the risk of clinical failure, an illustration of confounding by indication. We focused on assessing the prognostic value of hormone therapy and the PSA trajectory on the risk of failure. Methods: We used a two-stage model within a Bayesian framework to assess the role of the PSA profile on clinical failure while accounting for a secondary treatment prescribed by indication. We modeled PSA using a hierarchical piecewise linear trajectory with a random changepoint. Residual PSA variability was expressed as a function of PSA concentration. Covariates in the survival model included hormone therapy, baseline characteristics, and individual predictions of the PSA nadir and its timing and the PSA slopes before and after the nadir, as provided by the longitudinal process. Results: We found positive associations between an increased PSA nadir, an earlier changepoint, and a steeper post-nadir slope and an increased risk of failure. Importantly, we highlighted a significant benefit of hormone therapy, an effect that was not observed when the PSA trajectory was not accounted for in the survival model. Conclusion: Our modeling strategy was particularly flexible and accounted for multiple complex features of longitudinal and survival data, including the presence of a random changepoint and a time-dependent covariate.

  6. A Bayesian approach to meta-analysis of plant pathology studies.

    PubMed

    Mila, A L; Ngugi, H K

    2011-01-01

Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard), which was evaluated in only seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. 
We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods, and allow for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.
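The normal-versus-t sensitivity described above can be reproduced with a deliberately small toy: a grid-approximate posterior for a common effect size under a flat prior, swapping the likelihood family. The effect sizes below are hypothetical, not the paper's data, and among-study heterogeneity is ignored for brevity.

```python
import numpy as np

def posterior_mean_ci(y, v, likelihood="normal", df=4):
    """Grid posterior for a common effect size mu under a flat prior,
    with a normal or Student-t likelihood for the observed effects."""
    grid = np.linspace(-3.0, 3.0, 2001)
    se = np.sqrt(np.asarray(v, dtype=float))
    z = (np.asarray(y, dtype=float)[:, None] - grid[None, :]) / se[:, None]
    if likelihood == "normal":
        loglik = -0.5 * z ** 2
    else:                                   # heavier-tailed Student-t likelihood
        loglik = -0.5 * (df + 1) * np.log1p(z ** 2 / df)
    logpost = loglik.sum(axis=0)
    logpost -= logpost.max()
    p = np.exp(logpost)
    p /= p.sum()
    cdf = p.cumsum()
    lo, hi = grid[np.searchsorted(cdf, 0.025)], grid[np.searchsorted(cdf, 0.975)]
    return (p * grid).sum(), (lo, hi)

# Seven hypothetical log response ratios with known within-study variances
y = [-0.60, -0.40, -0.20, -0.50, 0.10, -0.30, -0.35]
v = [0.09] * 7
m_norm, ci_norm = posterior_mean_ci(y, v, "normal")
m_t, ci_t = posterior_mean_ci(y, v, "t")
print(ci_norm, ci_t)   # the t likelihood is less swayed by the one outlying study
```

With so few studies the two likelihoods can disagree on whether the interval excludes zero, which is the sensitivity the paper reports for Actigard.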

  7. Introduction to Bayesian statistical approaches to compositional analyses of transgenic crops 1. Model validation and setting the stage.

    PubMed

    Harrison, Jay M; Breeze, Matthew L; Harrigan, George G

    2011-08-01

Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first to apply a Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.

  8. Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leclercq, Florent; Wandelt, Benjamin; Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: jasche@iap.fr, E-mail: wandelt@iap.fr

Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of the tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial density field and of the evolution of global quantities such as the volume and mass filling fractions of different structures. For the problem of web-type classification, the results described in this work constitute the first connection between theory and observations at non-linear scales including a physical model of structure formation and the demonstrated capability of uncertainty quantification. A connection between cosmology and information theory using real data also naturally emerges from our probabilistic approach. Our results constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy distribution.

  9. A surrogate-based sensitivity quantification and Bayesian inversion of a regional groundwater flow model

    NASA Astrophysics Data System (ADS)

    Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor

    2018-02-01

    Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
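The surrogate-based workflow, train a cheap approximation on a modest number of full-model runs, then run MCMC against the surrogate, can be sketched in one dimension. Everything here is a toy: the "simulator", the cubic-polynomial surrogate standing in for BMARS, and the single-parameter Metropolis sampler are illustrative assumptions, not the paper's MODFLOW setup.

```python
import numpy as np

rng = np.random.default_rng(3)

def expensive_model(theta):
    """Stand-in for a slow high-fidelity simulator (hypothetical)."""
    return np.sin(theta) + 0.5 * theta

# Step 1: train a cheap surrogate on a modest set of simulator runs
train_t = np.linspace(-2.0, 2.0, 25)
coef = np.polyfit(train_t, expensive_model(train_t), deg=3)
surrogate = lambda t: np.polyval(coef, t)

# Step 2: run Metropolis MCMC against the surrogate, not the full model
obs, sigma = expensive_model(0.8), 0.1          # synthetic observation

def log_post(t):                                # Gaussian likelihood, N(0,2) prior
    return -0.5 * ((obs - surrogate(t)) / sigma) ** 2 - 0.5 * (t / 2.0) ** 2

samples, t = [], 0.0
lp = log_post(t)
for _ in range(20000):
    prop = t + rng.normal(0.0, 0.3)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:    # Metropolis accept/reject
        t, lp = prop, lp_prop
    samples.append(t)
post = np.array(samples[5000:])                 # discard burn-in
print(post.mean())   # should concentrate near the true parameter, 0.8
```

The 20,000 posterior evaluations hit only the polynomial, which is the whole point: the expensive simulator is called just 25 times, during surrogate training.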

  10. Predicting the Future as Bayesian Inference: People Combine Prior Knowledge with Observations when Estimating Duration and Extent

    ERIC Educational Resources Information Center

    Griffiths, Thomas L.; Tenenbaum, Joshua B.

    2011-01-01

    Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…

  11. Variations on Bayesian Prediction and Inference

    DTIC Science & Technology

    2016-05-09

There are a number of statistical inference problems that are not generally formulated via a full probability model. For the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle.

  12. Surface wave tomography of the European crust and upper mantle from ambient seismic noise

    NASA Astrophysics Data System (ADS)

    LU, Y.; Stehly, L.; Paul, A.

    2017-12-01

We present a high-resolution 3-D shear-wave velocity model of the European crust and upper mantle derived from ambient seismic noise tomography. In this study, we collect 4 years of continuous vertical-component seismic recordings from 1293 broadband stations across Europe (10W-35E, 30N-75N). We analyze group velocity dispersion from 5 s to 150 s for cross-correlations of more than 0.8 million virtual source-receiver pairs. 2-D group velocity maps are estimated using adaptive parameterization to accommodate the strong heterogeneity of path coverage. The 3-D velocity model is obtained by merging 1-D models inverted at each pixel through a two-step data-driven inversion algorithm: a non-linear Bayesian Monte Carlo inversion, followed by a linearized inversion. The resulting S-wave velocity model and Moho depth are compared with previous geophysical studies: 1) The crustal model and Moho depth show striking agreement with active seismic imaging results; moreover, they provide new information, such as a strong difference in the European Moho along two seismic profiles in the Western Alps (Cifalps and ECORS-CROP). 2) The upper mantle model displays strong similarities with published models even at 150 km depth, a region usually imaged using earthquake records.

  13. A rational model of function learning.

    PubMed

    Lucas, Christopher G; Griffiths, Thomas L; Williams, Joseph J; Kalish, Michael L

    2015-10-01

Theories of how people learn relationships between continuous variables have tended to focus on two possibilities: one, that people are estimating explicit functions; or two, that they are performing associative learning supported by similarity. We provide a rational analysis of function learning, drawing on work on regression in machine learning and statistics. Using the equivalence of Bayesian linear regression and Gaussian processes, which provide a probabilistic basis for similarity-based function learning, we show that learning explicit rules and using similarity can be seen as two views of one solution to this problem. We use this insight to define a rational model of human function learning that combines the strengths of both approaches and accounts for a wide variety of experimental results.
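The equivalence the paper leans on is a standard identity: Bayesian linear regression in weight space gives the same predictive mean as a Gaussian process with the corresponding linear kernel. A minimal numerical check, with an illustrative Gaussian prior over slope and intercept and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Bayesian linear regression: y = w*x + b + noise, with w, b ~ N(0, alpha)
alpha, noise = 1.0, 0.1                      # prior variance, noise variance
X = rng.uniform(-1, 1, 8)
y = 0.7 * X - 0.2 + rng.normal(0, np.sqrt(noise), 8)
Xs = np.linspace(-1, 1, 5)                   # test inputs

# Weight-space view: Gaussian posterior over (w, b), then predict
Phi = np.stack([X, np.ones_like(X)], axis=1)
Phis = np.stack([Xs, np.ones_like(Xs)], axis=1)
A = Phi.T @ Phi / noise + np.eye(2) / alpha
mu_w = np.linalg.solve(A, Phi.T @ y / noise)
pred_weight = Phis @ mu_w

# Function-space (GP) view: linear kernel k(x, x') = alpha * (x*x' + 1)
def k(a, b):
    return alpha * (np.outer(a, b) + 1.0)

K = k(X, X) + noise * np.eye(len(X))
pred_gp = k(Xs, X) @ np.linalg.solve(K, y)

print(np.allclose(pred_weight, pred_gp))     # True: the two views coincide
```

The "rule" view (posterior over coefficients) and the "similarity" view (kernel over inputs) are literally the same predictor, which is the formal backbone of the paper's argument.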

  14. FRIT characterized hierarchical kernel memory arrangement for multiband palmprint recognition

    NASA Astrophysics Data System (ADS)

    Kisku, Dakshina R.; Gupta, Phalguni; Sing, Jamuna K.

    2015-10-01

In this paper, we present a hierarchical kernel associative memory (H-KAM) based computational model with Finite Ridgelet Transform (FRIT) representation for multispectral palmprint recognition. The FRIT yields a compact and distinctive representation of linear singularities in a palmprint image, capturing the singularities along lines and edges. The proposed system uses the FRIT to represent each multispectral palmprint image, which is then modeled by kernel associative memories; a Bayesian classifier is used for recognition. Finally, the recognition scheme is thoroughly tested on the benchmark CASIA multispectral palmprint database. The experimental results exhibit the robustness of the proposed system under different wavelengths of palm imaging.

  15. Posterior Predictive Model Checking in Bayesian Networks

    ERIC Educational Resources Information Center

    Crawford, Aaron

    2014-01-01

    This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…

  16. Metrics for evaluating performance and uncertainty of Bayesian network models

    Treesearch

    Bruce G. Marcot

    2012-01-01

    This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...

  17. Technical note: Bayesian calibration of dynamic ruminant nutrition models.

    PubMed

    Reed, K F; Arhonditsis, G B; France, J; Kebreab, E

    2016-08-01

    Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  18. Bayesian logistic regression approaches to predict incorrect DRG assignment.

    PubMed

    Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

    2018-05-07

Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and with classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, a 34% gain compared to random classification. We found that the original DRG, the coder, and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
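The stability benefit of a weakly informative prior is easy to see on near-separable data, where maximum-likelihood logistic coefficients diverge but the posterior mode stays finite. A minimal sketch (not the paper's model or data): MAP estimation with independent Gaussian priors, via Newton's method on the penalised log-likelihood.

```python
import numpy as np

def map_logistic(X, y, prior_sd=2.5, iters=50):
    """MAP estimate for logistic regression with independent N(0, prior_sd^2)
    priors on the coefficients, via Newton's method."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p) - beta / prior_sd ** 2       # penalised gradient
        W = p * (1 - p)
        H = -(X.T * W) @ X - np.eye(d) / prior_sd ** 2    # penalised Hessian
        beta = beta - np.linalg.solve(H, grad)            # Newton step
    return beta

# Small, perfectly separable data set: the MLE slope diverges to infinity,
# but the prior keeps the MAP estimate finite and stable.
X = np.column_stack([np.ones(8),
                     np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1.0])
beta = map_logistic(X, y)
print(beta)   # slope is large but finite, thanks to the prior
```

This is the same mechanism that stabilises the coefficient estimates in the audit models: the prior acts as a mild regulariser without materially constraining well-identified effects.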

  19. Assessment of parametric uncertainty for groundwater reactive transport modeling

    USGS Publications Warehouse

    Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

    2014-01-01

The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of the formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive Metropolis (DREAM(ZS)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. 
The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.

  20. Evaluating and improving the representation of heteroscedastic errors in hydrological models

    NASA Astrophysics Data System (ADS)

    McInerney, D. J.; Thyer, M. A.; Kavetski, D.; Kuczera, G. A.

    2013-12-01

Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic predictions. In particular, residual errors of hydrological models are often heteroscedastic, with large errors associated with high rainfall and runoff events. Recent studies have shown that using a weighted least squares (WLS) approach - where the magnitude of the residuals is assumed to be linearly proportional to the magnitude of the flow - captures some of this heteroscedasticity. In this study we explore a range of Bayesian approaches for improving the representation of heteroscedasticity in residual errors. We compare several improved formulations of the WLS approach, the well-known Box-Cox transformation and the more recent log-sinh transformation. Our results confirm that these approaches are able to stabilize the residual error variance, and that it is possible to improve the representation of heteroscedasticity compared with the linear WLS approach. We also find generally good performance of the Box-Cox and log-sinh transformations, although, as indicated in earlier publications, the Box-Cox transform sometimes produces unrealistically large prediction limits. Our work explores the trade-offs between these different uncertainty characterization approaches, investigates how their performance varies across diverse catchments and models, and recommends practical approaches suitable for large-scale applications.
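The linear WLS error model the study starts from is compact enough to write down directly: the residual standard deviation is a linear function of the simulated flow, sigma_t = a + b * sim_t. A minimal sketch with synthetic streamflow (the gamma-distributed flows and the parameter values are illustrative assumptions):

```python
import numpy as np

def wls_loglik(obs, sim, a, b):
    """Gaussian log-likelihood with heteroscedastic residual standard
    deviation sigma_t = a + b * sim_t (the linear WLS scheme); a and b
    are error-model parameters to be inferred alongside the model."""
    sigma = a + b * sim
    r = obs - sim
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * (r / sigma) ** 2)

rng = np.random.default_rng(5)
sim = rng.gamma(2.0, 5.0, 200)                    # synthetic streamflow
obs = sim + rng.normal(0, 0.1 + 0.2 * sim)        # errors grow with flow

# The heteroscedastic model fits these data far better than a constant-sigma one
print(wls_loglik(obs, sim, 0.1, 0.2), wls_loglik(obs, sim, 1.0, 0.0))
```

In a Bayesian calibration, a and b simply join the hydrological parameters in the posterior, which is how the study compares WLS variants against the Box-Cox and log-sinh transformation schemes.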

  1. The development of a combined mathematical model to forecast the incidence of hepatitis E in Shanghai, China.

    PubMed

    Ren, Hong; Li, Jian; Yuan, Zheng-An; Hu, Jia-Yu; Yu, Yan; Lu, Yi-Han

    2013-09-08

Sporadic hepatitis E has become an important public health concern in China. Accurate forecasting of the incidence of hepatitis E is needed to better plan future medical needs. Few mathematical models can be used because hepatitis E morbidity data have both linear and nonlinear patterns. We developed a combined mathematical model using an autoregressive integrated moving average model (ARIMA) and a back propagation neural network (BPNN) to forecast the incidence of hepatitis E. The morbidity data of hepatitis E in Shanghai from 2000 to 2012 were retrieved from the China Information System for Disease Control and Prevention. The ARIMA-BPNN combined model was trained with 144 months of morbidity data from January 2000 to December 2011, validated with 12 months of data from January 2012 to December 2012, and then employed to forecast hepatitis E incidence from January 2013 to December 2013 in Shanghai. Residual analysis, Root Mean Square Error (RMSE), normalized Bayesian Information Criterion (BIC), and stationary R-squared methods were used to compare the goodness-of-fit among ARIMA models. The Bayesian regularization back-propagation algorithm was used to train the network. The mean error rate (MER) was used to assess the validity of the combined model. A total of 7,489 hepatitis E cases were reported in Shanghai from 2000 to 2012. Goodness-of-fit (stationary R-squared = 0.531, BIC = -4.768, Ljung-Box Q statistic = 15.59, P = 0.482) and parameter estimates were used to determine the best-fitting model as ARIMA (0,1,1)×(0,1,1)12. Predicted morbidity values in 2012 from the best-fitting ARIMA model and actual morbidity data from 2000 to 2011 were used to further construct the combined model. The MER of the ARIMA model and the ARIMA-BPNN combined model were 0.250 and 0.176, respectively. The forecasted incidence of hepatitis E in 2013 was 0.095 to 0.372 per 100,000 population. There was a seasonal variation with a peak during January-March and a nadir during August-October. 
Time series analysis suggested a seasonal pattern of hepatitis E morbidity in Shanghai, China. An ARIMA-BPNN combined model was used to fit the linear and nonlinear patterns of time series data, and accurately forecast hepatitis E infections.
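The two-stage hybrid idea, a linear time-series model plus a nonlinear learner fitted to its residuals, can be sketched compactly. This is a toy, not the paper's ARIMA-BPNN: the seasonal series is synthetic, the linear stage is a plain least-squares AR(3), and a k-nearest-neighbour regressor stands in for the back-propagation neural network. The comparison is in-sample only.

```python
import numpy as np

rng = np.random.default_rng(11)

def lag_matrix(y, p):
    """Rows of p consecutive values, aligned with targets y[p:]."""
    return np.column_stack([y[i:len(y) - p + i] for i in range(p)])

# Synthetic monthly-style series: seasonal signal plus noise
t = np.arange(160)
y = 1.0 + 0.5 * np.sin(2 * np.pi * t / 12) ** 2 + rng.normal(0, 0.02, 160)

# Stage 1 (linear, standing in for ARIMA): least-squares AR(3) with intercept
p = 3
X = np.column_stack([lag_matrix(y, p), np.ones(len(y) - p)])
coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
linear_pred = X @ coef
resid = y[p:] - linear_pred

# Stage 2 (nonlinear, standing in for the BPNN): predict the linear
# model's residual from the lagged values with k-nearest neighbours
lags = lag_matrix(y, p)
def knn_resid(x, k=5):
    d = np.linalg.norm(lags - x, axis=1)
    return resid[np.argsort(d)[:k]].mean()

combined = linear_pred + np.array([knn_resid(r) for r in lags])
print(np.abs(y[p:] - combined).mean(), np.abs(resid).mean())  # hybrid vs linear-only
```

The hybrid's error is below the linear stage's because the second stage mops up structure the linear recurrence misses, which is the same division of labour as ARIMA plus BPNN.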

  2. Prospective evaluation of a Bayesian model to predict organizational change.

    PubMed

    Molfenter, Todd; Gustafson, Dave; Kilo, Chuck; Bhattacharya, Abhik; Olsson, Jesper

    2005-01-01

    This research examines a subjective Bayesian model's ability to predict organizational change outcomes and sustainability of those outcomes for project teams participating in a multi-organizational improvement collaborative.

  3. A unified Bayesian semiparametric approach to assess discrimination ability in survival analysis

    PubMed Central

    Zhao, Lili; Feng, Dai; Chen, Guoan; Taylor, Jeremy M.G.

    2015-01-01

The discriminatory ability of a marker for censored survival data is routinely assessed by the time-dependent ROC curve and the c-index. The time-dependent ROC curve evaluates the ability of a biomarker to predict whether a patient lives past a particular time t. The c-index measures the global concordance of the marker and the survival time regardless of the time point. We propose a Bayesian semiparametric approach to estimate these two measures. The proposed estimators are based on the conditional distribution of the survival time given the biomarker and the empirical biomarker distribution. The conditional distribution is estimated by a linear dependent Dirichlet process mixture model. The resulting ROC curve is smooth as it is estimated by a mixture of parametric functions. The proposed c-index estimator is shown to be more efficient than the commonly used Harrell's c-index since it uses all pairs of data rather than only informative pairs. The proposed estimators are evaluated through simulations and illustrated using a lung cancer dataset. PMID:26676324
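For reference, the "informative pairs" restriction of Harrell's c-index, the baseline the proposed estimator improves on, looks like this. A pair contributes only when the earlier time is an observed event; the cohort below is a tiny synthetic illustration.

```python
import numpy as np

def harrell_c(time, event, marker):
    """Harrell's c-index: among informative pairs (the earlier time is an
    observed event), the fraction where the subject with the higher marker
    value fails first; marker ties count 1/2."""
    conc = usable = 0.0
    for i in range(len(time)):
        for j in range(len(time)):
            if event[i] == 1 and time[i] < time[j]:   # informative pair only
                usable += 1
                if marker[i] > marker[j]:
                    conc += 1.0
                elif marker[i] == marker[j]:
                    conc += 0.5
    return conc / usable

# Tiny synthetic cohort (event == 0 marks censoring)
time   = np.array([2.0, 4.0, 5.0, 7.0, 9.0, 10.0])
event  = np.array([1,   1,   0,   1,   0,   1])
marker = np.array([0.9, 0.5, 0.6, 0.7, 0.2, 0.1])
print(harrell_c(time, event, marker))   # 9 of 11 usable pairs concordant
```

Pairs whose earlier time is censored are discarded entirely, which is the information loss the paper's model-based estimator avoids by using the fitted conditional survival distribution for all pairs.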

  4. The ambient dose equivalent at flight altitudes: a fit to a large set of data using a Bayesian approach.

    PubMed

    Wissmann, F; Reginatto, M; Möller, T

    2010-09-01

    The problem of finding a simple, generally applicable description of worldwide measured ambient dose equivalent rates at aviation altitudes between 8 and 12 km is difficult to solve due to the large variety of functional forms and parametrisations that are possible. We present an approach that uses Bayesian statistics and Monte Carlo methods to fit mathematical models to a large set of data and to compare the different models. About 2500 data points measured in the periods 1997-1999 and 2003-2006 were used. Since the data cover wide ranges of barometric altitude, vertical cut-off rigidity and phases in the solar cycle 23, we developed functions which depend on these three variables. Whereas the dependence on the vertical cut-off rigidity is described by an exponential, the dependences on barometric altitude and solar activity may be approximated by linear functions in the ranges under consideration. Therefore, a simple Taylor expansion was used to define different models and to investigate the relevance of the different expansion coefficients. With the method presented here, it is possible to obtain probability distributions for each expansion coefficient and thus to extract reliable uncertainties even for the dose rate evaluated. The resulting function agrees well with new measurements made at fixed geographic positions and during long haul flights covering a wide range of latitudes.

  5. Bayesian truncation errors in chiral effective field theory: model checking and accounting for correlations

    NASA Astrophysics Data System (ADS)

    Melendez, Jordan; Wesolowski, Sarah; Furnstahl, Dick

    2017-09-01

    Chiral effective field theory (EFT) predictions are necessarily truncated at some order in the EFT expansion, which induces an error that must be quantified for robust statistical comparisons to experiment. A Bayesian model yields posterior probability distribution functions for these errors based on expectations of naturalness encoded in Bayesian priors and the observed order-by-order convergence pattern of the EFT. As a general example of a statistical approach to truncation errors, the model was applied to chiral EFT for neutron-proton scattering using various semi-local potentials of Epelbaum, Krebs, and Meißner (EKM). Here we discuss how our model can learn correlation information from the data and how to perform Bayesian model checking to validate that the EFT is working as advertised. Supported in part by NSF PHY-1614460 and DOE NUCLEI SciDAC DE-SC0008533.

  6. Reliability analysis using an exponential power model with bathtub-shaped failure rate function: a Bayes study.

    PubMed

    Shehla, Romana; Khan, Athar Ali

    2016-01-01

    Models with bathtub-shaped hazard function have been widely accepted in the field of reliability and medicine and are particularly useful in reliability related decision making and cost analysis. In this paper, the exponential power model capable of assuming increasing as well as bathtub-shape, is studied. This article makes a Bayesian study of the same model and simultaneously shows how posterior simulations based on Markov chain Monte Carlo algorithms can be straightforward and routine in R. The study is carried out for complete as well as censored data, under the assumption of weakly-informative priors for the parameters. In addition to this, inference interest focuses on the posterior distribution of non-linear functions of the parameters. Also, the model has been extended to include continuous explanatory variables and R-codes are well illustrated. Two real data sets are considered for illustrative purposes.
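    For reference, one common parametrisation of the exponential power model (the Smith-Bain form, used here as an assumption since the record does not spell out the parametrisation) has hazard h(t) = alpha * lam^alpha * t^(alpha-1) * exp((lam*t)^alpha), which is bathtub-shaped for alpha < 1 and increasing for alpha >= 1:

```python
import numpy as np

def exp_power_hazard(t, alpha, lam):
    """Hazard function of the exponential power model in the Smith-Bain
    parametrisation (an illustrative assumption):
        h(t) = alpha * lam**alpha * t**(alpha - 1) * exp((lam * t)**alpha)
    For alpha < 1 the decreasing power term dominates early and the
    exponential term dominates late, giving the bathtub shape; for
    alpha >= 1 the hazard is monotonically increasing."""
    t = np.asarray(t, dtype=float)
    return alpha * lam ** alpha * t ** (alpha - 1) * np.exp((lam * t) ** alpha)
```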

  7. A FAST BAYESIAN METHOD FOR UPDATING AND FORECASTING HOURLY OZONE LEVELS

    EPA Science Inventory

    A Bayesian hierarchical space-time model is proposed by combining information from real-time ambient AIRNow air monitoring data, and output from a computer simulation model known as the Community Multi-scale Air Quality (Eta-CMAQ) forecast model. A model validation analysis shows...

  8. Bayesian methods for characterizing unknown parameters of material models

    DOE PAGES

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    2016-02-04

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). Finally, the Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.
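    The basic update described above, a posterior over an unknown parameter obtained from a prior and measurements of an observable quantity, can be sketched generically on a parameter grid. Everything here (the one-dimensional grid, the function names) is an illustrative assumption, not the paper's SROM machinery:

```python
import numpy as np

def grid_posterior(theta_grid, log_prior, data, log_like):
    """Grid-based Bayesian update: normalized posterior over a scalar
    model parameter theta, combining a log-prior with the log-likelihood
    of the observed data at each grid point."""
    logp = np.array([log_prior(t) + log_like(data, t) for t in theta_grid])
    logp -= logp.max()          # stabilise before exponentiating
    p = np.exp(logp)
    return p / p.sum()
```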

  9. Bayesian methods for characterizing unknown parameters of material models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less

  10. Bayesian methods in reliability

    NASA Astrophysics Data System (ADS)

    Sander, P.; Badoux, R.

    1991-11-01

    The present proceedings from a course on Bayesian methods in reliability encompass Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.

  11. Bayesian accounts of covert selective attention: A tutorial review.

    PubMed

    Vincent, Benjamin T

    2015-05-01

    Decision making and optimal observer models offer an important theoretical approach to the study of covert selective attention. While their probabilistic formulation allows quantitative comparison to human performance, the models can be complex and their insights are not always immediately apparent. Part 1 establishes the theoretical appeal of the Bayesian approach, and introduces the way in which probabilistic approaches can be applied to covert search paradigms. Part 2 presents novel formulations of Bayesian models of 4 important covert attention paradigms, illustrating optimal observer predictions over a range of experimental manipulations. Graphical model notation is used to present models in an accessible way and Supplementary Code is provided to help bridge the gap between model theory and practical implementation. Part 3 reviews a large body of empirical and modelling evidence showing that many experimental phenomena in the domain of covert selective attention are a set of by-products. These effects emerge as the result of observers conducting Bayesian inference with noisy sensory observations, prior expectations, and knowledge of the generative structure of the stimulus environment.
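    A minimal Bayesian optimal observer for an N-location covert search task can make the approach concrete. The sketch below is an illustrative assumption in the spirit of the tutorial, not one of its four models: given one noisy observation per location, it computes the posterior probability that each location contains the target under a uniform prior.

```python
import numpy as np

def covert_search_posterior(x, mu_target, sigma):
    """Posterior over target location for a single-target search display.
    Observation model: x[i] ~ N(mu_target[i], sigma[i]) if the target is
    at location i, and x[i] ~ N(0, sigma[i]) otherwise.  With a uniform
    prior, the posterior is proportional to the per-location likelihood
    ratio of 'target here' vs 'noise here'."""
    x, mu, s = (np.asarray(a, dtype=float) for a in (x, mu_target, sigma))
    # log-likelihood ratio per location: log N(x; mu, s) - log N(x; 0, s)
    llr = (x * mu - 0.5 * mu ** 2) / s ** 2
    p = np.exp(llr - llr.max())
    return p / p.sum()
```

    Locations with less reliable evidence (larger `sigma`) are automatically down-weighted, which is the kind of by-product effect the review discusses.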

  12. Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters

    PubMed Central

    Gilet, Estelle; Diard, Julien; Bessière, Pierre

    2011-01-01

    In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Being a model of both perception and action processes, the purpose of this model is to study the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments. PMID:21674043

  13. New Insights into Handling Missing Values in Environmental Epidemiological Studies

    PubMed Central

    Roda, Célina; Nicolis, Ioannis; Momas, Isabelle; Guihenneuc, Chantal

    2014-01-01

    Missing data are unavoidable in environmental epidemiologic surveys. The aim of this study was to compare methods for handling large amounts of missing values: omission of missing values, single and multiple imputations (through linear regression or partial least squares regression), and a fully Bayesian approach. These methods were applied to the PARIS birth cohort, where indoor domestic pollutant measurements were performed in a random sample of babies' dwellings. A simulation study was conducted to assess the performances of the different approaches with a high proportion of missing values (from 50% to 95%). Different simulation scenarios were carried out, controlling the true value of the association (odds ratio of 1.0, 1.2, and 1.4), and varying the health outcome prevalence. When a large amount of data is missing, omitting these missing data reduced statistical power and inflated standard errors, which affected the significance of the association. Single imputation underestimated the variability and considerably increased the risk of type I error. All approaches were conservative, except the Bayesian joint model. In the case of a common health outcome, the fully Bayesian approach is the most efficient (low root mean square error, reasonable type I error, and high statistical power). Nevertheless, for a less prevalent event, the type I error is increased and the statistical power is reduced. The estimated posterior distribution of the OR is useful to refine the conclusion. No approach is uniformly best, but when the usual approaches (e.g. single imputation) are not sufficient and large amounts of data are missing, jointly modelling the missingness process and the health association is more efficient. PMID:25226278

  14. Efficient reconstruction method for ground layer adaptive optics with mixed natural and laser guide stars.

    PubMed

    Wagner, Roland; Helin, Tapio; Obereder, Andreas; Ramlau, Ronny

    2016-02-20

    The imaging quality of modern ground-based telescopes such as the planned European Extremely Large Telescope is affected by atmospheric turbulence. In consequence, they heavily depend on stable and high-performance adaptive optics (AO) systems. Using measurements of incoming light from guide stars, an AO system compensates for the effects of turbulence by adjusting so-called deformable mirror(s) (DMs) in real time. In this paper, we introduce a novel reconstruction method for ground layer adaptive optics. In the literature, a common approach to this problem is to use Bayesian inference in order to model the specific noise structure appearing due to spot elongation. This approach leads to large coupled systems with high computational effort. Recently, fast solvers of linear order, i.e., with computational complexity O(n), where n is the number of DM actuators, have emerged. However, the quality of such methods typically degrades in low flux conditions. Our key contribution is to achieve the high quality of the standard Bayesian approach while at the same time maintaining the linear order speed of the recent solvers. Our method is based on performing a separate preprocessing step before applying the cumulative reconstructor (CuReD). The efficiency and performance of the new reconstructor are demonstrated using the OCTOPUS, the official end-to-end simulation environment of the ESO for extremely large telescopes. For more specific simulations we also use the MOST toolbox.

  15. Mapping brucellosis increases relative to elk density using hierarchical Bayesian models

    USGS Publications Warehouse

    Cross, Paul C.; Heisey, Dennis M.; Scurlock, Brandon M.; Edwards, William H.; Brennan, Angela; Ebinger, Michael R.

    2010-01-01

    The relationship between host density and parasite transmission is central to the effectiveness of many disease management strategies. Few studies, however, have empirically estimated this relationship, particularly in large mammals. We applied hierarchical Bayesian methods to a 19-year dataset of over 6400 brucellosis tests of adult female elk (Cervus elaphus) in northwestern Wyoming. Management captures that occurred from January to March were over two times more likely to be seropositive than hunted elk that were killed in September to December, while accounting for site and year effects. Areas with supplemental feeding grounds for elk had higher seroprevalence in 1991 than other regions, but by 2009 many areas distant from the feeding grounds were of comparable seroprevalence. The increases in brucellosis seroprevalence were correlated with elk densities at the elk management unit, or hunt area, scale (mean 2070 km2; range = [95–10237]). The data, however, could not differentiate between linear and non-linear effects of host density. Therefore, control efforts that focus on reducing elk densities at a broad spatial scale were only weakly supported. Additional research on how a few, large groups within a region may be driving disease dynamics is needed for more targeted and effective management interventions. Brucellosis appears to be expanding its range into new regions and elk populations, which is likely to further complicate the United States brucellosis eradication program. This study is an example of how the dynamics of host populations can affect their ability to serve as disease reservoirs.

  16. A systematic review of Bayesian articles in psychology: The last 25 years.

    PubMed

    van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah

    2017-06-01

    Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends.

  17. A Gibbs sampler for Bayesian analysis of site-occupancy data

    USGS Publications Warehouse

    Dorazio, Robert M.; Rodriguez, Daniel Taylor

    2012-01-01

    1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
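    The probit-regression structure described above is what makes a clean Gibbs sampler possible, via Albert and Chib's data-augmentation scheme. The sketch below shows that building block for plain probit regression only (the occupancy and detection layers of the site-occupancy model are omitted); the flat prior on the coefficients and all names are assumptions for the sketch, written in Python rather than the paper's R.

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=2000, rng=None):
    """Albert-Chib Gibbs sampler for probit regression with a flat prior
    on beta.  Each latent z_i ~ N(x_i' beta, 1), truncated to be positive
    when y_i = 1 and negative when y_i = 0; beta | z is then Gaussian."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = np.zeros(d)
    draws = np.empty((n_iter, d))
    for it in range(n_iter):
        mu = X @ beta
        # standardized truncation bounds for (z - mu)
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lo, hi, random_state=rng)
        # conjugate Gaussian draw for beta given the latent z
        beta = rng.multivariate_normal(XtX_inv @ X.T @ z, XtX_inv)
        draws[it] = beta
    return draws
```

    Because both full conditionals are standard distributions, no tuning or general-purpose MCMC software is needed, which is the efficiency argument the abstract makes.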

  18. Stochastic determination of matrix determinants

    NASA Astrophysics Data System (ADS)

    Dorn, Sebastian; Enßlin, Torsten A.

    2015-07-01

    Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations—matrices—acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
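    The idea of estimating spectral quantities from matrix-vector products alone can be illustrated with Hutchinson-style probing of log det(A) = tr(log A). This is a simpler stand-in for the paper's integral-representation estimator, not its method; in a genuinely matrix-free setting the product log(A) v would itself have to be approximated (e.g. by Lanczos quadrature), which is not shown here.

```python
import numpy as np

def probe_logdet(apply_logA, dim, n_probes=100, rng=None):
    """Stochastic probe of log det(A) = tr(log A) for positive-definite A,
    using Hutchinson's trace estimator with Rademacher probe vectors.
    `apply_logA` is any routine returning log(A) @ v, so only
    matrix-vector products are required."""
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=dim)
        est += v @ apply_logA(v)
    return est / n_probes
```

    For a small symmetric positive-definite matrix one can check the estimate against `np.linalg.slogdet`, e.g. with `apply_logA = lambda v: scipy.linalg.logm(A) @ v`.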

  19. Sparse Coding and Counting for Robust Visual Tracking

    PubMed Central

    Liu, Risheng; Wang, Jing; Shang, Xiaoke; Wang, Yiyang; Su, Zhixun; Cai, Yu

    2016-01-01

    In this paper, we propose a novel sparse coding and counting method under a Bayesian framework for visual tracking. In contrast to existing methods, the proposed method employs the combination of the L0 and L1 norms to regularize the linear coefficients of an incrementally updated linear basis. The sparsity constraint enables the tracker to effectively handle difficult challenges, such as occlusion or image corruption. To achieve real-time processing, we propose a fast and efficient numerical algorithm for solving the proposed model. Although it is an NP-hard problem, the proposed accelerated proximal gradient (APG) approach is guaranteed to converge to a solution quickly. In addition, we provide a closed-form solution for the combined L0- and L1-regularized representation to obtain better sparsity. Experimental results on challenging video sequences demonstrate that the proposed method achieves state-of-the-art results in both accuracy and speed. PMID:27992474
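    The accelerated proximal gradient machinery mentioned above can be illustrated on the L1 part of the objective alone. The sketch below is standard FISTA for L1-regularised least squares; the paper's L0 term would add a hard-thresholding step, which is omitted, and all names here are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def fista_l1(A, b, lam, n_iter=500):
    """FISTA (accelerated proximal gradient) for
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
    Each iteration takes a gradient step on the smooth term, then applies
    the soft-thresholding proximal operator of the l1 norm, with Nesterov
    momentum for the accelerated O(1/k^2) rate."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        g = z - A.T @ (A @ z - b) / L      # gradient step
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + (t - 1) / t_new * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x
```

    The soft-thresholding step is the closed-form proximal map of the L1 penalty, which is what makes each APG iteration cheap enough for real-time use.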

  20. Stochastic determination of matrix determinants.

    PubMed

    Dorn, Sebastian; Ensslin, Torsten A

    2015-07-01

    Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations-matrices-acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
