Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum
2011-01-01
Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…
Bayesian naturalness, simplicity, and testability applied to the B ‑ L MSSM GUT
NASA Astrophysics Data System (ADS)
Fundira, Panashe; Purves, Austin
2018-04-01
Recent years have seen increased use of Bayesian model comparison to quantify notions such as naturalness, simplicity, and testability, especially in the area of supersymmetric model building. After demonstrating that Bayesian model comparison can resolve a paradox that has been raised in the literature concerning the naturalness of the proton mass, we apply Bayesian model comparison to GUTs, an area to which it has not been applied before. We find that the GUTs are substantially favored over the nonunifying puzzle model. Of the GUTs we consider, the B ‑ L MSSM GUT is the most favored, but the MSSM GUT is almost equally favored.
NASA Astrophysics Data System (ADS)
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, which incorporates different sources of information into a single analysis through Bayesian theorem. However, none of these applications can well treat the uncertainty in extreme flows of hydrological models' simulations. This study proposes a Bayesian modularization method approach in uncertainty assessment of conceptual hydrological models by considering the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by a new Bayesian modularization method approach and traditional Bayesian models using the Metropolis Hasting (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with traditional Bayesian: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization method are more accurate with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of the application and computational efficiency. The study thus introduces a new approach for reducing the extreme flow's effect on the discharge uncertainty assessment of hydrological models via Bayesian. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan
2018-01-01
We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene-Test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Last but not least, throughout the paper, we introduce ABrox using the accompanied graphical user interface.
On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)
ERIC Educational Resources Information Center
Wills, Andy J.; Pothos, Emmanuel M.
2012-01-01
Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…
NASA Astrophysics Data System (ADS)
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.
ERIC Educational Resources Information Center
Wu, Haiyan
2013-01-01
General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall
2016-01-01
Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model
NASA Astrophysics Data System (ADS)
Al Sobhi, Mashail M.
2015-02-01
Bayesian estimation for the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Also, Bayesian prediction bounds for future DGOS from exponentiated Weibull model are obtained. The symmetric and asymmetric loss functions are considered for Bayesian computations. The Markov chain Monte Carlo (MCMC) methods are used for computing the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.
Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...
An introduction to using Bayesian linear regression with clinical data.
Baldwin, Scott A; Larson, Michael J
2017-11-01
Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Harrison, Jay M; Breeze, Matthew L; Harrigan, George G
2011-08-01
Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
ERIC Educational Resources Information Center
Wang, Lijuan; McArdle, John J.
2008-01-01
The main purpose of this research is to evaluate the performance of a Bayesian approach for estimating unknown change points using Monte Carlo simulations. The univariate and bivariate unknown change point mixed models were presented and the basic idea of the Bayesian approach for estimating the models was discussed. The performance of Bayesian…
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k , from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.
Daunizeau, J; Friston, K J; Kiebel, S J
2009-11-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
Rodgers, Joseph Lee
2016-01-01
The Bayesian-frequentist debate typically portrays these statistical perspectives as opposing views. However, both Bayesian and frequentist statisticians have expanded their epistemological basis away from a singular focus on the null hypothesis, to a broader perspective involving the development and comparison of competing statistical/mathematical models. For frequentists, statistical developments such as structural equation modeling and multilevel modeling have facilitated this transition. For Bayesians, the Bayes factor has facilitated this transition. The Bayes factor is treated in articles within this issue of Multivariate Behavioral Research. The current presentation provides brief commentary on those articles and more extended discussion of the transition toward a modern modeling epistemology. In certain respects, Bayesians and frequentists share common goals.
Efficient Posterior Probability Mapping Using Savage-Dickey Ratios
Penny, William D.; Ridgway, Gerard R.
2013-01-01
Statistical Parametric Mapping (SPM) is the dominant paradigm for mass-univariate analysis of neuroimaging data. More recently, a Bayesian approach termed Posterior Probability Mapping (PPM) has been proposed as an alternative. PPM offers two advantages: (i) inferences can be made about effect size thus lending a precise physiological meaning to activated regions, (ii) regions can be declared inactive. This latter facility is most parsimoniously provided by PPMs based on Bayesian model comparisons. To date these comparisons have been implemented by an Independent Model Optimization (IMO) procedure which separately fits null and alternative models. This paper proposes a more computationally efficient procedure based on Savage-Dickey approximations to the Bayes factor, and Taylor-series approximations to the voxel-wise posterior covariance matrices. Simulations show the accuracy of this Savage-Dickey-Taylor (SDT) method to be comparable to that of IMO. Results on fMRI data show excellent agreement between SDT and IMO for second-level models, and reasonable agreement for first-level models. This Savage-Dickey test is a Bayesian analogue of the classical SPM-F and allows users to implement model comparison in a truly interactive manner. PMID:23533640
Bayesian Network Meta-Analysis for Unordered Categorical Outcomes with Incomplete Data
ERIC Educational Resources Information Center
Schmid, Christopher H.; Trikalinos, Thomas A.; Olkin, Ingram
2014-01-01
We develop a Bayesian multinomial network meta-analysis model for unordered (nominal) categorical outcomes that allows for partially observed data in which exact event counts may not be known for each category. This model properly accounts for correlations of counts in mutually exclusive categories and enables proper comparison and ranking of…
Computational statistics using the Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics
Dybowski, Richard; McKinley, Trevelyan J.; Mastroeni, Pietro; Restif, Olivier
2013-01-01
Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered. PMID:24376528
Calibrating Bayesian Network Representations of Social-Behavioral Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitney, Paul D.; Walsh, Stephen J.
2010-04-08
While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research – and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empiricalmore » comparison with data taken from the Minorities at Risk Organizational Behaviors database.« less
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Model Comparison of Nonlinear Structural Equation Models with Fixed Covariates.
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan
2003-01-01
Proposed a new nonlinear structural equation model with fixed covariates to deal with some complicated substantive theory and developed a Bayesian path sampling procedure for model comparison. Illustrated the approach with an illustrative example using data from an international study. (SLD)
Variational learning and bits-back coding: an information-theoretic view to Bayesian learning.
Honkela, Antti; Valpola, Harri
2004-07-01
The bits-back coding first introduced by Wallace in 1990 and later by Hinton and van Camp in 1993 provides an interesting link between Bayesian learning and information-theoretic minimum-description-length (MDL) learning approaches. The bits-back coding allows interpreting the cost function used in the variational Bayesian method called ensemble learning as a code length in addition to the Bayesian view of misfit of the posterior approximation and a lower bound of model evidence. Combining these two viewpoints provides interesting insights to the learning process and the functions of different parts of the model. In this paper, the problem of variational Bayesian learning of hierarchical latent variable models is used to demonstrate the benefits of the two views. The code-length interpretation provides new views to many parts of the problem such as model comparison and pruning and helps explain many phenomena occurring in learning.
NASA Astrophysics Data System (ADS)
Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew
2007-04-01
One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. In previous research, the theoretical basis and benefits of the hybrid approach have been developed. However, lacking is a concrete experimental comparison of the hybrid framework with traditional fusion methods, to demonstrate and quantify this benefit. The goal of this research, therefore, is to provide a statistical analysis on the comparison of the accuracy and performance of hybrid network theory, with pure Bayesian and Fuzzy systems and an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo Simulation, in comparison to situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed, to quantify the benefit of hybrid inference to other fusion tools.
Bayesian data analysis in population ecology: motivations, methods, and benefits
Dorazio, Robert
2016-01-01
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
Bayesian Computation for Log-Gaussian Cox Processes: A Comparative Analysis of Methods
Teng, Ming; Nathoo, Farouk S.; Johnson, Timothy D.
2017-01-01
The Log-Gaussian Cox Process is a commonly used model for the analysis of spatial point pattern data. Fitting this model is difficult because of its doubly-stochastic property, i.e., it is an hierarchical combination of a Poisson process at the first level and a Gaussian Process at the second level. Various methods have been proposed to estimate such a process, including traditional likelihood-based approaches as well as Bayesian methods. We focus here on Bayesian methods and several approaches that have been considered for model fitting within this framework, including Hamiltonian Monte Carlo, the Integrated nested Laplace approximation, and Variational Bayes. We consider these approaches and make comparisons with respect to statistical and computational efficiency. These comparisons are made through several simulation studies as well as through two applications, the first examining ecological data and the second involving neuroimaging data. PMID:29200537
Steingroever, Helen; Pachur, Thorsten; Šmíra, Martin; Lee, Michael D
2018-06-01
The Iowa Gambling Task (IGT) is one of the most popular experimental paradigms for comparing complex decision-making across groups. Most commonly, IGT behavior is analyzed using frequentist tests to compare performance across groups, and to compare inferred parameters of cognitive models developed for the IGT. Here, we present a Bayesian alternative based on Bayesian repeated-measures ANOVA for comparing performance, and a suite of three complementary model-based methods for assessing the cognitive processes underlying IGT performance. The three model-based methods involve Bayesian hierarchical parameter estimation, Bayes factor model comparison, and Bayesian latent-mixture modeling. We illustrate these Bayesian methods by applying them to test the extent to which differences in intuitive versus deliberate decision style are associated with differences in IGT performance. The results show that intuitive and deliberate decision-makers behave similarly on the IGT, and the modeling analyses consistently suggest that both groups of decision-makers rely on similar cognitive processes. Our results challenge the notion that individual differences in intuitive and deliberate decision styles have a broad impact on decision-making. They also highlight the advantages of Bayesian methods, especially their ability to quantify evidence in favor of the null hypothesis, and that they allow model-based analyses to incorporate hierarchical and latent-mixture structures.
Estimating Tree Height-Diameter Models with the Bayesian Method
Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the “best” model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733
Estimating tree height-diameter models with the Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2.
Bayesian model checking: A comparison of tests
NASA Astrophysics Data System (ADS)
Lucy, L. B.
2018-06-01
Two procedures for checking Bayesian models are compared using a simple test problem based on the local Hubble expansion. Over four orders of magnitude, p-values derived from a global goodness-of-fit criterion for posterior probability density functions agree closely with posterior predictive p-values. The former can therefore serve as an effective proxy for the difficult-to-calculate posterior predictive p-values.
Spertus, Jacob V; Normand, Sharon-Lise T
2018-04-23
High-dimensional data provide many potential confounders that may bolster the plausibility of the ignorability assumption in causal inference problems. Propensity score methods are powerful causal inference tools, which are popular in health care research and are particularly useful for high-dimensional data. Recent interest has surrounded a Bayesian treatment of propensity scores in order to flexibly model the treatment assignment mechanism and summarize posterior quantities while incorporating variance from the treatment model. We discuss methods for Bayesian propensity score analysis of binary treatments, focusing on modern methods for high-dimensional Bayesian regression and the propagation of uncertainty. We introduce a novel and simple estimator for the average treatment effect that capitalizes on conjugacy of the beta and binomial distributions. Through simulations, we show the utility of horseshoe priors and Bayesian additive regression trees paired with our new estimator, while demonstrating the importance of including variance from the treatment regression model. An application to cardiac stent data with almost 500 confounders and 9000 patients illustrates approaches and facilitates comparison with existing alternatives. As measured by a falsifiability endpoint, we improved confounder adjustment compared with past observational research of the same problem. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls that have plagued previous theoretical movements.
Sequential Inverse Problems Bayesian Principles and the Logistic Map Example
NASA Astrophysics Data System (ADS)
Duan, Lian; Farmer, Chris L.; Moroz, Irene M.
2010-09-01
Bayesian statistics provides a general framework for solving inverse problems, but is not without interpretation and implementation problems. This paper discusses difficulties arising from the fact that forward models are always in error to some extent. Using a simple example based on the one-dimensional logistic map, we argue that, when implementation problems are minimal, the Bayesian framework is quite adequate. In this paper the Bayesian Filter is shown to be able to recover excellent state estimates in the perfect model scenario (PMS) and to distinguish the PMS from the imperfect model scenario (IMS). Through a quantitative comparison of the way in which the observations are assimilated in both the PMS and the IMS scenarios, we suggest that one can, sometimes, measure the degree of imperfection.
NASA Astrophysics Data System (ADS)
Melendez, Jordan; Wesolowski, Sarah; Furnstahl, Dick
2017-09-01
Chiral effective field theory (EFT) predictions are necessarily truncated at some order in the EFT expansion, which induces an error that must be quantified for robust statistical comparisons to experiment. A Bayesian model yields posterior probability distribution functions for these errors based on expectations of naturalness encoded in Bayesian priors and the observed order-by-order convergence pattern of the EFT. As a general example of a statistical approach to truncation errors, the model was applied to chiral EFT for neutron-proton scattering using various semi-local potentials of Epelbaum, Krebs, and Meißner (EKM). Here we discuss how our model can learn correlation information from the data and how to perform Bayesian model checking to validate that the EFT is working as advertised. Supported in part by NSF PHY-1614460 and DOE NUCLEI SciDAC DE-SC0008533.
NASA Astrophysics Data System (ADS)
Gilet, Estelle; Diard, Julien; Palluel-Germain, Richard; Bessière, Pierre
2011-03-01
This paper is about modeling perception-action loops and, more precisely, the study of the influence of motor knowledge during perception tasks. We use the Bayesian Action-Perception (BAP) model, which deals with the sensorimotor loop involved in reading and writing cursive isolated letters and includes an internal simulation of movement loop. By using this probabilistic model we simulate letter recognition, both with and without internal motor simulation. Comparison of their performance yields an experimental prediction, which we set forth.
Integrated Data Analysis for Fusion: A Bayesian Tutorial for Fusion Diagnosticians
NASA Astrophysics Data System (ADS)
Dinklage, Andreas; Dreier, Heiko; Fischer, Rainer; Gori, Silvio; Preuss, Roland; Toussaint, Udo von
2008-03-01
Integrated Data Analysis (IDA) offers a unified way of combining information relevant to fusion experiments. Thereby, IDA meets with typical issues arising in fusion data analysis. In IDA, all information is consistently formulated as probability density functions quantifying uncertainties in the analysis within the Bayesian probability theory. For a single diagnostic, IDA allows the identification of faulty measurements and improvements in the setup. For a set of diagnostics, IDA gives joint error distributions allowing the comparison and integration of different diagnostics results. Validation of physics models can be performed by model comparison techniques. Typical data analysis applications benefit from IDA capabilities of nonlinear error propagation, the inclusion of systematic effects and the comparison of different physics models. Applications range from outlier detection, background discrimination, model assessment and design of diagnostics. In order to cope with next step fusion device requirements, appropriate techniques are explored for fast analysis applications.
Bayesian accounts of covert selective attention: A tutorial review.
Vincent, Benjamin T
2015-05-01
Decision making and optimal observer models offer an important theoretical approach to the study of covert selective attention. While their probabilistic formulation allows quantitative comparison to human performance, the models can be complex and their insights are not always immediately apparent. Part 1 establishes the theoretical appeal of the Bayesian approach, and introduces the way in which probabilistic approaches can be applied to covert search paradigms. Part 2 presents novel formulations of Bayesian models of 4 important covert attention paradigms, illustrating optimal observer predictions over a range of experimental manipulations. Graphical model notation is used to present models in an accessible way and Supplementary Code is provided to help bridge the gap between model theory and practical implementation. Part 3 reviews a large body of empirical and modelling evidence showing that many experimental phenomena in the domain of covert selective attention are a set of by-products. These effects emerge as the result of observers conducting Bayesian inference with noisy sensory observations, prior expectations, and knowledge of the generative structure of the stimulus environment.
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amiable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Impact assessment of extreme storm events using a Bayesian network
den Heijer, C.(Kees); Knipping, Dirk T.J.A.; Plant, Nathaniel G.; van Thiel de Vries, Jaap S. M.; Baart, Fedor; van Gelder, Pieter H. A. J. M.
2012-01-01
This paper describes an investigation on the usefulness of Bayesian Networks in the safety assessment of dune coasts. A network has been created that predicts the erosion volume based on hydraulic boundary conditions and a number of cross-shore profile indicators. Field measurement data along a large part of the Dutch coast has been used to train the network. Corresponding storm impact on the dunes was calculated with an empirical dune erosion model named duros+. Comparison between the Bayesian Network predictions and the original duros+ results, here considered as observations, results in a skill up to 0.88, provided that the training data covers the range of predictions. Hence, the predictions from a deterministic model (duros+) can be captured in a probabilistic model (Bayesian Network) such that both the process knowledge and uncertainties can be included in impact and vulnerability assessments.
NASA Astrophysics Data System (ADS)
Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.
2017-12-01
The basic model for the recognition of natural and anthropogenic objects using their spectral and textural features is described in the problem of hyperspectral air-borne and space-borne imagery processing. The model is based on improvements of the Bayesian classifier that is a computational procedure of statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on the basis of empirical orthogonal functions. Application examples are shown of various modifications of the Bayesian classifier and Support Vector Machine method. Examples are provided of comparing these classifiers and a metrical classifier that operates on finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the " K-weighted neighbors" method that is close to the nonparametric Bayesian classifier.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dirian, Yves; Foffa, Stefano; Kunz, Martin
We present a comprehensive and updated comparison with cosmological observations of two non-local modifications of gravity previously introduced by our group, the so called RR and RT models. We implement the background evolution and the cosmological perturbations of the models in a modified Boltzmann code, using CLASS. We then test the non-local models against the Planck 2015 TT, TE, EE and Cosmic Microwave Background (CMB) lensing data, isotropic and anisotropic Baryonic Acoustic Oscillations (BAO) data, JLA supernovae, H {sub 0} measurements and growth rate data, and we perform Bayesian parameter estimation. We then compare the RR, RT and ΛCDM models,more » using the Savage-Dickey method. We find that the RT model and ΛCDM perform equally well, while the performance of the RR model with respect to ΛCDM depends on whether or not we include a prior on H {sub 0} based on local measurements.« less
NASA Astrophysics Data System (ADS)
Abiriand Bhekisipho Twala, Olufunminiyi
2017-08-01
In this paper, a multilayer feedforward neural network with Bayesian regularization constitutive model is developed for alloy 316L during high strain rate and high temperature plastic deformation. The input variables are strain rate, temperature and strain while the output value is the flow stress of the material. The results show that the use of Bayesian regularized technique reduces the potential of overfitting and overtraining. The prediction quality of the model is thereby improved. The model predictions are in good agreement with experimental measurements. The measurement data used for the network training and model comparison were taken from relevant literature. The developed model is robust as it can be generalized to deformation conditions slightly below or above the training dataset.
Zonta, Zivko J; Flotats, Xavier; Magrí, Albert
2014-08-01
The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
Comparisons of Means Using Exploratory and Confirmatory Approaches
ERIC Educational Resources Information Center
Kuiper, Rebecca M.; Hoijtink, Herbert
2010-01-01
This article discusses comparisons of means using exploratory and confirmatory approaches. Three methods are discussed: hypothesis testing, model selection based on information criteria, and Bayesian model selection. Throughout the article, an example is used to illustrate and evaluate the two approaches and the three methods. We demonstrate that…
Posterior Predictive Bayesian Phylogenetic Model Selection
Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn
2014-01-01
We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction
Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.
2010-01-01
We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer’s disease (AD) classification to make accurate prediction and identify critical imaging markers relevant to AD at the same time. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method, and provides sparse models that is easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. Comparative study with support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. Additional comparison with surface-based general linear model (GLM) analysis shows that regions with strongest signals are identified by both GLM and ARD/PARD. While GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures. PMID:20879451
A Gibbs sampler for Bayesian analysis of site-occupancy data
Dorazio, Robert M.; Rodriguez, Daniel Taylor
2012-01-01
1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
Adaptive selection and validation of models of complex systems in the presence of uncertainty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farrell-Maupin, Kathryn; Oden, J. T.
This study describes versions of OPAL, the Occam-Plausibility Algorithm in which the use of Bayesian model plausibilities is replaced with information theoretic methods, such as the Akaike Information Criterion and the Bayes Information Criterion. Applications to complex systems of coarse-grained molecular models approximating atomistic models of polyethylene materials are described. All of these model selection methods take into account uncertainties in the model, the observational data, the model parameters, and the predicted quantities of interest. A comparison of the models chosen by Bayesian model selection criteria and those chosen by the information-theoretic criteria is given.
Adaptive selection and validation of models of complex systems in the presence of uncertainty
Farrell-Maupin, Kathryn; Oden, J. T.
2017-08-01
This study describes versions of OPAL, the Occam-Plausibility Algorithm in which the use of Bayesian model plausibilities is replaced with information theoretic methods, such as the Akaike Information Criterion and the Bayes Information Criterion. Applications to complex systems of coarse-grained molecular models approximating atomistic models of polyethylene materials are described. All of these model selection methods take into account uncertainties in the model, the observational data, the model parameters, and the predicted quantities of interest. A comparison of the models chosen by Bayesian model selection criteria and those chosen by the information-theoretic criteria is given.
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Pullen, Nick; Morris, Richard J
2014-01-01
Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation.
Moeyaert, Mariola; Rindskopf, David; Onghena, Patrick; Van den Noortgate, Wim
2017-12-01
The focus of this article is to describe Bayesian estimation, including construction of prior distributions, and to compare parameter recovery under the Bayesian framework (using weakly informative priors) and the maximum likelihood (ML) framework in the context of multilevel modeling of single-case experimental data. Bayesian estimation results were found similar to ML estimation results in terms of the treatment effect estimates, regardless of the functional form and degree of information included in the prior specification in the Bayesian framework. In terms of the variance component estimates, both the ML and Bayesian estimation procedures result in biased and less precise variance estimates when the number of participants is small (i.e., 3). By increasing the number of participants to 5 or 7, the relative bias is close to 5% and more precise estimates are obtained for all approaches, except for the inverse-Wishart prior using the identity matrix. When a more informative prior was added, more precise estimates for the fixed effects and random effects were obtained, even when only 3 participants were included. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
A Bayesian estimation of a stochastic predator-prey model of economic fluctuations
NASA Astrophysics Data System (ADS)
Dibeh, Ghassan; Luchinsky, Dmitry G.; Luchinskaya, Daria D.; Smelyanskiy, Vadim N.
2007-06-01
In this paper, we develop a Bayesian framework for the empirical estimation of the parameters of one of the best known nonlinear models of the business cycle: The Marx-inspired model of a growth cycle introduced by R. M. Goodwin. The model predicts a series of closed cycles representing the dynamics of labor's share and the employment rate in the capitalist economy. The Bayesian framework is used to empirically estimate a modified Goodwin model. The original model is extended in two ways. First, we allow for exogenous periodic variations of the otherwise steady growth rates of the labor force and productivity per worker. Second, we allow for stochastic variations of those parameters. The resultant modified Goodwin model is a stochastic predator-prey model with periodic forcing. The model is then estimated using a newly developed Bayesian estimation method on data sets representing growth cycles in France and Italy during the years 1960-2005. Results show that inference of the parameters of the stochastic Goodwin model can be achieved. The comparison of the dynamics of the Goodwin model with the inferred values of parameters demonstrates quantitative agreement with the growth cycle empirical data.
Cross-view gait recognition using joint Bayesian
NASA Astrophysics Data System (ADS)
Li, Chao; Sun, Shouqian; Chen, Xiaoyu; Min, Xin
2017-07-01
Human gait, as a soft biometric, helps to recognize people by walking. To further improve the recognition performance under cross-view condition, we propose Joint Bayesian to model the view variance. We evaluated our prosed method with the largest population (OULP) dataset which makes our result reliable in a statically way. As a result, we confirmed our proposed method significantly outperformed state-of-the-art approaches for both identification and verification tasks. Finally, sensitivity analysis on the number of training subjects was conducted, we find Joint Bayesian could achieve competitive results even with a small subset of training subjects (100 subjects). For further comparison, experimental results, learning models, and test codes are available.
Learning Instance-Specific Predictive Models
Visweswaran, Shyam; Cooper, Gregory F.
2013-01-01
This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including nave Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. PMID:25045325
Model selection and assessment for multi-species occupancy models
Broms, Kristin M.; Hooten, Mevin B.; Fitzpatrick, Ryan M.
2016-01-01
While multi-species occupancy models (MSOMs) are emerging as a popular method for analyzing biodiversity data, formal checking and validation approaches for this class of models have lagged behind. Concurrent with the rise in application of MSOMs among ecologists, a quiet regime shift is occurring in Bayesian statistics where predictive model comparison approaches are experiencing a resurgence. Unlike single-species occupancy models that use integrated likelihoods, MSOMs are usually couched in a Bayesian framework and contain multiple levels. Standard model checking and selection methods are often unreliable in this setting and there is only limited guidance in the ecological literature for this class of models. We examined several different contemporary Bayesian hierarchical approaches for checking and validating MSOMs and applied these methods to a freshwater aquatic study system in Colorado, USA, to better understand the diversity and distributions of plains fishes. Our findings indicated distinct differences among model selection approaches, with cross-validation techniques performing the best in terms of prediction.
Spatial distribution of psychotic disorders in an urban area of France: an ecological study.
Pignon, Baptiste; Schürhoff, Franck; Baudin, Grégoire; Ferchiou, Aziz; Richard, Jean-Romain; Saba, Ghassen; Leboyer, Marion; Kirkbride, James B; Szöke, Andrei
2016-05-18
Previous analyses of neighbourhood variations of non-affective psychotic disorders (NAPD) have focused mainly on incidence. However, prevalence studies provide important insights on factors associated with disease evolution as well as for healthcare resource allocation. This study aimed to investigate the distribution of prevalent NAPD cases in an urban area in France. The number of cases in each neighbourhood was modelled as a function of potential confounders and ecological variables, namely: migrant density, economic deprivation and social fragmentation. This was modelled using statistical models of increasing complexity: frequentist models (using Poisson and negative binomial regressions), and several Bayesian models. For each model, assumptions validity were checked and compared as to how this fitted to the data, in order to test for possible spatial variation in prevalence. Data showed significant overdispersion (invalidating the Poisson regression model) and residual autocorrelation (suggesting the need to use Bayesian models). The best Bayesian model was Leroux's model (i.e. a model with both strong correlation between neighbouring areas and weaker correlation between areas further apart), with economic deprivation as an explanatory variable (OR = 1.13, 95% CI [1.02-1.25]). In comparison with frequentist methods, the Bayesian model showed a better fit. The number of cases showed non-random spatial distribution and was linked to economic deprivation.
Multinomial Bayesian learning for modeling classical and nonclassical receptive field properties.
Hosoya, Haruo
2012-08-01
We study the interplay of Bayesian inference and natural image learning in a hierarchical vision system, in relation to the response properties of early visual cortex. We particularly focus on a Bayesian network with multinomial variables that can represent discrete feature spaces similar to hypercolumns combining minicolumns, enforce sparsity of activation to learn efficient representations, and explain divisive normalization. We demonstrate that maximal-likelihood learning using sampling-based Bayesian inference gives rise to classical receptive field properties similar to V1 simple cells and V2 cells, while inference performed on the trained network yields nonclassical context-dependent response properties such as cross-orientation suppression and filling in. Comparison with known physiological properties reveals some qualitative and quantitative similarities.
Efficient Matrix Models for Relational Learning
2009-10-01
74 4.5.3 Comparison to pLSI- pHITS . . . . . . . . . . . . . . . . . . . . 76 5 Hierarchical Bayesian Collective...Behaviour of Newton vs. Stochastic Newton on a three-factor model. 4.5.3 Comparison to pLSI- pHITS Caveat: Collective Matrix Factorization makes no guarantees...leads to better results; and another where a co-clustering model, pLSI- pHITS , has the advantage. pLSI- pHITS [24] is a relational clustering technique
BayeSED: A General Approach to Fitting the Spectral Energy Distribution of Galaxies
NASA Astrophysics Data System (ADS)
Han, Yunkun; Han, Zhanwen
2014-11-01
We present a newly developed version of BayeSED, a general Bayesian approach to the spectral energy distribution (SED) fitting of galaxies. The new BayeSED code has been systematically tested on a mock sample of galaxies. The comparison between the estimated and input values of the parameters shows that BayeSED can recover the physical parameters of galaxies reasonably well. We then applied BayeSED to interpret the SEDs of a large Ks -selected sample of galaxies in the COSMOS/UltraVISTA field with stellar population synthesis models. Using the new BayeSED code, a Bayesian model comparison of stellar population synthesis models has been performed for the first time. We found that the 2003 model by Bruzual & Charlot, statistically speaking, has greater Bayesian evidence than the 2005 model by Maraston for the Ks -selected sample. In addition, while setting the stellar metallicity as a free parameter obviously increases the Bayesian evidence of both models, varying the initial mass function has a notable effect only on the Maraston model. Meanwhile, the physical parameters estimated with BayeSED are found to be generally consistent with those obtained using the popular grid-based FAST code, while the former parameters exhibit more natural distributions. Based on the estimated physical parameters of the galaxies in the sample, we qualitatively classified the galaxies in the sample into five populations that may represent galaxies at different evolution stages or in different environments. We conclude that BayeSED could be a reliable and powerful tool for investigating the formation and evolution of galaxies from the rich multi-wavelength observations currently available. A binary version of the BayeSED code parallelized with Message Passing Interface is publicly available at https://bitbucket.org/hanyk/bayesed.
BayeSED: A GENERAL APPROACH TO FITTING THE SPECTRAL ENERGY DISTRIBUTION OF GALAXIES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, Yunkun; Han, Zhanwen, E-mail: hanyk@ynao.ac.cn, E-mail: zhanwenhan@ynao.ac.cn
2014-11-01
We present a newly developed version of BayeSED, a general Bayesian approach to the spectral energy distribution (SED) fitting of galaxies. The new BayeSED code has been systematically tested on a mock sample of galaxies. The comparison between the estimated and input values of the parameters shows that BayeSED can recover the physical parameters of galaxies reasonably well. We then applied BayeSED to interpret the SEDs of a large K{sub s} -selected sample of galaxies in the COSMOS/UltraVISTA field with stellar population synthesis models. Using the new BayeSED code, a Bayesian model comparison of stellar population synthesis models has beenmore » performed for the first time. We found that the 2003 model by Bruzual and Charlot, statistically speaking, has greater Bayesian evidence than the 2005 model by Maraston for the K{sub s} -selected sample. In addition, while setting the stellar metallicity as a free parameter obviously increases the Bayesian evidence of both models, varying the initial mass function has a notable effect only on the Maraston model. Meanwhile, the physical parameters estimated with BayeSED are found to be generally consistent with those obtained using the popular grid-based FAST code, while the former parameters exhibit more natural distributions. Based on the estimated physical parameters of the galaxies in the sample, we qualitatively classified the galaxies in the sample into five populations that may represent galaxies at different evolution stages or in different environments. We conclude that BayeSED could be a reliable and powerful tool for investigating the formation and evolution of galaxies from the rich multi-wavelength observations currently available. A binary version of the BayeSED code parallelized with Message Passing Interface is publicly available at https://bitbucket.org/hanyk/bayesed.« less
Predicting ICU mortality: a comparison of stationary and nonstationary temporal models.
Kayaalp, M.; Cooper, G. F.; Clermont, G.
2000-01-01
OBJECTIVE: This study evaluates the effectiveness of the stationarity assumption in predicting the mortality of intensive care unit (ICU) patients at the ICU discharge. DESIGN: This is a comparative study. A stationary temporal Bayesian network learned from data was compared to a set of (33) nonstationary temporal Bayesian networks learned from data. A process observed as a sequence of events is stationary if its stochastic properties stay the same when the sequence is shifted in a positive or negative direction by a constant time parameter. The temporal Bayesian networks forecast mortalities of patients, where each patient has one record per day. The predictive performance of the stationary model is compared with nonstationary models using the area under the receiver operating characteristics (ROC) curves. RESULTS: The stationary model usually performed best. However, one nonstationary model using large data sets performed significantly better than the stationary model. CONCLUSION: Results suggest that using a combination of stationary and nonstationary models may predict better than using either alone. PMID:11079917
Automated Bayesian model development for frequency detection in biological time series.
Granqvist, Emma; Oldroyd, Giles E D; Morris, Richard J
2011-06-24
A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure.
Automated Bayesian model development for frequency detection in biological time series
2011-01-01
Background A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. Results In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Conclusions Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure. PMID:21702910
Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence
2010-11-09
Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Yu, Rongjie; Abdel-Aty, Mohamed
2013-07-01
The Bayesian inference method has been frequently adopted to develop safety performance functions. One advantage of the Bayesian inference is that prior information for the independent variables can be included in the inference procedures. However, there are few studies that discussed how to formulate informative priors for the independent variables and evaluated the effects of incorporating informative priors in developing safety performance functions. This paper addresses this deficiency by introducing four approaches of developing informative priors for the independent variables based on historical data and expert experience. Merits of these informative priors have been tested along with two types of Bayesian hierarchical models (Poisson-gamma and Poisson-lognormal models). Deviance information criterion (DIC), R-square values, and coefficients of variance for the estimations were utilized as evaluation measures to select the best model(s). Comparison across the models indicated that the Poisson-gamma model is superior with a better model fit and it is much more robust with the informative priors. Moreover, the two-stage Bayesian updating informative priors provided the best goodness-of-fit and coefficient estimation accuracies. Furthermore, informative priors for the inverse dispersion parameter have also been introduced and tested. Different types of informative priors' effects on the model estimations and goodness-of-fit have been compared and concluded. Finally, based on the results, recommendations for future research topics and study applications have been made. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Gong, Maozhen
Selecting an appropriate prior distribution is a fundamental issue in Bayesian Statistics. In this dissertation, under the framework provided by Berger and Bernardo, I derive the reference priors for several models which include: Analysis of Variance (ANOVA)/Analysis of Covariance (ANCOVA) models with a categorical variable under common ordering constraints, the conditionally autoregressive (CAR) models and the simultaneous autoregressive (SAR) models with a spatial autoregression parameter rho considered. The performances of reference priors for ANOVA/ANCOVA models are evaluated by simulation studies with comparisons to Jeffreys' prior and Least Squares Estimation (LSE). The priors are then illustrated in a Bayesian model of the "Risk of Type 2 Diabetes in New Mexico" data, where the relationship between the type 2 diabetes risk (through Hemoglobin A1c) and different smoking levels is investigated. In both simulation studies and real data set modeling, the reference priors that incorporate internal order information show good performances and can be used as default priors. The reference priors for the CAR and SAR models are also illustrated in the "1999 SAT State Average Verbal Scores" data with a comparison to a Uniform prior distribution. Due to the complexity of the reference priors for both CAR and SAR models, only a portion (12 states in the Midwest) of the original data set is considered. The reference priors can give a different marginal posterior distribution compared to a Uniform prior, which provides an alternative for prior specifications for areal data in Spatial statistics.
A comparison of machine learning and Bayesian modelling for molecular serotyping.
Newton, Richard; Wernisch, Lorenz
2017-08-11
Streptococcus pneumoniae is a human pathogen that is a major cause of infant mortality. Identifying the pneumococcal serotype is an important step in monitoring the impact of vaccines used to protect against disease. Genomic microarrays provide an effective method for molecular serotyping. Previously we developed an empirical Bayesian model for the classification of serotypes from a molecular serotyping array. With only few samples available, a model driven approach was the only option. In the meanwhile, several thousand samples have been made available to us, providing an opportunity to investigate serotype classification by machine learning methods, which could complement the Bayesian model. We compare the performance of the original Bayesian model with two machine learning algorithms: Gradient Boosting Machines and Random Forests. We present our results as an example of a generic strategy whereby a preliminary probabilistic model is complemented or replaced by a machine learning classifier once enough data are available. Despite the availability of thousands of serotyping arrays, a problem encountered when applying machine learning methods is the lack of training data containing mixtures of serotypes; due to the large number of possible combinations. Most of the available training data comprises samples with only a single serotype. To overcome the lack of training data we implemented an iterative analysis, creating artificial training data of serotype mixtures by combining raw data from single serotype arrays. With the enhanced training set the machine learning algorithms out perform the original Bayesian model. However, for serotypes currently lacking sufficient training data the best performing implementation was a combination of the results of the Bayesian Model and the Gradient Boosting Machine. As well as being an effective method for classifying biological data, machine learning can also be used as an efficient method for revealing subtle biological insights, which we illustrate with an example.
Bayesian X-ray computed tomography using a three-level hierarchical prior model
NASA Astrophysics Data System (ADS)
Wang, Li; Mohammad-Djafari, Ali; Gac, Nicolas
2017-06-01
In recent decades X-ray Computed Tomography (CT) image reconstruction has been largely developed in both medical and industrial domain. In this paper, we propose using the Bayesian inference approach with a new hierarchical prior model. In the proposed model, a generalised Student-t distribution is used to enforce the Haar transformation of images to be sparse. Comparisons with some state of the art methods are presented. It is shown that by using the proposed model, the sparsity of sparse representation of images is enforced, so that edges of images are preserved. Simulation results are also provided to demonstrate the effectiveness of the new hierarchical model for reconstruction with fewer projections.
Bayesian Model Comparison for the Order Restricted RC Association Model
ERIC Educational Resources Information Center
Iliopoulos, G.; Kateri, M.; Ntzoufras, I.
2009-01-01
Association models constitute an attractive alternative to the usual log-linear models for modeling the dependence between classification variables. They impose special structure on the underlying association by assigning scores on the levels of each classification variable, which can be fixed or parametric. Under the general row-column (RC)…
Sa-Ngamuang, Chaitawat; Haddawy, Peter; Luvira, Viravarn; Piyaphanee, Watcharapong; Iamsirithaworn, Sopon; Lawpoolsri, Saranath
2018-06-18
Differentiating dengue patients from other acute febrile illness patients is a great challenge among physicians. Several dengue diagnosis methods are recommended by WHO. The application of specific laboratory tests is still limited due to high cost, lack of equipment, and uncertain validity. Therefore, clinical diagnosis remains a common practice especially in resource limited settings. Bayesian networks have been shown to be a useful tool for diagnostic decision support. This study aimed to construct Bayesian network models using basic demographic, clinical, and laboratory profiles of acute febrile illness patients to diagnose dengue. Data of 397 acute undifferentiated febrile illness patients who visited the fever clinic of the Bangkok Hospital for Tropical Diseases, Thailand, were used for model construction and validation. The two best final models were selected: one with and one without NS1 rapid test result. The diagnostic accuracy of the models was compared with that of physicians on the same set of patients. The Bayesian network models provided good diagnostic accuracy of dengue infection, with ROC AUC of 0.80 and 0.75 for models with and without NS1 rapid test result, respectively. The models had approximately 80% specificity and 70% sensitivity, similar to the diagnostic accuracy of the hospital's fellows in infectious disease. Including information on NS1 rapid test improved the specificity, but reduced the sensitivity, both in model and physician diagnoses. The Bayesian network model developed in this study could be useful to assist physicians in diagnosing dengue, particularly in regions where experienced physicians and laboratory confirmation tests are limited.
NASA Astrophysics Data System (ADS)
Feeney, Stephen M.; Mortlock, Daniel J.; Dalmasso, Niccolò
2018-05-01
Estimates of the Hubble constant, H0, from the local distance ladder and from the cosmic microwave background (CMB) are discrepant at the ˜3σ level, indicating a potential issue with the standard Λ cold dark matter (ΛCDM) cosmology. A probabilistic (i.e. Bayesian) interpretation of this tension requires a model comparison calculation, which in turn depends strongly on the tails of the H0 likelihoods. Evaluating the tails of the local H0 likelihood requires the use of non-Gaussian distributions to faithfully represent anchor likelihoods and outliers, and simultaneous fitting of the complete distance-ladder data set to ensure correct uncertainty propagation. We have hence developed a Bayesian hierarchical model of the full distance ladder that does not rely on Gaussian distributions and allows outliers to be modelled without arbitrary data cuts. Marginalizing over the full ˜3000-parameter joint posterior distribution, we find H0 = (72.72 ± 1.67) km s-1 Mpc-1 when applied to the outlier-cleaned Riess et al. data, and (73.15 ± 1.78) km s-1 Mpc-1 with supernova outliers reintroduced (the pre-cut Cepheid data set is not available). Using our precise evaluation of the tails of the H0 likelihood, we apply Bayesian model comparison to assess the evidence for deviation from ΛCDM given the distance-ladder and CMB data. The odds against ΛCDM are at worst ˜10:1 when considering the Planck 2015 XIII data, regardless of outlier treatment, considerably less dramatic than naïvely implied by the 2.8σ discrepancy. These odds become ˜60:1 when an approximation to the more-discrepant Planck Intermediate XLVI likelihood is included.
Fong, Ted C T; Ho, Rainbow T H
2015-01-01
The aim of this study was to reexamine the dimensionality of the widely used 9-item Utrecht Work Engagement Scale using the maximum likelihood (ML) approach and Bayesian structural equation modeling (BSEM) approach. Three measurement models (1-factor, 3-factor, and bi-factor models) were evaluated in two split samples of 1,112 health-care workers using confirmatory factor analysis and BSEM, which specified small-variance informative priors for cross-loadings and residual covariances. Model fit and comparisons were evaluated by posterior predictive p-value (PPP), deviance information criterion, and Bayesian information criterion (BIC). None of the three ML-based models showed an adequate fit to the data. The use of informative priors for cross-loadings did not improve the PPP for the models. The 1-factor BSEM model with approximately zero residual covariances displayed a good fit (PPP>0.10) to both samples and a substantially lower BIC than its 3-factor and bi-factor counterparts. The BSEM results demonstrate empirical support for the 1-factor model as a parsimonious and reasonable representation of work engagement.
Bayesian randomized clinical trials: From fixed to adaptive design.
Yin, Guosheng; Lam, Chi Kin; Shi, Haolun
2017-08-01
Randomized controlled studies are the gold standard for phase III clinical trials. Using α-spending functions to control the overall type I error rate, group sequential methods are well established and have been dominating phase III studies. Bayesian randomized design, on the other hand, can be viewed as a complement instead of competitive approach to the frequentist methods. For the fixed Bayesian design, the hypothesis testing can be cast in the posterior probability or Bayes factor framework, which has a direct link to the frequentist type I error rate. Bayesian group sequential design relies upon Bayesian decision-theoretic approaches based on backward induction, which is often computationally intensive. Compared with the frequentist approaches, Bayesian methods have several advantages. The posterior predictive probability serves as a useful and convenient tool for trial monitoring, and can be updated at any time as the data accrue during the trial. The Bayesian decision-theoretic framework possesses a direct link to the decision making in the practical setting, and can be modeled more realistically to reflect the actual cost-benefit analysis during the drug development process. Other merits include the possibility of hierarchical modeling and the use of informative priors, which would lead to a more comprehensive utilization of information from both historical and longitudinal data. From fixed to adaptive design, we focus on Bayesian randomized controlled clinical trials and make extensive comparisons with frequentist counterparts through numerical studies. Copyright © 2017 Elsevier Inc. All rights reserved.
Alderman, Phillip D.; Stanfill, Bryan
2016-10-06
Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relativemore » contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.« less
ERIC Educational Resources Information Center
Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.
2008-01-01
Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
A Prior for Neural Networks utilizing Enclosing Spheres for Normalization
NASA Astrophysics Data System (ADS)
v. Toussaint, U.; Gori, S.; Dose, V.
2004-11-01
Neural Networks are famous for their advantageous flexibility for problems when there is insufficient knowledge to set up a proper model. On the other hand this flexibility can cause over-fitting and can hamper the generalization properties of neural networks. Many approaches to regularize NN have been suggested but most of them based on ad-hoc arguments. Employing the principle of transformation invariance we derive a general prior in accordance with the Bayesian probability theory for a class of feedforward networks. Optimal networks are determined by Bayesian model comparison verifying the applicability of this approach.
Two-Stage Bayesian Model Averaging in Endogenous Variable Models*
Lenkoski, Alex; Eicher, Theo S.; Raftery, Adrian E.
2013-01-01
Economic modeling in the presence of endogeneity is subject to model uncertainty at both the instrument and covariate level. We propose a Two-Stage Bayesian Model Averaging (2SBMA) methodology that extends the Two-Stage Least Squares (2SLS) estimator. By constructing a Two-Stage Unit Information Prior in the endogenous variable model, we are able to efficiently combine established methods for addressing model uncertainty in regression models with the classic technique of 2SLS. To assess the validity of instruments in the 2SBMA context, we develop Bayesian tests of the identification restriction that are based on model averaged posterior predictive p-values. A simulation study showed that 2SBMA has the ability to recover structure in both the instrument and covariate set, and substantially improves the sharpness of resulting coefficient estimates in comparison to 2SLS using the full specification in an automatic fashion. Due to the increased parsimony of the 2SBMA estimate, the Bayesian Sargan test had a power of 50 percent in detecting a violation of the exogeneity assumption, while the method based on 2SLS using the full specification had negligible power. We apply our approach to the problem of development accounting, and find support not only for institutions, but also for geography and integration as development determinants, once both model uncertainty and endogeneity have been jointly addressed. PMID:24223471
Zaikin, Alexey; Míguez, Joaquín
2017-01-01
We compare three state-of-the-art Bayesian inference methods for the estimation of the unknown parameters in a stochastic model of a genetic network. In particular, we introduce a stochastic version of the paradigmatic synthetic multicellular clock model proposed by Ullner et al., 2007. By introducing dynamical noise in the model and assuming that the partial observations of the system are contaminated by additive noise, we enable a principled mechanism to represent experimental uncertainties in the synthesis of the multicellular system and pave the way for the design of probabilistic methods for the estimation of any unknowns in the model. Within this setup, we tackle the Bayesian estimation of a subset of the model parameters. Specifically, we compare three Monte Carlo based numerical methods for the approximation of the posterior probability density function of the unknown parameters given a set of partial and noisy observations of the system. The schemes we assess are the particle Metropolis-Hastings (PMH) algorithm, the nonlinear population Monte Carlo (NPMC) method and the approximate Bayesian computation sequential Monte Carlo (ABC-SMC) scheme. We present an extensive numerical simulation study, which shows that while the three techniques can effectively solve the problem there are significant differences both in estimation accuracy and computational efficiency. PMID:28797087
Quantitative comparison of alternative methods for coarse-graining biological networks
Bowman, Gregory R.; Meng, Luming; Huang, Xuhui
2013-01-01
Markov models and master equations are a powerful means of modeling dynamic processes like protein conformational changes. However, these models are often difficult to understand because of the enormous number of components and connections between them. Therefore, a variety of methods have been developed to facilitate understanding by coarse-graining these complex models. Here, we employ Bayesian model comparison to determine which of these coarse-graining methods provides the models that are most faithful to the original set of states. We find that the Bayesian agglomerative clustering engine and the hierarchical Nyström expansion graph (HNEG) typically provide the best performance. Surprisingly, the original Perron cluster cluster analysis (PCCA) method often provides the next best results, outperforming the newer PCCA+ method and the most probable paths algorithm. We also show that the differences between the models are qualitatively significant, rather than being minor shifts in the boundaries between states. The performance of the methods correlates well with the entropy of the resulting coarse-grainings, suggesting that finding states with more similar populations (i.e., avoiding low population states that may just be noise) gives better results. PMID:24089717
Bayesian cross-entropy methodology for optimal design of validation experiments
NASA Astrophysics Data System (ADS)
Jiang, X.; Mahadevan, S.
2006-07-01
An important concern in the design of validation experiments is how to incorporate the mathematical model in the design in order to allow conclusive comparisons of model prediction with experimental output in model assessment. The classical experimental design methods are more suitable for phenomena discovery and may result in a subjective, expensive, time-consuming and ineffective design that may adversely impact these comparisons. In this paper, an integrated Bayesian cross-entropy methodology is proposed to perform the optimal design of validation experiments incorporating the computational model. The expected cross entropy, an information-theoretic distance between the distributions of model prediction and experimental observation, is defined as a utility function to measure the similarity of two distributions. A simulated annealing algorithm is used to find optimal values of input variables through minimizing or maximizing the expected cross entropy. The measured data after testing with the optimum input values are used to update the distribution of the experimental output using Bayes theorem. The procedure is repeated to adaptively design the required number of experiments for model assessment, each time ensuring that the experiment provides effective comparison for validation. The methodology is illustrated for the optimal design of validation experiments for a three-leg bolted joint structure and a composite helicopter rotor hub component.
NASA Astrophysics Data System (ADS)
Kim, Junhan; Marrone, Daniel P.; Chan, Chi-Kwan; Medeiros, Lia; Özel, Feryal; Psaltis, Dimitrios
2016-12-01
The Event Horizon Telescope (EHT) is a millimeter-wavelength, very-long-baseline interferometry (VLBI) experiment that is capable of observing black holes with horizon-scale resolution. Early observations have revealed variable horizon-scale emission in the Galactic Center black hole, Sagittarius A* (Sgr A*). Comparing such observations to time-dependent general relativistic magnetohydrodynamic (GRMHD) simulations requires statistical tools that explicitly consider the variability in both the data and the models. We develop here a Bayesian method to compare time-resolved simulation images to variable VLBI data, in order to infer model parameters and perform model comparisons. We use mock EHT data based on GRMHD simulations to explore the robustness of this Bayesian method and contrast it to approaches that do not consider the effects of variability. We find that time-independent models lead to offset values of the inferred parameters with artificially reduced uncertainties. Moreover, neglecting the variability in the data and the models often leads to erroneous model selections. We finally apply our method to the early EHT data on Sgr A*.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Junhan; Marrone, Daniel P.; Chan, Chi-Kwan
2016-12-01
The Event Horizon Telescope (EHT) is a millimeter-wavelength, very-long-baseline interferometry (VLBI) experiment that is capable of observing black holes with horizon-scale resolution. Early observations have revealed variable horizon-scale emission in the Galactic Center black hole, Sagittarius A* (Sgr A*). Comparing such observations to time-dependent general relativistic magnetohydrodynamic (GRMHD) simulations requires statistical tools that explicitly consider the variability in both the data and the models. We develop here a Bayesian method to compare time-resolved simulation images to variable VLBI data, in order to infer model parameters and perform model comparisons. We use mock EHT data based on GRMHD simulations to explore themore » robustness of this Bayesian method and contrast it to approaches that do not consider the effects of variability. We find that time-independent models lead to offset values of the inferred parameters with artificially reduced uncertainties. Moreover, neglecting the variability in the data and the models often leads to erroneous model selections. We finally apply our method to the early EHT data on Sgr A*.« less
A Bayesian pick-the-winner design in a randomized phase II clinical trial.
Chen, Dung-Tsa; Huang, Po-Yu; Lin, Hui-Yi; Chiappori, Alberto A; Gabrilovich, Dmitry I; Haura, Eric B; Antonia, Scott J; Gray, Jhanelle E
2017-10-24
Many phase II clinical trials evaluate unique experimental drugs/combinations through multi-arm design to expedite the screening process (early termination of ineffective drugs) and to identify the most effective drug (pick the winner) to warrant a phase III trial. Various statistical approaches have been developed for the pick-the-winner design but have been criticized for lack of objective comparison among the drug agents. We developed a Bayesian pick-the-winner design by integrating a Bayesian posterior probability with Simon two-stage design in a randomized two-arm clinical trial. The Bayesian posterior probability, as the rule to pick the winner, is defined as probability of the response rate in one arm higher than in the other arm. The posterior probability aims to determine the winner when both arms pass the second stage of the Simon two-stage design. When both arms are competitive (i.e., both passing the second stage), the Bayesian posterior probability performs better to correctly identify the winner compared with the Fisher exact test in the simulation study. In comparison to a standard two-arm randomized design, the Bayesian pick-the-winner design has a higher power to determine a clear winner. In application to two studies, the approach is able to perform statistical comparison of two treatment arms and provides a winner probability (Bayesian posterior probability) to statistically justify the winning arm. We developed an integrated design that utilizes Bayesian posterior probability, Simon two-stage design, and randomization into a unique setting. It gives objective comparisons between the arms to determine the winner.
Applications of Bayesian spectrum representation in acoustics
NASA Astrophysics Data System (ADS)
Botts, Jonathan M.
This dissertation utilizes a Bayesian inference framework to enhance the solution of inverse problems where the forward model maps to acoustic spectra. A Bayesian solution to filter design inverts a acoustic spectra to pole-zero locations of a discrete-time filter model. Spatial sound field analysis with a spherical microphone array is a data analysis problem that requires inversion of spatio-temporal spectra to directions of arrival. As with many inverse problems, a probabilistic analysis results in richer solutions than can be achieved with ad-hoc methods. In the filter design problem, the Bayesian inversion results in globally optimal coefficient estimates as well as an estimate the most concise filter capable of representing the given spectrum, within a single framework. This approach is demonstrated on synthetic spectra, head-related transfer function spectra, and measured acoustic reflection spectra. The Bayesian model-based analysis of spatial room impulse responses is presented as an analogous problem with equally rich solution. The model selection mechanism provides an estimate of the number of arrivals, which is necessary to properly infer the directions of simultaneous arrivals. Although, spectrum inversion problems are fairly ubiquitous, the scope of this dissertation has been limited to these two and derivative problems. The Bayesian approach to filter design is demonstrated on an artificial spectrum to illustrate the model comparison mechanism and then on measured head-related transfer functions to show the potential range of application. Coupled with sampling methods, the Bayesian approach is shown to outperform least-squares filter design methods commonly used in commercial software, confirming the need for a global search of the parameter space. The resulting designs are shown to be comparable to those that result from global optimization methods, but the Bayesian approach has the added advantage of a filter length estimate within the same unified framework. The application to reflection data is useful for representing frequency-dependent impedance boundaries in finite difference acoustic simulations. Furthermore, since the filter transfer function is a parametric model, it can be modified to incorporate arbitrary frequency weighting and account for the band-limited nature of measured reflection spectra. Finally, the model is modified to compensate for dispersive error in the finite difference simulation, from the filter design process. Stemming from the filter boundary problem, the implementation of pressure sources in finite difference simulation is addressed in order to assure that schemes properly converge. A class of parameterized source functions is proposed and shown to offer straightforward control of residual error in the simulation. Guided by the notion that the solution to be approximated affects the approximation error, sources are designed which reduce residual dispersive error to the size of round-off errors. The early part of a room impulse response can be characterized by a series of isolated plane waves. Measured with an array of microphones, plane waves map to a directional response of the array or spatial intensity map. Probabilistic inversion of this response results in estimates of the number and directions of image source arrivals. The model-based inversion is shown to avoid ambiguities associated with peak-finding or inspection of the spatial intensity map. For this problem, determining the number of arrivals in a given frame is critical for properly inferring the state of the sound field. This analysis is effectively compression of the spatial room response, which is useful for analysis or encoding of the spatial sound field. Parametric, model-based formulations of these problems enhance the solution in all cases, and a Bayesian interpretation provides a principled approach to model comparison and parameter estimation. v
Deep Learning with Hierarchical Convolutional Factor Analysis
Chen, Bo; Polatkan, Gungor; Sapiro, Guillermo; Blei, David; Dunson, David; Carin, Lawrence
2013-01-01
Unsupervised multi-layered (“deep”) models are considered for general data, with a particular focus on imagery. The model is represented using a hierarchical convolutional factor-analysis construction, with sparse factor loadings and scores. The computation of layer-dependent model parameters is implemented within a Bayesian setting, employing a Gibbs sampler and variational Bayesian (VB) analysis, that explicitly exploit the convolutional nature of the expansion. In order to address large-scale and streaming data, an online version of VB is also developed. The number of basis functions or dictionary elements at each layer is inferred from the data, based on a beta-Bernoulli implementation of the Indian buffet process. Example results are presented for several image-processing applications, with comparisons to related models in the literature. PMID:23787342
Is the Jeffreys' scale a reliable tool for Bayesian model comparison in cosmology?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nesseris, Savvas; García-Bellido, Juan, E-mail: savvas.nesseris@uam.es, E-mail: juan.garciabellido@uam.es
2013-08-01
We are entering an era where progress in cosmology is driven by data, and alternative models will have to be compared and ruled out according to some consistent criterium. The most conservative and widely used approach is Bayesian model comparison. In this paper we explicitly calculate the Bayes factors for all models that are linear with respect to their parameters. We do this in order to test the so called Jeffreys' scale and determine analytically how accurate its predictions are in a simple case where we fully understand and can calculate everything analytically. We also discuss the case of nestedmore » models, e.g. one with M{sub 1} and another with M{sub 2} superset of M{sub 1} parameters and we derive analytic expressions for both the Bayes factor and the figure of Merit, defined as the inverse area of the model parameter's confidence contours. With all this machinery and the use of an explicit example we demonstrate that the threshold nature of Jeffreys' scale is not a ''one size fits all'' reliable tool for model comparison and that it may lead to biased conclusions. Furthermore, we discuss the importance of choosing the right basis in the context of models that are linear with respect to their parameters and how that basis affects the parameter estimation and the derived constraints.« less
Modular analysis of the probabilistic genetic interaction network.
Hou, Lin; Wang, Lin; Qian, Minping; Li, Dong; Tang, Chao; Zhu, Yunping; Deng, Minghua; Li, Fangting
2011-03-15
Epistatic Miniarray Profiles (EMAP) has enabled the mapping of large-scale genetic interaction networks; however, the quantitative information gained from EMAP cannot be fully exploited since the data are usually interpreted as a discrete network based on an arbitrary hard threshold. To address such limitations, we adopted a mixture modeling procedure to construct a probabilistic genetic interaction network and then implemented a Bayesian approach to identify densely interacting modules in the probabilistic network. Mixture modeling has been demonstrated as an effective soft-threshold technique of EMAP measures. The Bayesian approach was applied to an EMAP dataset studying the early secretory pathway in Saccharomyces cerevisiae. Twenty-seven modules were identified, and 14 of those were enriched by gold standard functional gene sets. We also conducted a detailed comparison with state-of-the-art algorithms, hierarchical cluster and Markov clustering. The experimental results show that the Bayesian approach outperforms others in efficiently recovering biologically significant modules.
How Recent History Affects Perception: The Normative Approach and Its Heuristic Approximation
Raviv, Ofri; Ahissar, Merav; Loewenstein, Yonatan
2012-01-01
There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the “contraction bias”, in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants' performance in a-typical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of the Bayesian inference. We propose a biologically plausible model, in which decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially-decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations in their failure to fully adapt to novel environments. PMID:23133343
Merlé, Y; Mentré, F
1995-02-01
In this paper 3 criteria to design experiments for Bayesian estimation of the parameters of nonlinear models with respect to their parameters, when a prior distribution is available, are presented: the determinant of the Bayesian information matrix, the determinant of the pre-posterior covariance matrix, and the expected information provided by an experiment. A procedure to simplify the computation of these criteria is proposed in the case of continuous prior distributions and is compared with the criterion obtained from a linearization of the model about the mean of the prior distribution for the parameters. This procedure is applied to two models commonly encountered in the area of pharmacokinetics and pharmacodynamics: the one-compartment open model with bolus intravenous single-dose injection and the Emax model. They both involve two parameters. Additive as well as multiplicative gaussian measurement errors are considered with normal prior distributions. Various combinations of the variances of the prior distribution and of the measurement error are studied. Our attention is restricted to designs with limited numbers of measurements (1 or 2 measurements). This situation often occurs in practice when Bayesian estimation is performed. The optimal Bayesian designs that result vary with the variances of the parameter distribution and with the measurement error. The two-point optimal designs sometimes differ from the D-optimal designs for the mean of the prior distribution and may consist of replicating measurements. For the studied cases, the determinant of the Bayesian information matrix and its linearized form lead to the same optimal designs. In some cases, the pre-posterior covariance matrix can be far from its lower bound, namely, the inverse of the Bayesian information matrix, especially for the Emax model and a multiplicative measurement error. The expected information provided by the experiment and the determinant of the pre-posterior covariance matrix generally lead to the same designs except for the Emax model and the multiplicative measurement error. Results show that these criteria can be easily computed and that they could be incorporated in modules for designing experiments.
2010-01-01
Background Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Results Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Conclusions Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data. PMID:21062443
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
Probabilistic inference using linear Gaussian importance sampling for hybrid Bayesian networks
NASA Astrophysics Data System (ADS)
Sun, Wei; Chang, K. C.
2005-05-01
Probabilistic inference for Bayesian networks is in general NP-hard using either exact algorithms or approximate methods. However, for very complex networks, only the approximate methods such as stochastic sampling could be used to provide a solution given any time constraint. There are several simulation methods currently available. They include logic sampling (the first proposed stochastic method for Bayesian networks, the likelihood weighting algorithm) the most commonly used simulation method because of its simplicity and efficiency, the Markov blanket scoring method, and the importance sampling algorithm. In this paper, we first briefly review and compare these available simulation methods, then we propose an improved importance sampling algorithm called linear Gaussian importance sampling algorithm for general hybrid model (LGIS). LGIS is aimed for hybrid Bayesian networks consisting of both discrete and continuous random variables with arbitrary distributions. It uses linear function and Gaussian additive noise to approximate the true conditional probability distribution for continuous variable given both its parents and evidence in a Bayesian network. One of the most important features of the newly developed method is that it can adaptively learn the optimal important function from the previous samples. We test the inference performance of LGIS using a 16-node linear Gaussian model and a 6-node general hybrid model. The performance comparison with other well-known methods such as Junction tree (JT) and likelihood weighting (LW) shows that LGIS-GHM is very promising.
Bayesian learning of visual chunks by human observers
Orbán, Gergő; Fiser, József; Aslin, Richard N.; Lengyel, Máté
2008-01-01
Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input. PMID:18268353
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
A Bayesian hierarchical approach to comparative audit for carotid surgery.
Kuhan, G; Marshall, E C; Abidia, A F; Chetter, I C; McCollum, P T
2002-12-01
the aim of this study was to illustrate how a Bayesian hierarchical modelling approach can aid the reliable comparison of outcome rates between surgeons. retrospective analysis of prospective and retrospective data. binary outcome data (death/stroke within 30 days), together with information on 15 possible risk factors specific for CEA were available on 836 CEAs performed by four vascular surgeons from 1992-99. The median patient age was 68 (range 38-86) years and 60% were men. the model was developed using the WinBUGS software. After adjusting for patient-level risk factors, a cross-validatory approach was adopted to identify "divergent" performance. A ranking exercise was also carried out. the overall observed 30-day stroke/death rate was 3.9% (33/836). The model found diabetes, stroke and heart disease to be significant risk factors. There was no significant difference between the predicted and observed outcome rates for any surgeon (Bayesian p -value>0.05). Each surgeon had a median rank of 3 with associated 95% CI 1.0-5.0, despite the variability of observed stroke/death rate from 2.9-4.4%. After risk adjustment, there was very little residual between-surgeon variability in outcome rate. Bayesian hierarchical models can help to accurately quantify the uncertainty associated with surgeons' performance and rank.
Hoaglin, David C; Hawkins, Neil; Jansen, Jeroen P; Scott, David A; Itzler, Robbin; Cappelleri, Joseph C; Boersma, Cornelis; Thompson, David; Larholt, Kay M; Diaz, Mireya; Barrett, Annabel
2011-06-01
Evidence-based health care decision making requires comparison of all relevant competing interventions. In the absence of randomized controlled trials involving a direct comparison of all treatments of interest, indirect treatment comparisons and network meta-analysis provide useful evidence for judiciously selecting the best treatment(s). Mixed treatment comparisons, a special case of network meta-analysis, combine direct evidence and indirect evidence for particular pairwise comparisons, thereby synthesizing a greater share of the available evidence than traditional meta-analysis. This report from the International Society for Pharmacoeconomics and Outcomes Research Indirect Treatment Comparisons Good Research Practices Task Force provides guidance on technical aspects of conducting network meta-analyses (our use of this term includes most methods that involve meta-analysis in the context of a network of evidence). We start with a discussion of strategies for developing networks of evidence. Next we briefly review assumptions of network meta-analysis. Then we focus on the statistical analysis of the data: objectives, models (fixed-effects and random-effects), frequentist versus Bayesian approaches, and model validation. A checklist highlights key components of network meta-analysis, and substantial examples illustrate indirect treatment comparisons (both frequentist and Bayesian approaches) and network meta-analysis. A further section discusses eight key areas for future research. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Evaluating scaling models in biology using hierarchical Bayesian approaches
Price, Charles A; Ogle, Kiona; White, Ethan P; Weitz, Joshua S
2009-01-01
Theoretical models for allometric relationships between organismal form and function are typically tested by comparing a single predicted relationship with empirical data. Several prominent models, however, predict more than one allometric relationship, and comparisons among alternative models have not taken this into account. Here we evaluate several different scaling models of plant morphology within a hierarchical Bayesian framework that simultaneously fits multiple scaling relationships to three large allometric datasets. The scaling models include: inflexible universal models derived from biophysical assumptions (e.g. elastic similarity or fractal networks), a flexible variation of a fractal network model, and a highly flexible model constrained only by basic algebraic relationships. We demonstrate that variation in intraspecific allometric scaling exponents is inconsistent with the universal models, and that more flexible approaches that allow for biological variability at the species level outperform universal models, even when accounting for relative increases in model complexity. PMID:19453621
A Bayesian approach to model structural error and input variability in groundwater modeling
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.
2015-12-01
Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative prior for error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how to best direct resources towards reducing predictive uncertainty.
Berchialla, Paola; Scarinzi, Cecilia; Snidero, Silvia; Gregori, Dario
2016-08-01
Risk Assessment is the systematic study of decisions subject to uncertain consequences. An increasing interest has been focused on modeling techniques like Bayesian Networks since their capability of (1) combining in the probabilistic framework different type of evidence including both expert judgments and objective data; (2) overturning previous beliefs in the light of the new information being received and (3) making predictions even with incomplete data. In this work, we proposed a comparison among Bayesian Networks and other classical Quantitative Risk Assessment techniques such as Neural Networks, Classification Trees, Random Forests and Logistic Regression models. Hybrid approaches, combining both Classification Trees and Bayesian Networks, were also considered. Among Bayesian Networks, a clear distinction between purely data-driven approach and combination of expert knowledge with objective data is made. The aim of this paper consists in evaluating among this models which best can be applied, in the framework of Quantitative Risk Assessment, to assess the safety of children who are exposed to the risk of inhalation/insertion/aspiration of consumer products. The issue of preventing injuries in children is of paramount importance, in particular where product design is involved: quantifying the risk associated to product characteristics can be of great usefulness in addressing the product safety design regulation. Data of the European Registry of Foreign Bodies Injuries formed the starting evidence for risk assessment. Results showed that Bayesian Networks appeared to have both the ease of interpretability and accuracy in making prediction, even if simpler models like logistic regression still performed well. © The Author(s) 2013.
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-01-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible. PMID:25745272
Upadhyay, S K; Mukherjee, Bhaswati; Gupta, Ashutosh
2009-09-01
Several models for studies related to tensile strength of materials are proposed in the literature where the size or length component has been taken to be an important factor for studying the specimens' failure behaviour. An important model, developed on the basis of cumulative damage approach, is the three-parameter extension of the Birnbaum-Saunders fatigue model that incorporates size of the specimen as an additional variable. This model is a strong competitor of the commonly used Weibull model and stands better than the traditional models, which do not incorporate the size effect. The paper considers two such cumulative damage models, checks their compatibility with a real dataset, compares them with some of the recent toolkits, and finally recommends a model, which appears an appropriate one. Throughout the study is Bayesian based on Markov chain Monte Carlo simulation.
A Bayesian analysis of inflationary primordial spectrum models using Planck data
NASA Astrophysics Data System (ADS)
Santos da Costa, Simony; Benetti, Micol; Alcaniz, Jailson
2018-03-01
The current available Cosmic Microwave Background (CMB) data show an anomalously low value of the CMB temperature fluctuations at large angular scales (l < 40). This lack of power is not explained by the minimal ΛCDM model, and one of the possible mechanisms explored in the literature to address this problem is the presence of features in the primordial power spectrum (PPS) motivated by the early universe physics. In this paper, we analyse a set of cutoff inflationary PPS models using a Bayesian model comparison approach in light of the latest CMB data from the Planck Collaboration. Our results show that the standard power-law parameterisation is preferred over all models considered in the analysis, which motivates the search for alternative explanations for the observed lack of power in the CMB anisotropy spectrum.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ˜104 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
A Bayesian Missing Data Framework for Generalized Multiple Outcome Mixed Treatment Comparisons
ERIC Educational Resources Information Center
Hong, Hwanhee; Chu, Haitao; Zhang, Jing; Carlin, Bradley P.
2016-01-01
Bayesian statistical approaches to mixed treatment comparisons (MTCs) are becoming more popular because of their flexibility and interpretability. Many randomized clinical trials report multiple outcomes with possible inherent correlations. Moreover, MTC data are typically sparse (although richer than standard meta-analysis, comparing only two…
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Prediction of Sybil attack on WSN using Bayesian network and swarm intelligence
NASA Astrophysics Data System (ADS)
Muraleedharan, Rajani; Ye, Xiang; Osadciw, Lisa Ann
2008-04-01
Security in wireless sensor networks is typically sacrificed or kept minimal due to limited resources such as memory and battery power. Hence, the sensor nodes are prone to Denial-of-service attacks and detecting the threats is crucial in any application. In this paper, the Sybil attack is analyzed and a novel prediction method, combining Bayesian algorithm and Swarm Intelligence (SI) is proposed. Bayesian Networks (BN) is used in representing and reasoning problems, by modeling the elements of uncertainty. The decision from the BN is applied to SI forming an Hybrid Intelligence Scheme (HIS) to re-route the information and disconnecting the malicious nodes in future routes. A performance comparison based on the prediction using HIS vs. Ant System (AS) helps in prioritizing applications where decisions are time-critical.
NASA Astrophysics Data System (ADS)
Arregui, Iñigo
2018-01-01
In contrast to the situation in a laboratory, the study of the solar atmosphere has to be pursued without direct access to the physical conditions of interest. Information is therefore incomplete and uncertain and inference methods need to be employed to diagnose the physical conditions and processes. One of such methods, solar atmospheric seismology, makes use of observed and theoretically predicted properties of waves to infer plasma and magnetic field properties. A recent development in solar atmospheric seismology consists in the use of inversion and model comparison methods based on Bayesian analysis. In this paper, the philosophy and methodology of Bayesian analysis are first explained. Then, we provide an account of what has been achieved so far from the application of these techniques to solar atmospheric seismology and a prospect of possible future extensions.
Compromise decision support problems for hierarchical design involving uncertainty
NASA Astrophysics Data System (ADS)
Vadde, S.; Allen, J. K.; Mistree, F.
1994-08-01
In this paper an extension to the traditional compromise Decision Support Problem (DSP) formulation is presented. Bayesian statistics is used in the formulation to model uncertainties associated with the information being used. In an earlier paper a compromise DSP that accounts for uncertainty using fuzzy set theory was introduced. The Bayesian Decision Support Problem is described in this paper. The method for hierarchical design is demonstrated by using this formulation to design a portal frame. The results are discussed and comparisons are made with those obtained using the fuzzy DSP. Finally, the efficacy of incorporating Bayesian statistics into the traditional compromise DSP formulation is discussed and some pending research issues are described. Our emphasis in this paper is on the method rather than the results per se.
NASA Astrophysics Data System (ADS)
Freni, Gabriele; Mannina, Giorgio
In urban drainage modelling, uncertainty analysis is of undoubted necessity. However, uncertainty analysis in urban water-quality modelling is still in its infancy and only few studies have been carried out. Therefore, several methodological aspects still need to be experienced and clarified especially regarding water quality modelling. The use of the Bayesian approach for uncertainty analysis has been stimulated by its rigorous theoretical framework and by the possibility of evaluating the impact of new knowledge on the modelling predictions. Nevertheless, the Bayesian approach relies on some restrictive hypotheses that are not present in less formal methods like the Generalised Likelihood Uncertainty Estimation (GLUE). One crucial point in the application of Bayesian method is the formulation of a likelihood function that is conditioned by the hypotheses made regarding model residuals. Statistical transformations, such as the use of Box-Cox equation, are generally used to ensure the homoscedasticity of residuals. However, this practice may affect the reliability of the analysis leading to a wrong uncertainty estimation. The present paper aims to explore the influence of the Box-Cox equation for environmental water quality models. To this end, five cases were considered one of which was the “real” residuals distributions (i.e. drawn from available data). The analysis was applied to the Nocella experimental catchment (Italy) which is an agricultural and semi-urbanised basin where two sewer systems, two wastewater treatment plants and a river reach were monitored during both dry and wet weather periods. The results show that the uncertainty estimation is greatly affected by residual transformation and a wrong assumption may also affect the evaluation of model uncertainty. The use of less formal methods always provide an overestimation of modelling uncertainty with respect to Bayesian method but such effect is reduced if a wrong assumption is made regarding the residuals distribution. If residuals are not normally distributed, the uncertainty is over-estimated if Box-Cox transformation is not applied or non-calibrated parameter is used.
Agapova, Maria; Devine, Emily B; Nguyen, Hiep; Wolf, Fredric M; Inoue, Lurdes Y T
2014-07-01
Assessing relative performance among competing interventions is an important part of comparative effectiveness research. Bayesian indirect comparisons add information to existing Cochrane reviews, such as which intervention is likely to perform best. However, heterogeneity variance priors may influence results and, potentially, clinical guidance. We highlight the features of Bayesian indirect comparisons using a case study of a Cochrane review update in asthma care. The probability that one self-management educational intervention outperforms others is estimated. Simulation studies investigate the effect of heterogeneity variance prior distributions. Results suggest a 55% probability that individual education is best, followed by combination (39%) and group (6%). The intervention with few trials was sensitive to prior distributions. Bayesian indirect comparisons updates of Cochrane reviews are valuable comparative effectiveness research tools.
Methods of comparing associative models and an application to retrospective revaluation.
Witnauer, James E; Hutchings, Ryan; Miller, Ralph R
2017-11-01
Contemporary theories of associative learning are increasingly complex, which necessitates the use of computational methods to reveal predictions of these models. We argue that comparisons across multiple models in terms of goodness of fit to empirical data from experiments often reveal more about the actual mechanisms of learning and behavior than do simulations of only a single model. Such comparisons are best made when the values of free parameters are discovered through some optimization procedure based on the specific data being fit (e.g., hill climbing), so that the comparisons hinge on the psychological mechanisms assumed by each model rather than being biased by using parameters that differ in quality across models with respect to the data being fit. Statistics like the Bayesian information criterion facilitate comparisons among models that have different numbers of free parameters. These issues are examined using retrospective revaluation data. Copyright © 2017 Elsevier B.V. All rights reserved.
Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan
2015-01-01
Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
Using Bayesian neural networks to classify forest scenes
NASA Astrophysics Data System (ADS)
Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni
1998-10-01
We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. Classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which are optimal image features for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to cope with large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from a training data, and the integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.
Bayesian Estimation of Small Effects in Exercise and Sports Science.
Mengersen, Kerrie L; Drovandi, Christopher C; Robert, Christian P; Pyne, David B; Gore, Christopher J
2016-01-01
The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and in a case study example provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a 'magnitude-based inference' approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.
Revisiting Isotherm Analyses Using R: Comparison of Linear, Non-linear, and Bayesian Techniques
Extensive adsorption isotherm data exist for an array of chemicals of concern on a variety of engineered and natural sorbents. Several isotherm models exist that can accurately describe these data from which the resultant fitting parameters may subsequently be used in numerical ...
Bayesian comparison of protein structures using partial Procrustes distance.
Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi
2017-09-26
An important topic in bioinformatics is the protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on the global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures, by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated to the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.
Yen, A M-F; Liou, H-H; Lin, H-L; Chen, T H-H
2006-01-01
The study aimed to develop a predictive model to deal with data fraught with heterogeneity that cannot be explained by sampling variation or measured covariates. The random-effect Poisson regression model was first proposed to deal with over-dispersion for data fraught with heterogeneity after making allowance for measured covariates. Bayesian acyclic graphic model in conjunction with Markov Chain Monte Carlo (MCMC) technique was then applied to estimate the parameters of both relevant covariates and random effect. Predictive distribution was then generated to compare the predicted with the observed for the Bayesian model with and without random effect. Data from repeated measurement of episodes among 44 patients with intractable epilepsy were used as an illustration. The application of Poisson regression without taking heterogeneity into account to epilepsy data yielded a large value of heterogeneity (heterogeneity factor = 17.90, deviance = 1485, degree of freedom (df) = 83). After taking the random effect into account, the value of heterogeneity factor was greatly reduced (heterogeneity factor = 0.52, deviance = 42.5, df = 81). The Pearson chi2 for the comparison between the expected seizure frequencies and the observed ones at two and three months of the model with and without random effect were 34.27 (p = 1.00) and 1799.90 (p < 0.0001), respectively. The Bayesian acyclic model using the MCMC method was demonstrated to have great potential for disease prediction while data show over-dispersion attributed either to correlated property or to subject-to-subject variability.
Bayesian model evidence as a model evaluation metric
NASA Astrophysics Data System (ADS)
Guthke, Anneli; Höge, Marvin; Nowak, Wolfgang
2017-04-01
When building environmental systems models, we are typically confronted with the questions of how to choose an appropriate model (i.e., which processes to include or neglect) and how to measure its quality. Various metrics have been proposed that shall guide the modeller towards a most robust and realistic representation of the system under study. Criteria for evaluation often address aspects of accuracy (absence of bias) or of precision (absence of unnecessary variance) and need to be combined in a meaningful way in order to address the inherent bias-variance dilemma. We suggest using Bayesian model evidence (BME) as a model evaluation metric that implicitly performs a tradeoff between bias and variance. BME is typically associated with model weights in the context of Bayesian model averaging (BMA). However, it can also be seen as a model evaluation metric in a single-model context or in model comparison. It combines a measure for goodness of fit with a penalty for unjustifiable complexity. Unjustifiable refers to the fact that the appropriate level of model complexity is limited by the amount of information available for calibration. Derived in a Bayesian context, BME naturally accounts for measurement errors in the calibration data as well as for input and parameter uncertainty. BME is therefore perfectly suitable to assess model quality under uncertainty. We will explain in detail and with schematic illustrations what BME measures, i.e. how complexity is defined in the Bayesian setting and how this complexity is balanced with goodness of fit. We will further discuss how BME compares to other model evaluation metrics that address accuracy and precision such as the predictive logscore or other model selection criteria such as the AIC, BIC or KIC. Although computationally more expensive than other metrics or criteria, BME represents an appealing alternative because it provides a global measure of model quality. Even if not applicable to each and every case, we aim at stimulating discussion about how to judge the quality of hydrological models in the presence of uncertainty in general by dissecting the mechanism behind BME.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
ERIC Educational Resources Information Center
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Bayesian multivariate Poisson abundance models for T-cell receptor data.
Greene, Joshua; Birtwistle, Marc R; Ignatowicz, Leszek; Rempala, Grzegorz A
2013-06-07
A major feature of an adaptive immune system is its ability to generate B- and T-cell clones capable of recognizing and neutralizing specific antigens. These clones recognize antigens with the help of the surface molecules, called antigen receptors, acquired individually during the clonal development process. In order to ensure a response to a broad range of antigens, the number of different receptor molecules is extremely large, resulting in a huge clonal diversity of both B- and T-cell receptor populations and making their experimental comparisons statistically challenging. To facilitate such comparisons, we propose a flexible parametric model of multivariate count data and illustrate its use in a simultaneous analysis of multiple antigen receptor populations derived from mammalian T-cells. The model relies on a representation of the observed receptor counts as a multivariate Poisson abundance mixture (m PAM). A Bayesian parameter fitting procedure is proposed, based on the complete posterior likelihood, rather than the conditional one used typically in similar settings. The new procedure is shown to be considerably more efficient than its conditional counterpart (as measured by the Fisher information) in the regions of m PAM parameter space relevant to model T-cell data. Copyright © 2013 Elsevier Ltd. All rights reserved.
Adam-Poupart, Ariane; Brand, Allan; Fournier, Michel; Jerrett, Michael
2014-01-01
Background: Ambient air ozone (O3) is a pulmonary irritant that has been associated with respiratory health effects including increased lung inflammation and permeability, airway hyperreactivity, respiratory symptoms, and decreased lung function. Estimation of O3 exposure is a complex task because the pollutant exhibits complex spatiotemporal patterns. To refine the quality of exposure estimation, various spatiotemporal methods have been developed worldwide. Objectives: We sought to compare the accuracy of three spatiotemporal models to predict summer ground-level O3 in Quebec, Canada. Methods: We developed a land-use mixed-effects regression (LUR) model based on readily available data (air quality and meteorological monitoring data, road networks information, latitude), a Bayesian maximum entropy (BME) model incorporating both O3 monitoring station data and the land-use mixed model outputs (BME-LUR), and a kriging method model based only on available O3 monitoring station data (BME kriging). We performed leave-one-station-out cross-validation and visually assessed the predictive capability of each model by examining the mean temporal and spatial distributions of the average estimated errors. Results: The BME-LUR was the best predictive model (R2 = 0.653) with the lowest root mean-square error (RMSE ;7.06 ppb), followed by the LUR model (R2 = 0.466, RMSE = 8.747) and the BME kriging model (R2 = 0.414, RMSE = 9.164). Conclusions: Our findings suggest that errors of estimation in the interpolation of O3 concentrations with BME can be greatly reduced by incorporating outputs from a LUR model developed with readily available data. Citation: Adam-Poupart A, Brand A, Fournier M, Jerrett M, Smargiassi A. 2014. Spatiotemporal modeling of ozone levels in Quebec (Canada): a comparison of kriging, land-use regression (LUR), and combined Bayesian maximum entropy–LUR approaches. Environ Health Perspect 122:970–976; http://dx.doi.org/10.1289/ehp.1306566 PMID:24879650
Comparison of Co-Temporal Modeling Algorithms on Sparse Experimental Time Series Data Sets.
Allen, Edward E; Norris, James L; John, David J; Thomas, Stan J; Turkett, William H; Fetrow, Jacquelyn S
2010-01-01
Multiple approaches for reverse-engineering biological networks from time-series data have been proposed in the computational biology literature. These approaches can be classified by their underlying mathematical algorithms, such as Bayesian or algebraic techniques, as well as by their time paradigm, which includes next-state and co-temporal modeling. The types of biological relationships, such as parent-child or siblings, discovered by these algorithms are quite varied. It is important to understand the strengths and weaknesses of the various algorithms and time paradigms on actual experimental data. We assess how well the co-temporal implementations of three algorithms, continuous Bayesian, discrete Bayesian, and computational algebraic, can 1) identify two types of entity relationships, parent and sibling, between biological entities, 2) deal with experimental sparse time course data, and 3) handle experimental noise seen in replicate data sets. These algorithms are evaluated, using the shuffle index metric, for how well the resulting models match literature models in terms of siblings and parent relationships. Results indicate that all three co-temporal algorithms perform well, at a statistically significant level, at finding sibling relationships, but perform relatively poorly in finding parent relationships.
NASA Astrophysics Data System (ADS)
Mearns, L. O.; Sain, S. R.; McGinnis, S. A.; Steinschneider, S.; Brown, C. M.
2015-12-01
In this talk we present the development of a joint Bayesian Probabilistic Model for the climate change results of the North American Regional Climate Change Assessment Program (NARCCAP) that uses a unique prior in the model formulation. We use the climate change results (joint distribution of seasonal temperature and precipitation changes (future vs. current)) from the global climate models (GCMs) that provided boundary conditions for the six different regional climate models used in the program as informative priors for the bivariate Bayesian Model. The two variables involved are seasonal temperature and precipitation over sub-regions (i.e., Bukovsky Regions) of the full NARCCAP domain. The basic approach to the joint Bayesian hierarchical model follows the approach of Tebaldi and Sansó (2009). We compare model results using informative (i.e., GCM information) as well as uninformative priors. We apply these results to the Water Evaluation and Planning System (WEAP) model for the Colorado Springs Utility in Colorado. We investigate the layout of the joint pdfs in the context of the water model sensitivities to ranges of temperature and precipitation results to determine the likelihoods of future climate conditions that cannot be accommodated by possible adaptation options. Comparisons may also be made with joint pdfs formed from the CMIP5 collection of global climate models and empirically downscaled to the region of interest.
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.
Comparing hierarchical models via the marginalized deviance information criterion.
Quintero, Adrian; Lesaffre, Emmanuel
2018-07-20
Hierarchical models are extensively used in pharmacokinetics and longitudinal studies. When the estimation is performed from a Bayesian approach, model comparison is often based on the deviance information criterion (DIC). In hierarchical models with latent variables, there are several versions of this statistic: the conditional DIC (cDIC) that incorporates the latent variables in the focus of the analysis and the marginalized DIC (mDIC) that integrates them out. Regardless of the asymptotic and coherency difficulties of cDIC, this alternative is usually used in Markov chain Monte Carlo (MCMC) methods for hierarchical models because of practical convenience. The mDIC criterion is more appropriate in most cases but requires integration of the likelihood, which is computationally demanding and not implemented in Bayesian software. Therefore, we consider a method to compute mDIC by generating replicate samples of the latent variables that need to be integrated out. This alternative can be easily conducted from the MCMC output of Bayesian packages and is widely applicable to hierarchical models in general. Additionally, we propose some approximations in order to reduce the computational complexity for large-sample situations. The method is illustrated with simulated data sets and 2 medical studies, evidencing that cDIC may be misleading whilst mDIC appears pertinent. Copyright © 2018 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Sen, Sedat
2018-01-01
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm(-1) were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R(2) value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R(2) (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R(2) of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Funamizu, Akihiro; Ito, Makoto; Doya, Kenji; Kanzaki, Ryohei; Takahashi, Hirokazu
2012-01-01
The estimation of reward outcomes for action candidates is essential for decision making. In this study, we examined whether and how the uncertainty in reward outcome estimation affects the action choice and learning rate. We designed a choice task in which rats selected either the left-poking or right-poking hole and received a reward of a food pellet stochastically. The reward probabilities of the left and right holes were chosen from six settings (high, 100% vs. 66%; mid, 66% vs. 33%; low, 33% vs. 0% for the left vs. right holes, and the opposites) in every 20–549 trials. We used Bayesian Q-learning models to estimate the time course of the probability distribution of action values and tested if they better explain the behaviors of rats than standard Q-learning models that estimate only the mean of action values. Model comparison by cross-validation revealed that a Bayesian Q-learning model with an asymmetric update for reward and non-reward outcomes fit the choice time course of the rats best. In the action-choice equation of the Bayesian Q-learning model, the estimated coefficient for the variance of action value was positive, meaning that rats were uncertainty seeking. Further analysis of the Bayesian Q-learning model suggested that the uncertainty facilitated the effective learning rate. These results suggest that the rats consider uncertainty in action-value estimation and that they have an uncertainty-seeking action policy and uncertainty-dependent modulation of the effective learning rate. PMID:22487046
Quantifying properties of hot and dense QCD matter through systematic model-to-data comparison
Bernhard, Jonah E.; Marcy, Peter W.; Coleman-Smith, Christopher E.; ...
2015-05-22
We systematically compare an event-by-event heavy-ion collision model to data from the CERN Large Hadron Collider. Using a general Bayesian method, we probe multiple model parameters including fundamental quark-gluon plasma properties such as the specific shear viscosity η/s, calibrate the model to optimally reproduce experimental data, and extract quantitative constraints for all parameters simultaneously. Furthermore, the method is universal and easily extensible to other data and collision models.
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
The Development of Bayesian Theory and Its Applications in Business and Bioinformatics
NASA Astrophysics Data System (ADS)
Zhang, Yifei
2018-03-01
Bayesian Theory originated from an Essay of a British mathematician named Thomas Bayes in 1763, and after its development in 20th century, Bayesian Statistics has been taking a significant part in statistical study of all fields. Due to the recent breakthrough of high-dimensional integral, Bayesian Statistics has been improved and perfected, and now it can be used to solve problems that Classical Statistics failed to solve. This paper summarizes Bayesian Statistics’ history, concepts and applications, which are illustrated in five parts: the history of Bayesian Statistics, the weakness of Classical Statistics, Bayesian Theory and its development and applications. The first two parts make a comparison between Bayesian Statistics and Classical Statistics in a macroscopic aspect. And the last three parts focus on Bayesian Theory in specific -- from introducing some particular Bayesian Statistics’ concepts to listing their development and finally their applications.
Moving beyond qualitative evaluations of Bayesian models of cognition.
Hemmer, Pernille; Tauber, Sean; Steyvers, Mark
2015-06-01
Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.
A Bayesian approach to estimate the biomass of anchovies off the coast of Perú.
Quiroz, Zaida C; Prates, Marcos O; Rue, Håvard
2015-03-01
The Northern Humboldt Current System (NHCS) is the world's most productive ecosystem in terms of fish. In particular, the Peruvian anchovy (Engraulis ringens) is the major prey of the main top predators, like seabirds, fish, humans, and other mammals. In this context, it is important to understand the dynamics of the anchovy distribution to preserve it as well as to exploit its economic capacities. Using the data collected by the "Instituto del Mar del Perú" (IMARPE) during a scientific survey in 2005, we present a statistical analysis that has as main goals: (i) to adapt to the characteristics of the sampled data, such as spatial dependence, high proportions of zeros and big size of samples; (ii) to provide important insights on the dynamics of the anchovy population; and (iii) to propose a model for estimation and prediction of anchovy biomass in the NHCS offshore from Perú. These data were analyzed in a Bayesian framework using the integrated nested Laplace approximation (INLA) method. Further, to select the best model and to study the predictive power of each model, we performed model comparisons and predictive checks, respectively. Finally, we carried out a Bayesian spatial influence diagnostic for the preferred model. © 2014, The International Biometric Society.
Bayesian inference of physiologically meaningful parameters from body sway measurements.
Tietäväinen, A; Gutmann, M U; Keski-Vakkuri, E; Corander, J; Hæggström, E
2017-06-19
The control of the human body sway by the central nervous system, muscles, and conscious brain is of interest since body sway carries information about the physiological status of a person. Several models have been proposed to describe body sway in an upright standing position, however, due to the statistical intractability of the more realistic models, no formal parameter inference has previously been conducted and the expressive power of such models for real human subjects remains unknown. Using the latest advances in Bayesian statistical inference for intractable models, we fitted a nonlinear control model to posturographic measurements, and we showed that it can accurately predict the sway characteristics of both simulated and real subjects. Our method provides a full statistical characterization of the uncertainty related to all model parameters as quantified by posterior probability density functions, which is useful for comparisons across subjects and test settings. The ability to infer intractable control models from sensor data opens new possibilities for monitoring and predicting body status in health applications.
A 3D model retrieval approach based on Bayesian networks lightfield descriptor
NASA Astrophysics Data System (ADS)
Xiao, Qinhan; Li, Yanjun
2009-12-01
A new 3D model retrieval methodology is proposed by exploiting a novel Bayesian networks lightfield descriptor (BNLD). There are two key novelties in our approach: (1) a BN-based method for building lightfield descriptor; and (2) a 3D model retrieval scheme based on the proposed BNLD. To overcome the disadvantages of the existing 3D model retrieval methods, we explore BN for building a new lightfield descriptor. Firstly, 3D model is put into lightfield, about 300 binary-views can be obtained along a sphere, then Fourier descriptors and Zernike moments descriptors can be calculated out from binaryviews. Then shape feature sequence would be learned into a BN model based on BN learning algorithm; Secondly, we propose a new 3D model retrieval method by calculating Kullback-Leibler Divergence (KLD) between BNLDs. Beneficial from the statistical learning, our BNLD is noise robustness as compared to the existing methods. The comparison between our method and the lightfield descriptor-based approach is conducted to demonstrate the effectiveness of our proposed methodology.
Bayesian LASSO, scale space and decision making in association genetics.
Pasanen, Leena; Holmström, Lasse; Sillanpää, Mikko J
2015-01-01
LASSO is a penalized regression method that facilitates model fitting in situations where there are as many, or even more explanatory variables than observations, and only a few variables are relevant in explaining the data. We focus on the Bayesian version of LASSO and consider four problems that need special attention: (i) controlling false positives, (ii) multiple comparisons, (iii) collinearity among explanatory variables, and (iv) the choice of the tuning parameter that controls the amount of shrinkage and the sparsity of the estimates. The particular application considered is association genetics, where LASSO regression can be used to find links between chromosome locations and phenotypic traits in a biological organism. However, the proposed techniques are relevant also in other contexts where LASSO is used for variable selection. We separate the true associations from false positives using the posterior distribution of the effects (regression coefficients) provided by Bayesian LASSO. We propose to solve the multiple comparisons problem by using simultaneous inference based on the joint posterior distribution of the effects. Bayesian LASSO also tends to distribute an effect among collinear variables, making detection of an association difficult. We propose to solve this problem by considering not only individual effects but also their functionals (i.e. sums and differences). Finally, whereas in Bayesian LASSO the tuning parameter is often regarded as a random variable, we adopt a scale space view and consider a whole range of fixed tuning parameters, instead. The effect estimates and the associated inference are considered for all tuning parameters in the selected range and the results are visualized with color maps that provide useful insights into data and the association problem considered. The methods are illustrated using two sets of artificial data and one real data set, all representing typical settings in association genetics.
RadVel: General toolkit for modeling Radial Velocities
NASA Astrophysics Data System (ADS)
Fulton, Benjamin J.; Petigura, Erik A.; Blunt, Sarah; Sinukoff, Evan
2018-01-01
RadVel models Keplerian orbits in radial velocity (RV) time series. The code is written in Python with a fast Kepler's equation solver written in C. It provides a framework for fitting RVs using maximum a posteriori optimization and computing robust confidence intervals by sampling the posterior probability density via Markov Chain Monte Carlo (MCMC). RadVel can perform Bayesian model comparison and produces publication quality plots and LaTeX tables.
USDA-ARS?s Scientific Manuscript database
Background Several studies have examined the accuracy of genomic selection both within and across purebred beef or dairy populations. However, the accuracy of direct genomic breeding values (DGVs) has been less well studied in crossbred or admixed cattle populations. We used a population of 3,240 cr...
Spatiotemporal Bayesian networks for malaria prediction.
Haddawy, Peter; Hasan, A H M Imrul; Kasantikul, Rangwan; Lawpoolsri, Saranath; Sa-Angchai, Patiwat; Kaewkungwal, Jaranit; Singhasivanon, Pratap
2018-01-01
Targeted intervention and resource allocation are essential for effective malaria control, particularly in remote areas, with predictive models providing important information for decision making. While a diversity of modeling technique have been used to create predictive models of malaria, no work has made use of Bayesian networks. Bayes nets are attractive due to their ability to represent uncertainty, model time lagged and nonlinear relations, and provide explanations. This paper explores the use of Bayesian networks to model malaria, demonstrating the approach by creating village level models with weekly temporal resolution for Tha Song Yang district in northern Thailand. The networks are learned using data on cases and environmental covariates. Three types of networks are explored: networks for numeric prediction, networks for outbreak prediction, and networks that incorporate spatial autocorrelation. Evaluation of the numeric prediction network shows that the Bayes net has prediction accuracy in terms of mean absolute error of about 1.4 cases for 1 week prediction and 1.7 cases for 6 week prediction. The network for outbreak prediction has an ROC AUC above 0.9 for all prediction horizons. Comparison of prediction accuracy of both Bayes nets against several traditional modeling approaches shows the Bayes nets to outperform the other models for longer time horizon prediction of high incidence transmission. To model spread of malaria over space, we elaborate the models with links between the village networks. This results in some very large models which would be far too laborious to build by hand. So we represent the models as collections of probability logic rules and automatically generate the networks. Evaluation of the models shows that the autocorrelation links significantly improve prediction accuracy for some villages in regions of high incidence. We conclude that spatiotemporal Bayesian networks are a highly promising modeling alternative for prediction of malaria and other vector-borne diseases. Copyright © 2017 Elsevier B.V. All rights reserved.
Funamizu, Akihiro; Ito, Makoto; Doya, Kenji; Kanzaki, Ryohei; Takahashi, Hirokazu
2012-04-01
The estimation of reward outcomes for action candidates is essential for decision making. In this study, we examined whether and how the uncertainty in reward outcome estimation affects the action choice and learning rate. We designed a choice task in which rats selected either the left-poking or right-poking hole and received a reward of a food pellet stochastically. The reward probabilities of the left and right holes were chosen from six settings (high, 100% vs. 66%; mid, 66% vs. 33%; low, 33% vs. 0% for the left vs. right holes, and the opposites) in every 20-549 trials. We used Bayesian Q-learning models to estimate the time course of the probability distribution of action values and tested if they better explain the behaviors of rats than standard Q-learning models that estimate only the mean of action values. Model comparison by cross-validation revealed that a Bayesian Q-learning model with an asymmetric update for reward and non-reward outcomes fit the choice time course of the rats best. In the action-choice equation of the Bayesian Q-learning model, the estimated coefficient for the variance of action value was positive, meaning that rats were uncertainty seeking. Further analysis of the Bayesian Q-learning model suggested that the uncertainty facilitated the effective learning rate. These results suggest that the rats consider uncertainty in action-value estimation and that they have an uncertainty-seeking action policy and uncertainty-dependent modulation of the effective learning rate. © 2012 The Authors. European Journal of Neuroscience © 2012 Federation of European Neuroscience Societies and Blackwell Publishing Ltd.
Hulin, Anne; Blanchet, Benoît; Audard, Vincent; Barau, Caroline; Furlan, Valérie; Durrbach, Antoine; Taïeb, Fabrice; Lang, Philippe; Grimbert, Philippe; Tod, Michel
2009-04-01
A significant relationship between mycophenolic acid (MPA) area under the plasma concentration-time curve (AUC) and the risk for rejection has been reported. Based on 3 concentration measurements, 3 approaches have been proposed for the estimation of MPA AUC, involving either a multilinear regression approach model (MLRA) or a Bayesian estimation using either gamma absorption or zero-order absorption population models. The aim of the study was to compare the 3 approaches for the estimation of MPA AUC in 150 renal transplant patients treated with mycophenolate mofetil and tacrolimus. The population parameters were determined in 77 patients (learning study). The AUC estimation methods were compared in the learning population and in 73 patients from another center (validation study). In the latter study, the reference AUCs were estimated by the trapezoidal rule on 8 measurements. MPA concentrations were measured by liquid chromatography. The gamma absorption model gave the best fit. In the learning study, the AUCs estimated by both Bayesian methods were very similar, whereas the multilinear approach was highly correlated but yielded estimates about 20% lower than Bayesian methods. This resulted in dosing recommendations differing by 250 mg/12 h or more in 27% of cases. In the validation study, AUC estimates based on the Bayesian method with gamma absorption model and multilinear regression approach model were, respectively, 12% higher and 7% lower than the reference values. To conclude, the bicompartmental model with gamma absorption rate gave the best fit. The 3 AUC estimation methods are highly correlated but not concordant. For a given patient, the same estimation method should always be used.
Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith
2018-01-02
Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embeded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models. It both provides insight into model behaviour and identifies the sets of rate parameters of interest.
NASA Astrophysics Data System (ADS)
Hernández-López, Mario R.; Romero-Cuéllar, Jonathan; Camilo Múnera-Estrada, Juan; Coccia, Gabriele; Francés, Félix
2017-04-01
It is noticeably important to emphasize the role of uncertainty particularly when the model forecasts are used to support decision-making and water management. This research compares two approaches for the evaluation of the predictive uncertainty in hydrological modeling. First approach is the Bayesian Joint Inference of hydrological and error models. Second approach is carried out through the Model Conditional Processor using the Truncated Normal Distribution in the transformed space. This comparison is focused on the predictive distribution reliability. The case study is applied to two basins included in the Model Parameter Estimation Experiment (MOPEX). These two basins, which have different hydrological complexity, are the French Broad River (North Carolina) and the Guadalupe River (Texas). The results indicate that generally, both approaches are able to provide similar predictive performances. However, the differences between them can arise in basins with complex hydrology (e.g. ephemeral basins). This is because obtained results with Bayesian Joint Inference are strongly dependent on the suitability of the hypothesized error model. Similarly, the results in the case of the Model Conditional Processor are mainly influenced by the selected model of tails or even by the selected full probability distribution model of the data in the real space, and by the definition of the Truncated Normal Distribution in the transformed space. In summary, the different hypotheses that the modeler choose on each of the two approaches are the main cause of the different results. This research also explores a proper combination of both methodologies which could be useful to achieve less biased hydrological parameter estimation. For this approach, firstly the predictive distribution is obtained through the Model Conditional Processor. Secondly, this predictive distribution is used to derive the corresponding additive error model which is employed for the hydrological parameter estimation with the Bayesian Joint Inference methodology.
NASA Astrophysics Data System (ADS)
Fer, I.; Kelly, R.; Andrews, T.; Dietze, M.; Richardson, A. D.
2016-12-01
Our ability to forecast ecosystems is limited by how well we parameterize ecosystem models. Direct measurements for all model parameters are not always possible and inverse estimation of these parameters through Bayesian methods is computationally costly. A solution to computational challenges of Bayesian calibration is to approximate the posterior probability surface using a Gaussian Process that emulates the complex process-based model. Here we report the integration of this method within an ecoinformatics toolbox, Predictive Ecosystem Analyzer (PEcAn), and its application with two ecosystem models: SIPNET and ED2.1. SIPNET is a simple model, allowing application of MCMC methods both to the model itself and to its emulator. We used both approaches to assimilate flux (CO2 and latent heat), soil respiration, and soil carbon data from Bartlett Experimental Forest. This comparison showed that emulator is reliable in terms of convergence to the posterior distribution. A 10000-iteration MCMC analysis with SIPNET itself required more than two orders of magnitude greater computation time than an MCMC run of same length with its emulator. This difference would be greater for a more computationally demanding model. Validation of the emulator-calibrated SIPNET against both the assimilated data and out-of-sample data showed improved fit and reduced uncertainty around model predictions. We next applied the validated emulator method to the ED2, whose complexity precludes standard Bayesian data assimilation. We used the ED2 emulator to assimilate demographic data from a network of inventory plots. For validation of the calibrated ED2, we compared the model to results from Empirical Succession Mapping (ESM), a novel synthesis of successional patterns in Forest Inventory and Analysis data. Our results revealed that while the pre-assimilation ED2 formulation cannot capture the emergent demographic patterns from ESM analysis, constrained model parameters controlling demographic processes increased their agreement considerably.
Fuzzy Naive Bayesian model for medical diagnostic decision support.
Wagholikar, Kavishwar B; Vijayraghavan, Sundararajan; Deshpande, Ashok W
2009-01-01
This work relates to the development of computational algorithms to provide decision support to physicians. The authors propose a Fuzzy Naive Bayesian (FNB) model for medical diagnosis, which extends the Fuzzy Bayesian approach proposed by Okuda. A physician's interview based method is described to define a orthogonal fuzzy symptom information system, required to apply the model. For the purpose of elaboration and elicitation of characteristics, the algorithm is applied to a simple simulated dataset, and compared with conventional Naive Bayes (NB) approach. As a preliminary evaluation of FNB in real world scenario, the comparison is repeated on a real fuzzy dataset of 81 patients diagnosed with infectious diseases. The case study on simulated dataset elucidates that FNB can be optimal over NB for diagnosing patients with imprecise-fuzzy information, on account of the following characteristics - 1) it can model the information that, values of some attributes are semantically closer than values of other attributes, and 2) it offers a mechanism to temper exaggerations in patient information. Although the algorithm requires precise training data, its utility for fuzzy training data is argued for. This is supported by the case study on infectious disease dataset, which indicates optimality of FNB over NB for the infectious disease domain. Further case studies on large datasets are required to establish utility of FNB.
A Bayesian Nonparametric Approach to Test Equating
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2009-01-01
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…
Model Diagnostics for Bayesian Networks
ERIC Educational Resources Information Center
Sinharay, Sandip
2006-01-01
Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of works on assessing fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A…
The decisive future of inflation
NASA Astrophysics Data System (ADS)
Hardwick, Robert J.; Vennin, Vincent; Wands, David
2018-05-01
How much more will we learn about single-field inflationary models in the future? We address this question in the context of Bayesian design and information theory. We develop a novel method to compute the expected utility of deciding between models and apply it to a set of futuristic measurements. This necessarily requires one to evaluate the Bayesian evidence many thousands of times over, which is numerically challenging. We show how this can be done using a number of simplifying assumptions and discuss their validity. We also modify the form of the expected utility, as previously introduced in the literature in different contexts, in order to partition each possible future into either the rejection of models at the level of the maximum likelihood or the decision between models using Bayesian model comparison. We then quantify the ability of future experiments to constrain the reheating temperature and the scalar running. Our approach allows us to discuss possible strategies for maximising information from future cosmological surveys. In particular, our conclusions suggest that, in the context of inflationary model selection, a decrease in the measurement uncertainty of the scalar spectral index would be more decisive than a decrease in the uncertainty in the tensor-to-scalar ratio. We have incorporated our approach into a publicly available python class, foxi,1 that can be readily applied to any survey optimisation problem.
Uncertainty plus prior equals rational bias: an intuitive Bayesian probability weighting function.
Fennell, John; Baddeley, Roland
2012-10-01
Empirical research has shown that when making choices based on probabilistic options, people behave as if they overestimate small probabilities, underestimate large probabilities, and treat positive and negative outcomes differently. These distortions have been modeled using a nonlinear probability weighting function, which is found in several nonexpected utility theories, including rank-dependent models and prospect theory; here, we propose a Bayesian approach to the probability weighting function and, with it, a psychological rationale. In the real world, uncertainty is ubiquitous and, accordingly, the optimal strategy is to combine probability statements with prior information using Bayes' rule. First, we show that any reasonable prior on probabilities leads to 2 of the observed effects; overweighting of low probabilities and underweighting of high probabilities. We then investigate 2 plausible kinds of priors: informative priors based on previous experience and uninformative priors of ignorance. Individually, these priors potentially lead to large problems of bias and inefficiency, respectively; however, when combined using Bayesian model comparison methods, both forms of prior can be applied adaptively, gaining the efficiency of empirical priors and the robustness of ignorance priors. We illustrate this for the simple case of generic good and bad options, using Internet blogs to estimate the relevant priors of inference. Given this combined ignorant/informative prior, the Bayesian probability weighting function is not only robust and efficient but also matches all of the major characteristics of the distortions found in empirical research. PsycINFO Database Record (c) 2012 APA, all rights reserved.
NASA Technical Reports Server (NTRS)
Caulfield, John; Crosson, William L.; Inguva, Ramarao; Laymon, Charles A.; Schamschula, Marius
1998-01-01
This is a followup on the preceding presentation by Crosson and Schamschula. The grid size for remote microwave measurements is much coarser than the hydrological model computational grids. To validate the hydrological models with measurements we propose mechanisms to disaggregate the microwave measurements to allow comparison with outputs from the hydrological models. Weighted interpolation and Bayesian methods are proposed to facilitate the comparison. While remote measurements occur at a large scale, they reflect underlying small-scale features. We can give continuing estimates of the small scale features by correcting the simple 0th-order, starting with each small-scale model with each large-scale measurement using a straightforward method based on Kalman filtering.
DOE Office of Scientific and Technical Information (OSTI.GOV)
La Russa, D
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributionsmore » found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10{sup 7} cm−{sup 3}. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.« less
Context Relevant Prediction Model for COPD Domain Using Bayesian Belief Network
Saleh, Lokman; Ajami, Hicham; Mili, Hafedh
2017-01-01
In the last three decades, researchers have examined extensively how context-aware systems can assist people, specifically those suffering from incurable diseases, to help them cope with their medical illness. Over the years, a huge number of studies on Chronic Obstructive Pulmonary Disease (COPD) have been published. However, how to derive relevant attributes and early detection of COPD exacerbations remains a challenge. In this research work, we will use an efficient algorithm to select relevant attributes where there is no proper approach in this domain. Such algorithm predicts exacerbations with high accuracy by adding discretization process, and organizes the pertinent attributes in priority order based on their impact to facilitate the emergency medical treatment. In this paper, we propose an extension of our existing Helper Context-Aware Engine System (HCES) for COPD. This project uses Bayesian network algorithm to depict the dependency between the COPD symptoms (attributes) in order to overcome the insufficiency and the independency hypothesis of naïve Bayesian. In addition, the dependency in Bayesian network is realized using TAN algorithm rather than consulting pneumologists. All these combined algorithms (discretization, selection, dependency, and the ordering of the relevant attributes) constitute an effective prediction model, comparing to effective ones. Moreover, an investigation and comparison of different scenarios of these algorithms are also done to verify which sequence of steps of prediction model gives more accurate results. Finally, we designed and validated a computer-aided support application to integrate different steps of this model. The findings of our system HCES has shown promising results using Area Under Receiver Operating Characteristic (AUC = 81.5%). PMID:28644419
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
A Study of Bayesian Estimation and Comparison of Response Time Models in Item Response Theory
ERIC Educational Resources Information Center
Suh, Hongwook
2010-01-01
Response time has been regarded as an important source for investigating the relationship between human performance and response speed. It is important to examine the relationship between response time and item characteristics, especially in the perspective of the relationship between response time and various factors that affect examinee's…
ERIC Educational Resources Information Center
Martin-Fernandez, Manuel; Revuelta, Javier
2017-01-01
This study compares the performance of two estimation algorithms of new usage, the Metropolis-Hastings Robins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two consolidated algorithms in the psychometric literature, the marginal likelihood via EM algorithm (MML-EM) and the Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…
ERIC Educational Resources Information Center
Lee, Soo; Suh, Youngsuk
2018-01-01
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
BATSE gamma-ray burst line search. 2: Bayesian consistency methodology
NASA Technical Reports Server (NTRS)
Band, D. L.; Ford, L. A.; Matteson, J. L.; Briggs, M.; Paciesas, W.; Pendleton, G.; Preece, R.; Palmer, D.; Teegarden, B.; Schaefer, B.
1994-01-01
We describe a Bayesian methodology to evaluate the consistency between the reported Ginga and Burst and Transient Source Experiment (BATSE) detections of absorption features in gamma-ray burst spectra. Currently no features have been detected by BATSE, but this methodology will still be applicable if and when such features are discovered. The Bayesian methodology permits the comparison of hypotheses regarding the two detectors' observations and makes explicit the subjective aspects of our analysis (e.g., the quantification of our confidence in detector performance). We also present non-Bayesian consistency statistics. Based on preliminary calculations of line detectability, we find that both the Bayesian and non-Bayesian techniques show that the BATSE and Ginga observations are consistent given our understanding of these detectors.
Saha, Dibakar; Alluri, Priyanka; Gan, Albert
2017-01-01
The Highway Safety Manual (HSM) presents statistical models to quantitatively estimate an agency's safety performance. The models were developed using data from only a few U.S. states. To account for the effects of the local attributes and temporal factors on crash occurrence, agencies are required to calibrate the HSM-default models for crash predictions. The manual suggests updating calibration factors every two to three years, or preferably on an annual basis. Given that the calibration process involves substantial time, effort, and resources, a comprehensive analysis of the required calibration factor update frequency is valuable to the agencies. Accordingly, the objective of this study is to evaluate the HSM's recommendation and determine the required frequency of calibration factor updates. A robust Bayesian estimation procedure is used to assess the variation between calibration factors computed annually, biennially, and triennially using data collected from over 2400 miles of segments and over 700 intersections on urban and suburban facilities in Florida. Bayesian model yields a posterior distribution of the model parameters that give credible information to infer whether the difference between calibration factors computed at specified intervals is credibly different from the null value which represents unaltered calibration factors between the comparison years or in other words, zero difference. The concept of the null value is extended to include the range of values that are practically equivalent to zero. Bayesian inference shows that calibration factors based on total crash frequency are required to be updated every two years in cases where the variations between calibration factors are not greater than 0.01. When the variations are between 0.01 and 0.05, calibration factors based on total crash frequency could be updated every three years. Copyright © 2016 Elsevier Ltd. All rights reserved.
Approximate Bayesian Computation in the estimation of the parameters of the Forbush decrease model
NASA Astrophysics Data System (ADS)
Wawrzynczak, A.; Kopka, P.
2017-12-01
Realistic modeling of the complicated phenomena as Forbush decrease of the galactic cosmic ray intensity is a quite challenging task. One aspect is a numerical solution of the Fokker-Planck equation in five-dimensional space (three spatial variables, the time and particles energy). The second difficulty arises from a lack of detailed knowledge about the spatial and time profiles of the parameters responsible for the creation of the Forbush decrease. Among these parameters, the central role plays a diffusion coefficient. Assessment of the correctness of the proposed model can be done only by comparison of the model output with the experimental observations of the galactic cosmic ray intensity. We apply the Approximate Bayesian Computation (ABC) methodology to match the Forbush decrease model to experimental data. The ABC method is becoming increasing exploited for dynamic complex problems in which the likelihood function is costly to compute. The main idea of all ABC methods is to accept samples as an approximate posterior draw if its associated modeled data are close enough to the observed one. In this paper, we present application of the Sequential Monte Carlo Approximate Bayesian Computation algorithm scanning the space of the diffusion coefficient parameters. The proposed algorithm is adopted to create the model of the Forbush decrease observed by the neutron monitors at the Earth in March 2002. The model of the Forbush decrease is based on the stochastic approach to the solution of the Fokker-Planck equation.
Valence-Dependent Belief Updating: Computational Validation
Kuzmanovic, Bojana; Rigoux, Lionel
2017-01-01
People tend to update beliefs about their future outcomes in a valence-dependent way: they are likely to incorporate good news and to neglect bad news. However, belief formation is a complex process which depends not only on motivational factors such as the desire for favorable conclusions, but also on multiple cognitive variables such as prior beliefs, knowledge about personal vulnerabilities and resources, and the size of the probabilities and estimation errors. Thus, we applied computational modeling in order to test for valence-induced biases in updating while formally controlling for relevant cognitive factors. We compared biased and unbiased Bayesian models of belief updating, and specified alternative models based on reinforcement learning. The experiment consisted of 80 trials with 80 different adverse future life events. In each trial, participants estimated the base rate of one of these events and estimated their own risk of experiencing the event before and after being confronted with the actual base rate. Belief updates corresponded to the difference between the two self-risk estimates. Valence-dependent updating was assessed by comparing trials with good news (better-than-expected base rates) with trials with bad news (worse-than-expected base rates). After receiving bad relative to good news, participants' updates were smaller and deviated more strongly from rational Bayesian predictions, indicating a valence-induced bias. Model comparison revealed that the biased (i.e., optimistic) Bayesian model of belief updating better accounted for data than the unbiased (i.e., rational) Bayesian model, confirming that the valence of the new information influenced the amount of updating. Moreover, alternative computational modeling based on reinforcement learning demonstrated higher learning rates for good than for bad news, as well as a moderating role of personal knowledge. Finally, in this specific experimental context, the approach based on reinforcement learning was superior to the Bayesian approach. The computational validation of valence-dependent belief updating represents a novel support for a genuine optimism bias in human belief formation. Moreover, the precise control of relevant cognitive variables justifies the conclusion that the motivation to adopt the most favorable self-referential conclusions biases human judgments. PMID:28706499
Valence-Dependent Belief Updating: Computational Validation.
Kuzmanovic, Bojana; Rigoux, Lionel
2017-01-01
People tend to update beliefs about their future outcomes in a valence-dependent way: they are likely to incorporate good news and to neglect bad news. However, belief formation is a complex process which depends not only on motivational factors such as the desire for favorable conclusions, but also on multiple cognitive variables such as prior beliefs, knowledge about personal vulnerabilities and resources, and the size of the probabilities and estimation errors. Thus, we applied computational modeling in order to test for valence-induced biases in updating while formally controlling for relevant cognitive factors. We compared biased and unbiased Bayesian models of belief updating, and specified alternative models based on reinforcement learning. The experiment consisted of 80 trials with 80 different adverse future life events. In each trial, participants estimated the base rate of one of these events and estimated their own risk of experiencing the event before and after being confronted with the actual base rate. Belief updates corresponded to the difference between the two self-risk estimates. Valence-dependent updating was assessed by comparing trials with good news (better-than-expected base rates) with trials with bad news (worse-than-expected base rates). After receiving bad relative to good news, participants' updates were smaller and deviated more strongly from rational Bayesian predictions, indicating a valence-induced bias. Model comparison revealed that the biased (i.e., optimistic) Bayesian model of belief updating better accounted for data than the unbiased (i.e., rational) Bayesian model, confirming that the valence of the new information influenced the amount of updating. Moreover, alternative computational modeling based on reinforcement learning demonstrated higher learning rates for good than for bad news, as well as a moderating role of personal knowledge. Finally, in this specific experimental context, the approach based on reinforcement learning was superior to the Bayesian approach. The computational validation of valence-dependent belief updating represents a novel support for a genuine optimism bias in human belief formation. Moreover, the precise control of relevant cognitive variables justifies the conclusion that the motivation to adopt the most favorable self-referential conclusions biases human judgments.
Hailer, Frank; Kutschera, Verena E; Hallström, Björn M; Fain, Steven R; Leonard, Jennifer A; Arnason, Ulfur; Janke, Axel
2013-03-29
Nakagome et al. reanalyzed some of our data and assert that we cannot refute the mitochondrial DNA-based scenario for polar bear evolution. Their single-locus test statistic is strongly affected by introgression and incomplete lineage sorting, whereas our multilocus approaches are better suited to recover the true species relationships. Indeed, our sister-lineage model receives high support in a Bayesian model comparison.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.
2013-01-01
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
Wang, Ling; Abdel-Aty, Mohamed; Wang, Xuesong; Yu, Rongjie
2018-02-01
There have been plenty of traffic safety studies based on average daily traffic (ADT), average hourly traffic (AHT), or microscopic traffic at 5 min intervals. Nevertheless, not enough research has compared the performance of these three types of safety studies, and seldom of previous studies have intended to find whether the results of one type of study is transferable to the other two studies. First, this study built three models: a Bayesian Poisson-lognormal model to estimate the daily crash frequency using ADT, a Bayesian Poisson-lognormal model to estimate the hourly crash frequency using AHT, and a Bayesian logistic regression model for the real-time safety analysis using microscopic traffic. The model results showed that the crash contributing factors found by different models were comparable but not the same. Four variables, i.e., the logarithm of volume, the standard deviation of speed, the logarithm of segment length, and the existence of diverge segment, were positively significant in the three models. Additionally, weaving segments experienced higher daily and hourly crash frequencies than merge and basic segments. Then, each of the ADT-based, AHT-based, and real-time models was used to estimate safety conditions at different levels: daily and hourly, meanwhile, the real-time model was also used in 5 min intervals. The results uncovered that the ADT- and AHT-based safety models performed similar in predicting daily and hourly crash frequencies, and the real-time safety model was able to provide hourly crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Real medical benefit assessed by indirect comparison.
Falissard, Bruno; Zylberman, Myriam; Cucherat, Michel; Izard, Valérie; Meyer, François
2009-01-01
Frequently, in data packages submitted for Marketing Approval to the CHMP, there is a lack of relevant head-to-head comparisons of medicinal products that could enable national authorities responsible for the approval of reimbursement to assess the Added Therapeutic Value (ASMR) of new clinical entities or line extensions of existing therapies.Indirect or mixed treatment comparisons (MTC) are methods stemming from the field of meta-analysis that have been designed to tackle this problem. Adjusted indirect comparisons, meta-regressions, mixed models, Bayesian network analyses pool results of randomised controlled trials (RCTs), enabling a quantitative synthesis.The REAL procedure, recently developed by the HAS (French National Authority for Health), is a mixture of an MTC and effect model based on expert opinions. It is intended to translate the efficacy observed in the trials into effectiveness expected in day-to-day clinical practice in France.
Bayesian model reduction and empirical Bayes for group (DCM) studies
Friston, Karl J.; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E.; van Wijk, Bernadette C.M.; Ziegler, Gabriel; Zeidman, Peter
2016-01-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level – e.g., dynamic causal models – and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
The Stochastic Early Reaction, Inhibition, and late Action (SERIA) model for antisaccades
2017-01-01
The antisaccade task is a classic paradigm used to study the voluntary control of eye movements. It requires participants to suppress a reactive eye movement to a visual target and to concurrently initiate a saccade in the opposite direction. Although several models have been proposed to explain error rates and reaction times in this task, no formal model comparison has yet been performed. Here, we describe a Bayesian modeling approach to the antisaccade task that allows us to formally compare different models on the basis of their evidence. First, we provide a formal likelihood function of actions (pro- and antisaccades) and reaction times based on previously published models. Second, we introduce the Stochastic Early Reaction, Inhibition, and late Action model (SERIA), a novel model postulating two different mechanisms that interact in the antisaccade task: an early GO/NO-GO race decision process and a late GO/GO decision process. Third, we apply these models to a data set from an experiment with three mixed blocks of pro- and antisaccade trials. Bayesian model comparison demonstrates that the SERIA model explains the data better than competing models that do not incorporate a late decision process. Moreover, we show that the early decision process postulated by the SERIA model is, to a large extent, insensitive to the cue presented in a single trial. Finally, we use parameter estimates to demonstrate that changes in reaction time and error rate due to the probability of a trial type (pro- or antisaccade) are best explained by faster or slower inhibition and the probability of generating late voluntary prosaccades. PMID:28767650
Bartlett, Jonathan W; Keogh, Ruth H
2018-06-01
Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.
NASA Astrophysics Data System (ADS)
Tang, Shaolei; Yang, Xiaofeng; Dong, Di; Li, Ziwei
2015-12-01
Sea surface temperature (SST) is an important variable for understanding interactions between the ocean and the atmosphere. SST fusion is crucial for acquiring SST products of high spatial resolution and coverage. This study introduces a Bayesian maximum entropy (BME) method for blending daily SSTs from multiple satellite sensors. A new spatiotemporal covariance model of an SST field is built to integrate not only single-day SSTs but also time-adjacent SSTs. In addition, AVHRR 30-year SST climatology data are introduced as soft data at the estimation points to improve the accuracy of blended results within the BME framework. The merged SSTs, with a spatial resolution of 4 km and a temporal resolution of 24 hours, are produced in the Western Pacific Ocean region to demonstrate and evaluate the proposed methodology. Comparisons with in situ drifting buoy observations show that the merged SSTs are accurate and the bias and root-mean-square errors for the comparison are 0.15°C and 0.72°C, respectively.
The Dopaminergic Midbrain Encodes the Expected Certainty about Desired Outcomes.
Schwartenbeck, Philipp; FitzGerald, Thomas H B; Mathys, Christoph; Dolan, Ray; Friston, Karl
2015-10-01
Dopamine plays a key role in learning; however, its exact function in decision making and choice remains unclear. Recently, we proposed a generic model based on active (Bayesian) inference wherein dopamine encodes the precision of beliefs about optimal policies. Put simply, dopamine discharges reflect the confidence that a chosen policy will lead to desired outcomes. We designed a novel task to test this hypothesis, where subjects played a "limited offer" game in a functional magnetic resonance imaging experiment. Subjects had to decide how long to wait for a high offer before accepting a low offer, with the risk of losing everything if they waited too long. Bayesian model comparison showed that behavior strongly supported active inference, based on surprise minimization, over classical utility maximization schemes. Furthermore, midbrain activity, encompassing dopamine projection neurons, was accurately predicted by trial-by-trial variations in model-based estimates of precision. Our findings demonstrate that human subjects infer both optimal policies and the precision of those inferences, and thus support the notion that humans perform hierarchical probabilistic Bayesian inference. In other words, subjects have to infer both what they should do as well as how confident they are in their choices, where confidence may be encoded by dopaminergic firing. © The Author 2014. Published by Oxford University Press.
The Dopaminergic Midbrain Encodes the Expected Certainty about Desired Outcomes
Schwartenbeck, Philipp; FitzGerald, Thomas H. B.; Mathys, Christoph; Dolan, Ray; Friston, Karl
2015-01-01
Dopamine plays a key role in learning; however, its exact function in decision making and choice remains unclear. Recently, we proposed a generic model based on active (Bayesian) inference wherein dopamine encodes the precision of beliefs about optimal policies. Put simply, dopamine discharges reflect the confidence that a chosen policy will lead to desired outcomes. We designed a novel task to test this hypothesis, where subjects played a “limited offer” game in a functional magnetic resonance imaging experiment. Subjects had to decide how long to wait for a high offer before accepting a low offer, with the risk of losing everything if they waited too long. Bayesian model comparison showed that behavior strongly supported active inference, based on surprise minimization, over classical utility maximization schemes. Furthermore, midbrain activity, encompassing dopamine projection neurons, was accurately predicted by trial-by-trial variations in model-based estimates of precision. Our findings demonstrate that human subjects infer both optimal policies and the precision of those inferences, and thus support the notion that humans perform hierarchical probabilistic Bayesian inference. In other words, subjects have to infer both what they should do as well as how confident they are in their choices, where confidence may be encoded by dopaminergic firing. PMID:25056572
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Comparing models of the periodic variations in spin-down and beamwidth for PSR B1828-11
NASA Astrophysics Data System (ADS)
Ashton, G.; Jones, D. I.; Prix, R.
2016-05-01
We build a framework using tools from Bayesian data analysis to evaluate models explaining the periodic variations in spin-down and beamwidth of PSR B1828-11. The available data consist of the time-averaged spin-down rate, which displays a distinctive double-peaked modulation, and measurements of the beamwidth. Two concepts exist in the literature that are capable of explaining these variations; we formulate predictive models from these and quantitatively compare them. The first concept is phenomenological and stipulates that the magnetosphere undergoes periodic switching between two metastable states as first suggested by Lyne et al. The second concept, precession, was first considered as a candidate for the modulation of B1828-11 by Stairs et al. We quantitatively compare models built from these concepts using a Bayesian odds ratio. Because the phenomenological switching model itself was informed by these data in the first place, it is difficult to specify appropriate parameter-space priors that can be trusted for an unbiased model comparison. Therefore, we first perform a parameter estimation using the spin-down data, and then use the resulting posterior distributions as priors for model comparison on the beamwidth data. We find that a precession model with a simple circular Gaussian beam geometry fails to appropriately describe the data, while allowing for a more general beam geometry provides a good fit to the data. The resulting odds between the precession model (with a general beam geometry) and the switching model are estimated as 102.7±0.5 in favour of the precession model.
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
Bayesian analysis of multiple direct detection experiments
NASA Astrophysics Data System (ADS)
Arina, Chiara
2014-12-01
Bayesian methods offer a coherent and efficient framework for implementing uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in isospin violating scenario, while elastic scattering model appears to be compatible with the frequentist statistical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding that there is weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for their excessive complexity. Secondly, we confront several coherent scattering models to determine the current best physical scenario compatible with the experimental hints. We find that exothermic and inelastic dark matter are moderatly disfavored against the elastic scenario, while the isospin violating model has a similar evidence. Lastly the Bayes' factor gives inconclusive evidence for an incompatibility between the data sets of XENON100 and the hints of detection. The same question assessed with goodness of fit would indicate a 2 σ discrepancy. This suggests that more data are therefore needed to settle this question.
Bayesian Inference for Signal-Based Seismic Monitoring
NASA Astrophysics Data System (ADS)
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Skew-t partially linear mixed-effects models for AIDS clinical studies.
Lu, Tao
2016-01-01
We propose partially linear mixed-effects models with asymmetry and missingness to investigate the relationship between two biomarkers in clinical studies. The proposed models take into account irregular time effects commonly observed in clinical studies under a semiparametric model framework. In addition, commonly assumed symmetric distributions for model errors are substituted by asymmetric distribution to account for skewness. Further, informative missing data mechanism is accounted for. A Bayesian approach is developed to perform parameter estimation simultaneously. The proposed model and method are applied to an AIDS dataset and comparisons with alternative models are performed.
Wang, Tianli; Baron, Kyle; Zhong, Wei; Brundage, Richard; Elmquist, William
2014-03-01
The current study presents a Bayesian approach to non-compartmental analysis (NCA), which provides the accurate and precise estimate of AUC 0 (∞) and any AUC 0 (∞) -based NCA parameter or derivation. In order to assess the performance of the proposed method, 1,000 simulated datasets were generated in different scenarios. A Bayesian method was used to estimate the tissue and plasma AUC 0 (∞) s and the tissue-to-plasma AUC 0 (∞) ratio. The posterior medians and the coverage of 95% credible intervals for the true parameter values were examined. The method was applied to laboratory data from a mice brain distribution study with serial sacrifice design for illustration. Bayesian NCA approach is accurate and precise in point estimation of the AUC 0 (∞) and the partition coefficient under a serial sacrifice design. It also provides a consistently good variance estimate, even considering the variability of the data and the physiological structure of the pharmacokinetic model. The application in the case study obtained a physiologically reasonable posterior distribution of AUC, with a posterior median close to the value estimated by classic Bailer-type methods. This Bayesian NCA approach for sparse data analysis provides statistical inference on the variability of AUC 0 (∞) -based parameters such as partition coefficient and drug targeting index, so that the comparison of these parameters following destructive sampling becomes statistically feasible.
2003-01-01
adapted from Kass and Rafferty (1995) and Congdon (2001). Page 10 of 57 density adjusted for resin content, z, since resin contributes to the density...c.f.: Congdon , 2001). How to Download the WinBUGS Software Package BUGS was originally a statistical research project at the Medical Research...Likelihood Estimation,” July 2002, working paper to be published. 18) Congdon , Peter, Bayesian Statistical Modeling, Wiley, 2001 19) Cox, D. R. and
Optimal modeling of 1D azimuth correlations in the context of Bayesian inference
NASA Astrophysics Data System (ADS)
De Kock, Michiel B.; Eggers, Hans C.; Trainor, Thomas A.
2015-09-01
Analysis and interpretation of spectrum and correlation data from high-energy nuclear collisions is currently controversial because two opposing physics narratives derive contradictory implications from the same data, one narrative claiming collision dynamics is dominated by dijet production and projectile-nucleon fragmentation, the other claiming collision dynamics is dominated by a dense, flowing QCD medium. Opposing interpretations seem to be supported by alternative data models, and current model-comparison schemes are unable to distinguish between them. There is clearly need for a convincing new methodology to break the deadlock. In this study we introduce Bayesian inference (BI) methods applied to angular correlation data as a basis to evaluate competing data models. For simplicity the data considered are projections of two-dimensional (2D) angular correlations onto a 1D azimuth from three centrality classes of 200-GeV Au-Au collisions. We consider several data models typical of current model choices, including Fourier series (FS) and a Gaussian plus various combinations of individual cosine components. We evaluate model performance with BI methods and with power-spectrum analysis. We find that FS-only models are rejected in all cases by Bayesian analysis, which always prefers a Gaussian. A cylindrical quadrupole cos(2 ϕ ) is required in some cases but rejected for 0%-5%-central Au-Au collisions. Given a Gaussian centered at the azimuth origin, "higher harmonics" cos(m ϕ ) for m >2 are rejected. A model consisting of Gaussian +dipole cos(ϕ )+quadrupole cos(2 ϕ ) provides good 1D data descriptions in all cases.
Bayesian model reduction and empirical Bayes for group (DCM) studies.
Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter
2016-03-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Minunno, F.; Peltoniemi, M.; Launiainen, S.; Aurela, M.; Lindroth, A.; Lohila, A.; Mammarella, I.; Minkkinen, K.; Mäkelä, A.
2015-07-01
The problem of model complexity has been lively debated in environmental sciences as well as in the forest modelling community. Simple models are less input demanding and their calibration involves a lower number of parameters, but they might be suitable only at local scale. In this work we calibrated a simplified ecosystem process model (PRELES) to data from multiple sites and we tested if PRELES can be used at regional scale to estimate the carbon and water fluxes of Boreal conifer forests. We compared a multi-site (M-S) with site-specific (S-S) calibrations. Model calibrations and evaluations were carried out by the means of the Bayesian method; Bayesian calibration (BC) and Bayesian model comparison (BMC) were used to quantify the uncertainty in model parameters and model structure. To evaluate model performances BMC results were combined with more classical analysis of model-data mismatch (M-DM). Evapotranspiration (ET) and gross primary production (GPP) measurements collected in 10 sites of Finland and Sweden were used in the study. Calibration results showed that similar estimates were obtained for the parameters at which model outputs are most sensitive. No significant differences were encountered in the predictions of the multi-site and site-specific versions of PRELES with exception of a site with agricultural history (Alkkia). Although PRELES predicted GPP better than evapotranspiration, we concluded that the model can be reliably used at regional scale to simulate carbon and water fluxes of Boreal forests. Our analyses underlined also the importance of using long and carefully collected flux datasets in model calibration. In fact, even a single site can provide model calibrations that can be applied at a wider spatial scale, since it covers a wide range of variability in climatic conditions.
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.
Bayesian Variable Selection for Hierarchical Gene-Environment and Gene-Gene Interactions
Liu, Changlu; Ma, Jianzhong; Amos, Christopher I.
2014-01-01
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions and gene by environment interactions in the same model. Our approach incorporates the natural hierarchical structure between the main effects and interaction effects into a mixture model, such that our methods tend to remove the irrelevant interaction effects more effectively, resulting in more robust and parsimonious models. We consider both strong and weak hierarchical models. For a strong hierarchical model, both of the main effects between interacting factors must be present for the interactions to be considered in the model development, while for a weak hierarchical model, only one of the two main effects is required to be present for the interaction to be evaluated. Our simulation results show that the proposed strong and weak hierarchical mixture models work well in controlling false positive rates and provide a powerful approach for identifying the predisposing effects and interactions in gene-environment interaction studies, in comparison with the naive model that does not impose this hierarchical constraint in most of the scenarios simulated. We illustrated our approach using data for lung cancer and cutaneous melanoma. PMID:25154630
Bayesian Data-Model Fit Assessment for Structural Equation Modeling
ERIC Educational Resources Information Center
Levy, Roy
2011-01-01
Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…
Commeau, Natalie; Cornu, Marie; Albert, Isabelle; Denis, Jean-Baptiste; Parent, Eric
2012-03-01
Assessing within-batch and between-batch variability is of major interest for risk assessors and risk managers in the context of microbiological contamination of food. For example, the ratio between the within-batch variability and the between-batch variability has a large impact on the results of a sampling plan. Here, we designed hierarchical Bayesian models to represent such variability. Compatible priors were built mathematically to obtain sound model comparisons. A numeric criterion is proposed to assess the contamination structure comparing the ability of the models to replicate grouped data at the batch level using a posterior predictive loss approach. Models were applied to two case studies: contamination by Listeria monocytogenes of pork breast used to produce diced bacon and contamination by the same microorganism on cold smoked salmon at the end of the process. In the first case study, a contamination structure clearly exists and is located at the batch level, that is, between batches variability is relatively strong, whereas in the second a structure also exists but is less marked. © 2012 Society for Risk Analysis.
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose?response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose?response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparveriuss) exposed to various sublethal dietary concentrations of methylmercury.
Kireeva, N; Baskin, I I; Gaspar, H A; Horvath, D; Marcou, G; Varnek, A
2012-04-01
Here, the utility of Generative Topographic Maps (GTM) for data visualization, structure-activity modeling and database comparison is evaluated, on hand of subsets of the Database of Useful Decoys (DUD). Unlike other popular dimensionality reduction approaches like Principal Component Analysis, Sammon Mapping or Self-Organizing Maps, the great advantage of GTMs is providing data probability distribution functions (PDF), both in the high-dimensional space defined by molecular descriptors and in 2D latent space. PDFs for the molecules of different activity classes were successfully used to build classification models in the framework of the Bayesian approach. Because PDFs are represented by a mixture of Gaussian functions, the Bhattacharyya kernel has been proposed as a measure of the overlap of datasets, which leads to an elegant method of global comparison of chemical libraries. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Application Bayesian Model Averaging method for ensemble system for Poland
NASA Astrophysics Data System (ADS)
Guzikowski, Jakub; Czerwinska, Agnieszka
2014-05-01
The aim of the project is to evaluate methods for generating numerical ensemble weather prediction using a meteorological data from The Weather Research & Forecasting Model and calibrating this data by means of Bayesian Model Averaging (WRF BMA) approach. We are constructing height resolution short range ensemble forecasts using meteorological data (temperature) generated by nine WRF's models. WRF models have 35 vertical levels and 2.5 km x 2.5 km horizontal resolution. The main emphasis is that the used ensemble members has a different parameterization of the physical phenomena occurring in the boundary layer. To calibrate an ensemble forecast we use Bayesian Model Averaging (BMA) approach. The BMA predictive Probability Density Function (PDF) is a weighted average of predictive PDFs associated with each individual ensemble member, with weights that reflect the member's relative skill. For test we chose a case with heat wave and convective weather conditions in Poland area from 23th July to 1st August 2013. From 23th July to 29th July 2013 temperature oscillated below or above 30 Celsius degree in many meteorology stations and new temperature records were added. During this time the growth of the hospitalized patients with cardiovascular system problems was registered. On 29th July 2013 an advection of moist tropical air masses was recorded in the area of Poland causes strong convection event with mesoscale convection system (MCS). MCS caused local flooding, damage to the transport infrastructure, destroyed buildings, trees and injuries and direct threat of life. Comparison of the meteorological data from ensemble system with the data recorded on 74 weather stations localized in Poland is made. We prepare a set of the model - observations pairs. Then, the obtained data from single ensemble members and median from WRF BMA system are evaluated on the basis of the deterministic statistical error Root Mean Square Error (RMSE), Mean Absolute Error (MAE). To evaluation probabilistic data The Brier Score (BS) and Continuous Ranked Probability Score (CRPS) were used. Finally comparison between BMA calibrated data and data from ensemble members will be displayed.
NASA Astrophysics Data System (ADS)
Love, C. A.; Skahill, B. E.; AghaKouchak, A.; Karlovits, G. S.; England, J. F.; Duren, A. M.
2017-12-01
We compare gridded extreme precipitation return levels obtained using spatial Bayesian hierarchical modeling (BHM) with their respective counterparts from a traditional regional frequency analysis (RFA) using the same set of extreme precipitation data. Our study area is the 11,478 square mile Willamette River basin (WRB) located in northwestern Oregon, a major tributary of the Columbia River whose 187 miles long main stem, the Willamette River, flows northward between the Coastal and Cascade Ranges. The WRB contains approximately two thirds of Oregon's population and 20 of the 25 most populous cities in the state. The U.S. Army Corps of Engineers (USACE) Portland District operates thirteen dams and extreme precipitation estimates are required to support risk informed hydrologic analyses as part of the USACE Dam Safety Program. Our intent is to profile for the USACE an alternate methodology to an RFA that was developed in 2008 due to the lack of an official NOAA Atlas 14 update for the state of Oregon. We analyze 24-hour annual precipitation maxima data for the WRB utilizing the spatial BHM R package "spatial.gev.bma", which has been shown to be efficient in developing coherent maps of extreme precipitation by return level. Our BHM modeling analysis involved application of leave-one-out cross validation (LOO-CV), which not only supported model selection but also a comprehensive assessment of location specific model performance. The LOO-CV results will provide a basis for the BHM RFA comparison.
Uncertainty aggregation and reduction in structure-material performance prediction
NASA Astrophysics Data System (ADS)
Hu, Zhen; Mahadevan, Sankaran; Ao, Dan
2018-02-01
An uncertainty aggregation and reduction framework is presented for structure-material performance prediction. Different types of uncertainty sources, structural analysis model, and material performance prediction model are connected through a Bayesian network for systematic uncertainty aggregation analysis. To reduce the uncertainty in the computational structure-material performance prediction model, Bayesian updating using experimental observation data is investigated based on the Bayesian network. It is observed that the Bayesian updating results will have large error if the model cannot accurately represent the actual physics, and that this error will be propagated to the predicted performance distribution. To address this issue, this paper proposes a novel uncertainty reduction method by integrating Bayesian calibration with model validation adaptively. The observation domain of the quantity of interest is first discretized into multiple segments. An adaptive algorithm is then developed to perform model validation and Bayesian updating over these observation segments sequentially. Only information from observation segments where the model prediction is highly reliable is used for Bayesian updating; this is found to increase the effectiveness and efficiency of uncertainty reduction. A composite rotorcraft hub component fatigue life prediction model, which combines a finite element structural analysis model and a material damage model, is used to demonstrate the proposed method.
Yang, Ziheng; Zhu, Tianqi
2018-02-20
The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
Modeling Dynamic Contrast-Enhanced MRI Data with a Constrained Local AIF.
Duan, Chong; Kallehauge, Jesper F; Pérez-Torres, Carlos J; Bretthorst, G Larry; Beeman, Scott C; Tanderup, Kari; Ackerman, Joseph J H; Garbow, Joel R
2018-02-01
This study aims to develop a constrained local arterial input function (cL-AIF) to improve quantitative analysis of dynamic contrast-enhanced (DCE)-magnetic resonance imaging (MRI) data by accounting for the contrast-agent bolus amplitude error in the voxel-specific AIF. Bayesian probability theory-based parameter estimation and model selection were used to compare tracer kinetic modeling employing either the measured remote-AIF (R-AIF, i.e., the traditional approach) or an inferred cL-AIF against both in silico DCE-MRI data and clinical, cervical cancer DCE-MRI data. When the data model included the cL-AIF, tracer kinetic parameters were correctly estimated from in silico data under contrast-to-noise conditions typical of clinical DCE-MRI experiments. Considering the clinical cervical cancer data, Bayesian model selection was performed for all tumor voxels of the 16 patients (35,602 voxels in total). Among those voxels, a tracer kinetic model that employed the voxel-specific cL-AIF was preferred (i.e., had a higher posterior probability) in 80 % of the voxels compared to the direct use of a single R-AIF. Maps of spatial variation in voxel-specific AIF bolus amplitude and arrival time for heterogeneous tissues, such as cervical cancer, are accessible with the cL-AIF approach. The cL-AIF method, which estimates unique local-AIF amplitude and arrival time for each voxel within the tissue of interest, provides better modeling of DCE-MRI data than the use of a single, measured R-AIF. The Bayesian-based data analysis described herein affords estimates of uncertainties for each model parameter, via posterior probability density functions, and voxel-wise comparison across methods/models, via model selection in data modeling.
NASA Astrophysics Data System (ADS)
Frey, M. P.; Stamm, C.; Schneider, M. K.; Reichert, P.
2011-12-01
A distributed hydrological model was used to simulate the distribution of fast runoff formation as a proxy for critical source areas for herbicide pollution in a small agricultural catchment in Switzerland. We tested to what degree predictions based on prior knowledge without local measurements could be improved upon relying on observed discharge. This learning process consisted of five steps: For the prior prediction (step 1), knowledge of the model parameters was coarse and predictions were fairly uncertain. In the second step, discharge data were used to update the prior parameter distribution. Effects of uncertainty in input data and model structure were accounted for by an autoregressive error model. This step decreased the width of the marginal distributions of parameters describing the lower boundary (percolation rates) but hardly affected soil hydraulic parameters. Residual analysis (step 3) revealed model structure deficits. We modified the model, and in the subsequent Bayesian updating (step 4) the widths of the posterior marginal distributions were reduced for most parameters compared to those of the prior. This incremental procedure led to a strong reduction in the uncertainty of the spatial prediction. Thus, despite only using spatially integrated data (discharge), the spatially distributed effect of the improved model structure can be expected to improve the spatially distributed predictions also. The fifth step consisted of a test with independent spatial data on herbicide losses and revealed ambiguous results. The comparison depended critically on the ratio of event to preevent water that was discharged. This ratio cannot be estimated from hydrological data only. The results demonstrate that the value of local data is strongly dependent on a correct model structure. An iterative procedure of Bayesian updating, model testing, and model modification is suggested.
Bayesian analysis of volcanic eruptions
NASA Astrophysics Data System (ADS)
Ho, Chih-Hsiang
1990-10-01
The simple Poisson model generally gives a good fit to many volcanoes for volcanic eruption forecasting. Nonetheless, empirical evidence suggests that volcanic activity in successive equal time-periods tends to be more variable than a simple Poisson with constant eruptive rate. An alternative model is therefore examined in which eruptive rate(λ) for a given volcano or cluster(s) of volcanoes is described by a gamma distribution (prior) rather than treated as a constant value as in the assumptions of a simple Poisson model. Bayesian analysis is performed to link two distributions together to give the aggregate behavior of the volcanic activity. When the Poisson process is expanded to accomodate a gamma mixing distribution on λ, a consequence of this mixed (or compound) Poisson model is that the frequency distribution of eruptions in any given time-period of equal length follows the negative binomial distribution (NBD). Applications of the proposed model and comparisons between the generalized model and simple Poisson model are discussed based on the historical eruptive count data of volcanoes Mauna Loa (Hawaii) and Etna (Italy). Several relevant facts lead to the conclusion that the generalized model is preferable for practical use both in space and time.
USDA-ARS?s Scientific Manuscript database
Data assimilation and regression are two commonly used methods for predicting agricultural yield from remote sensing observations. Data assimilation is a generative approach because it requires explicit approximations of the Bayesian prior and likelihood to compute the probability density function...
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Chater, Nick; Norris, Dennis; Pouget, Alexandre
2012-01-01
Bowers and Davis (2012) criticize Bayesian modelers for telling "just so" stories about cognition and neuroscience. Their criticisms are weakened by not giving an accurate characterization of the motivation behind Bayesian modeling or the ways in which Bayesian models are used and by not evaluating this theoretical framework against specific…
A default Bayesian hypothesis test for mediation.
Nuijten, Michèle B; Wetzels, Ruud; Matzke, Dora; Dolan, Conor V; Wagenmakers, Eric-Jan
2015-03-01
In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301-322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys-Zellner-Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).
Bayesian Regression with Network Prior: Optimal Bayesian Filtering Perspective
Qian, Xiaoning; Dougherty, Edward R.
2017-01-01
The recently introduced intrinsically Bayesian robust filter (IBRF) provides fully optimal filtering relative to a prior distribution over an uncertainty class ofjoint random process models, whereas formerly the theory was limited to model-constrained Bayesian robust filters, for which optimization was limited to the filters that are optimal for models in the uncertainty class. This paper extends the IBRF theory to the situation where there are both a prior on the uncertainty class and sample data. The result is optimal Bayesian filtering (OBF), where optimality is relative to the posterior distribution derived from the prior and the data. The IBRF theories for effective characteristics and canonical expansions extend to the OBF setting. A salient focus of the present work is to demonstrate the advantages of Bayesian regression within the OBF setting over the classical Bayesian approach in the context otlinear Gaussian models. PMID:28824268
Modeling Diagnostic Assessments with Bayesian Networks
ERIC Educational Resources Information Center
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
NASA Astrophysics Data System (ADS)
Wheeler, David C.; Waller, Lance A.
2009-03-01
In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
Philosophy and the practice of Bayesian statistics
Gelman, Andrew; Shalizi, Cosma Rohilla
2015-01-01
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575
Philosophy and the practice of Bayesian statistics.
Gelman, Andrew; Shalizi, Cosma Rohilla
2013-02-01
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.
Garrido, Marta I; Rowe, Elise G; Halász, Veronika; Mattingley, Jason B
2018-05-01
Predictive coding posits that the human brain continually monitors the environment for regularities and detects inconsistencies. It is unclear, however, what effect attention has on expectation processes, as there have been relatively few studies and the results of these have yielded contradictory findings. Here, we employed Bayesian model comparison to adjudicate between 2 alternative computational models. The "Opposition" model states that attention boosts neural responses equally to predicted and unpredicted stimuli, whereas the "Interaction" model assumes that attentional boosting of neural signals depends on the level of predictability. We designed a novel, audiospatial attention task that orthogonally manipulated attention and prediction by playing oddball sequences in either the attended or unattended ear. We observed sensory prediction error responses, with electroencephalography, across all attentional manipulations. Crucially, posterior probability maps revealed that, overall, the Opposition model better explained scalp and source data, suggesting that attention boosts responses to predicted and unpredicted stimuli equally. Furthermore, Dynamic Causal Modeling showed that these Opposition effects were expressed in plastic changes within the mismatch negativity network. Our findings provide empirical evidence for a computational model of the opposing interplay of attention and expectation in the brain.
Taking error into account when fitting models using Approximate Bayesian Computation.
van der Vaart, Elske; Prangle, Dennis; Sibly, Richard M
2018-03-01
Stochastic computer simulations are often the only practical way of answering questions relating to ecological management. However, due to their complexity, such models are difficult to calibrate and evaluate. Approximate Bayesian Computation (ABC) offers an increasingly popular approach to this problem, widely applied across a variety of fields. However, ensuring the accuracy of ABC's estimates has been difficult. Here, we obtain more accurate estimates by incorporating estimation of error into the ABC protocol. We show how this can be done where the data consist of repeated measures of the same quantity and errors may be assumed to be normally distributed and independent. We then derive the correct acceptance probabilities for a probabilistic ABC algorithm, and update the coverage test with which accuracy is assessed. We apply this method, which we call error-calibrated ABC, to a toy example and a realistic 14-parameter simulation model of earthworms that is used in environmental risk assessment. A comparison with exact methods and the diagnostic coverage test show that our approach improves estimation of parameter values and their credible intervals for both models. © 2017 by the Ecological Society of America.
CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md
2014-01-01
Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
NASA Astrophysics Data System (ADS)
Rosenheim, B. E.; Firesinger, D.; Roberts, M. L.; Burton, J. R.; Khan, N.; Moyer, R. P.
2016-12-01
Radiocarbon (14C) sediment core chronologies benefit from a high density of dates, even when precision of individual dates is sacrificed. This is demonstrated by a combined approach of rapid 14C analysis of CO2 gas generated from carbonates and organic material coupled with Bayesian statistical modeling. Analysis of 14C is facilitated by the gas ion source on the Continuous Flow Accelerator Mass Spectrometry (CFAMS) system at the Woods Hole Oceanographic Institution's National Ocean Sciences Accelerator Mass Spectrometry facility. This instrument is capable of producing a 14C determination of +/- 100 14C y precision every 4-5 minutes, with limited sample handling (dissolution of carbonates and/or combustion of organic carbon in evacuated containers). Rapid analysis allows over-preparation of samples to include replicates at each depth and/or comparison of different sample types at particular depths in a sediment or peat core. Analysis priority is given to depths that have the least chronologic precision as determined by Bayesian modeling of the chronology of calibrated ages. Use of such a statistical approach to determine the order in which samples are run ensures that the chronology constantly improves so long as material is available for the analysis of chronologic weak points. Ultimately, accuracy of the chronology is determined by the material that is actually being dated, and our combined approach allows testing of different constituents of the organic carbon pool and the carbonate minerals within a core. We will present preliminary results from a deep-sea sediment core abundant in deep-sea foraminifera as well as coastal wetland peat cores to demonstrate statistical improvements in sediment- and peat-core chronologies obtained by increasing the quantity and decreasing the quality of individual dates.
Xu, Wei-Wei; Hu, Shen-Jiang; Wu, Tao
2017-07-01
Antithrombotic therapy using new oral anticoagulants (NOACs) in patients with atrial fibrillation (AF) has been generally shown to have a favorable risk-benefit profile. Since there has been dispute about the risks of gastrointestinal bleeding (GIB) and intracranial hemorrhage (ICH), we sought to conduct a systematic review and network meta-analysis using Bayesian inference to analyze the risks of GIB and ICH in AF patients taking NOACs. We analyzed data from 20 randomized controlled trials of 91 671 AF patients receiving anticoagulants, antiplatelet drugs, or placebo. Bayesian network meta-analysis of two different evidence networks was performed using a binomial likelihood model, based on a network in which different agents (and doses) were treated as separate nodes. Odds ratios (ORs) and 95% confidence intervals (CIs) were modeled using Markov chain Monte Carlo methods. Indirect comparisons with the Bayesian model confirmed that aspirin+clopidogrel significantly increased the risk of GIB in AF patients compared to the placebo (OR 0.33, 95% CI 0.01-0.92). Warfarin was identified as greatly increasing the risk of ICH compared to edoxaban 30 mg (OR 3.42, 95% CI 1.22-7.24) and dabigatran 110 mg (OR 3.56, 95% CI 1.10-8.45). We further ranked the NOACs for the lowest risk of GIB (apixaban 5 mg) and ICH (apixaban 5 mg, dabigatran 110 mg, and edoxaban 30 mg). Bayesian network meta-analysis of treatment of non-valvular AF patients with anticoagulants suggested that NOACs do not increase risks of GIB and/or ICH, compared to each other.
A fast combination method in DSmT and its application to recommender system
Liu, Yihai
2018-01-01
In many applications involving epistemic uncertainties usually modeled by belief functions, it is often necessary to approximate general (non-Bayesian) basic belief assignments (BBAs) to subjective probabilities (called Bayesian BBAs). This necessity occurs if one needs to embed the fusion result in a system based on the probabilistic framework and Bayesian inference (e.g. tracking systems), or if one needs to make a decision in the decision making problems. In this paper, we present a new fast combination method, called modified rigid coarsening (MRC), to obtain the final Bayesian BBAs based on hierarchical decomposition (coarsening) of the frame of discernment. Regarding this method, focal elements with probabilities are coarsened efficiently to reduce computational complexity in the process of combination by using disagreement vector and a simple dichotomous approach. In order to prove the practicality of our approach, this new approach is applied to combine users’ soft preferences in recommender systems (RSs). Additionally, in order to make a comprehensive performance comparison, the proportional conflict redistribution rule #6 (PCR6) is regarded as a baseline in a range of experiments. According to the results of experiments, MRC is more effective in accuracy of recommendations compared to original Rigid Coarsening (RC) method and comparable in computational time. PMID:29351297
Posada, David; Buckley, Thomas R
2004-10-01
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus(genus Carabus) ground beetles described by Sota and Vogler (2001).
2012-05-30
annealing-based or Bayesian sequential simulation approaches B. Dafflon1,2 and W. Barrash1 Received 13 May 2011; revised 12 March 2012; accepted 17 April 2012...the withheld porosity log are also withheld for this estimation process. For both cases we do this for two wells having locally variable stratigraphy ...borehole location is given at the bottom of each log comparison panel. For comparison with stratigraphy at the BHRS, contacts between Units 1 to 4
NASA Astrophysics Data System (ADS)
Shafii, M.; Tolson, B.; Matott, L. S.
2012-04-01
Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in building of higher levels of complexity into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically-based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden, and predictive capacity, which are evaluated based on multiple comparative measures. The measures for comparison are calculated both for calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.
Bayesian Tracking of Emerging Epidemics Using Ensemble Optimal Statistical Interpolation
Cobb, Loren; Krishnamurthy, Ashok; Mandel, Jan; Beezley, Jonathan D.
2014-01-01
We present a preliminary test of the Ensemble Optimal Statistical Interpolation (EnOSI) method for the statistical tracking of an emerging epidemic, with a comparison to its popular relative for Bayesian data assimilation, the Ensemble Kalman Filter (EnKF). The spatial data for this test was generated by a spatial susceptible-infectious-removed (S-I-R) epidemic model of an airborne infectious disease. Both tracking methods in this test employed Poisson rather than Gaussian noise, so as to handle epidemic data more accurately. The EnOSI and EnKF tracking methods worked well on the main body of the simulated spatial epidemic, but the EnOSI was able to detect and track a distant secondary focus of infection that the EnKF missed entirely. PMID:25113590
Bayesian inference to identify parameters in viscoelasticity
NASA Astrophysics Data System (ADS)
Rappel, Hussein; Beex, Lars A. A.; Bordas, Stéphane P. A.
2017-08-01
This contribution discusses Bayesian inference (BI) as an approach to identify parameters in viscoelasticity. The aims are: (i) to show that the prior has a substantial influence for viscoelasticity, (ii) to show that this influence decreases for an increasing number of measurements and (iii) to show how different types of experiments influence the identified parameters and their uncertainties. The standard linear solid model is the material description of interest and a relaxation test, a constant strain-rate test and a creep test are the tensile experiments focused on. The experimental data are artificially created, allowing us to make a one-to-one comparison between the input parameters and the identified parameter values. Besides dealing with the aforementioned issues, we believe that this contribution forms a comprehensible start for those interested in applying BI in viscoelasticity.
Fundamentals and Recent Developments in Approximate Bayesian Computation
Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka
2017-01-01
Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
Measuring and modeling salience with the theory of visual attention.
Krüger, Alexander; Tünnermann, Jan; Scharlau, Ingrid
2017-08-01
For almost three decades, the theory of visual attention (TVA) has been successful in mathematically describing and explaining a wide variety of phenomena in visual selection and recognition with high quantitative precision. Interestingly, the influence of feature contrast on attention has been included in TVA only recently, although it has been extensively studied outside the TVA framework. The present approach further develops this extension of TVA's scope by measuring and modeling salience. An empirical measure of salience is achieved by linking different (orientation and luminance) contrasts to a TVA parameter. In the modeling part, the function relating feature contrasts to salience is described mathematically and tested against alternatives by Bayesian model comparison. This model comparison reveals that the power function is an appropriate model of salience growth in the dimensions of orientation and luminance contrast. Furthermore, if contrasts from the two dimensions are combined, salience adds up additively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karagiannis, Georgios, E-mail: georgios.karagiannis@pnnl.gov; Lin, Guang, E-mail: guang.lin@pnnl.gov
2014-02-15
Generalized polynomial chaos (gPC) expansions allow us to represent the solution of a stochastic system using a series of polynomial chaos basis functions. The number of gPC terms increases dramatically as the dimension of the random input variables increases. When the number of the gPC terms is larger than that of the available samples, a scenario that often occurs when the corresponding deterministic solver is computationally expensive, evaluation of the gPC expansion can be inaccurate due to over-fitting. We propose a fully Bayesian approach that allows for global recovery of the stochastic solutions, in both spatial and random domains, bymore » coupling Bayesian model uncertainty and regularization regression methods. It allows the evaluation of the PC coefficients on a grid of spatial points, via (1) the Bayesian model average (BMA) or (2) the median probability model, and their construction as spatial functions on the spatial domain via spline interpolation. The former accounts for the model uncertainty and provides Bayes-optimal predictions; while the latter provides a sparse representation of the stochastic solutions by evaluating the expansion on a subset of dominating gPC bases. Moreover, the proposed methods quantify the importance of the gPC bases in the probabilistic sense through inclusion probabilities. We design a Markov chain Monte Carlo (MCMC) sampler that evaluates all the unknown quantities without the need of ad-hoc techniques. The proposed methods are suitable for, but not restricted to, problems whose stochastic solutions are sparse in the stochastic space with respect to the gPC bases while the deterministic solver involved is expensive. We demonstrate the accuracy and performance of the proposed methods and make comparisons with other approaches on solving elliptic SPDEs with 1-, 14- and 40-random dimensions.« less
NASA Astrophysics Data System (ADS)
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Comparison of dark energy models after Planck 2015
NASA Astrophysics Data System (ADS)
Xu, Yue-Yao; Zhang, Xin
2016-11-01
We make a comparison for ten typical, popular dark energy models according to their capabilities of fitting the current observational data. The observational data we use in this work include the JLA sample of type Ia supernovae observation, the Planck 2015 distance priors of cosmic microwave background observation, the baryon acoustic oscillations measurements, and the direct measurement of the Hubble constant. Since the models have different numbers of parameters, in order to make a fair comparison, we employ the Akaike and Bayesian information criteria to assess the worth of the models. The analysis results show that, according to the capability of explaining observations, the cosmological constant model is still the best one among all the dark energy models. The generalized Chaplygin gas model, the constant w model, and the α dark energy model are worse than the cosmological constant model, but still are good models compared to others. The holographic dark energy model, the new generalized Chaplygin gas model, and the Chevalliear-Polarski-Linder model can still fit the current observations well, but from an economically feasible perspective, they are not so good. The new agegraphic dark energy model, the Dvali-Gabadadze-Porrati model, and the Ricci dark energy model are excluded by the current observations.
Bayesian inference based on stationary Fokker-Planck sampling.
Berrones, Arturo
2010-06-01
A novel formalism for bayesian learning in the context of complex inference models is proposed. The method is based on the use of the stationary Fokker-Planck (SFP) approach to sample from the posterior density. Stationary Fokker-Planck sampling generalizes the Gibbs sampler algorithm for arbitrary and unknown conditional densities. By the SFP procedure, approximate analytical expressions for the conditionals and marginals of the posterior can be constructed. At each stage of SFP, the approximate conditionals are used to define a Gibbs sampling process, which is convergent to the full joint posterior. By the analytical marginals efficient learning methods in the context of artificial neural networks are outlined. Offline and incremental bayesian inference and maximum likelihood estimation from the posterior are performed in classification and regression examples. A comparison of SFP with other Monte Carlo strategies in the general problem of sampling from arbitrary densities is also presented. It is shown that SFP is able to jump large low-probability regions without the need of a careful tuning of any step-size parameter. In fact, the SFP method requires only a small set of meaningful parameters that can be selected following clear, problem-independent guidelines. The computation cost of SFP, measured in terms of loss function evaluations, grows linearly with the given model's dimension.
Link, William; Sauer, John R.
2016-01-01
The analysis of ecological data has changed in two important ways over the last 15 years. The development and easy availability of Bayesian computational methods has allowed and encouraged the fitting of complex hierarchical models. At the same time, there has been increasing emphasis on acknowledging and accounting for model uncertainty. Unfortunately, the ability to fit complex models has outstripped the development of tools for model selection and model evaluation: familiar model selection tools such as Akaike's information criterion and the deviance information criterion are widely known to be inadequate for hierarchical models. In addition, little attention has been paid to the evaluation of model adequacy in context of hierarchical modeling, i.e., to the evaluation of fit for a single model. In this paper, we describe Bayesian cross-validation, which provides tools for model selection and evaluation. We describe the Bayesian predictive information criterion and a Bayesian approximation to the BPIC known as the Watanabe-Akaike information criterion. We illustrate the use of these tools for model selection, and the use of Bayesian cross-validation as a tool for model evaluation, using three large data sets from the North American Breeding Bird Survey.
Bayesian networks for maritime traffic accident prevention: benefits and challenges.
Hänninen, Maria
2014-12-01
Bayesian networks are quantitative modeling tools whose applications to the maritime traffic safety context are becoming more popular. This paper discusses the utilization of Bayesian networks in maritime safety modeling. Based on literature and the author's own experiences, the paper studies what Bayesian networks can offer to maritime accident prevention and safety modeling and discusses a few challenges in their application to this context. It is argued that the capability of representing rather complex, not necessarily causal but uncertain relationships makes Bayesian networks an attractive modeling tool for the maritime safety and accidents. Furthermore, as the maritime accident and safety data is still rather scarce and has some quality problems, the possibility to combine data with expert knowledge and the easy way of updating the model after acquiring more evidence further enhance their feasibility. However, eliciting the probabilities from the maritime experts might be challenging and the model validation can be tricky. It is concluded that with the utilization of several data sources, Bayesian updating, dynamic modeling, and hidden nodes for latent variables, Bayesian networks are rather well-suited tools for the maritime safety management and decision-making. Copyright © 2014 Elsevier Ltd. All rights reserved.
Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management
A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...
NASA Astrophysics Data System (ADS)
Krishnanathan, Kirubhakaran; Anderson, Sean R.; Billings, Stephen A.; Kadirkamanathan, Visakan
2016-11-01
In this paper, we derive a system identification framework for continuous-time nonlinear systems, for the first time using a simulation-focused computational Bayesian approach. Simulation approaches to nonlinear system identification have been shown to outperform regression methods under certain conditions, such as non-persistently exciting inputs and fast-sampling. We use the approximate Bayesian computation (ABC) algorithm to perform simulation-based inference of model parameters. The framework has the following main advantages: (1) parameter distributions are intrinsically generated, giving the user a clear description of uncertainty, (2) the simulation approach avoids the difficult problem of estimating signal derivatives as is common with other continuous-time methods, and (3) as noted above, the simulation approach improves identification under conditions of non-persistently exciting inputs and fast-sampling. Term selection is performed by judging parameter significance using parameter distributions that are intrinsically generated as part of the ABC procedure. The results from a numerical example demonstrate that the method performs well in noisy scenarios, especially in comparison to competing techniques that rely on signal derivative estimation.
Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model
Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.
2014-01-01
Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
Probabilistic selection of high-redshift quasars
NASA Astrophysics Data System (ADS)
Mortlock, Daniel J.; Patel, Mitesh; Warren, Stephen J.; Hewett, Paul C.; Venemans, Bram P.; McMahon, Richard G.; Simpson, Chris
2012-01-01
High-redshift quasars (HZQs) with redshifts of z ≳ 6 are so rare that any photometrically selected sample of sources with HZQ-like colours is likely to be dominated by Galactic stars and brown dwarfs scattered from the stellar locus. It is impractical to re-observe all such candidates, so an alternative approach was developed in which Bayesian model comparison techniques are used to calculate the probability that a candidate is a HZQ, Pq, by combining models of the quasar and star populations with the photometric measurements of the object. This method was motivated specifically by the large number of HZQ candidates identified by cross-matching the UKIRT (United Kingdom Infrared Telescope) Infrared Deep Sky Survey (UKIDSS) Large Area Survey (LAS) to the Sloan Digital Sky Survey (SDSS): in the ? covered by the LAS in the UKIDSS Eighth Data Release (DR8) there are ˜9 × 103 real astronomical point sources with the measured colours of the target quasars, of which only ˜10 are expected to be HZQs. Applying Bayesian model comparison to the sample reveals that most sources with HZQ-like colours have Pq≲ 0.1 and can be confidently rejected without the need for any further observations. In the case of the UKIDSS DR8 LAS, there were just 107 candidates with Pq≥ 0.1; these objects were prioritized for re-observation by ranking according to Pq (and their likely redshift, which was also inferred from the photometric data). Most candidates were rejected after one or two (moderate-depth) photometric measurements by recalculating Pq using the new data. That left 12 confirmed HZQs, six of which were previously identified in the SDSS and six of which were new UKIDSS discoveries. The high efficiency of this Bayesian selection method suggests that it could usefully be extended to other HZQ surveys (e.g. searches by the Panoramic Survey Telescope And Rapid Response System, Pan-STARRS, or the Visible and Infrared Survey Telescope for Astronomy, VISTA) as well as to other searches for rare objects.
Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng
2014-01-01
A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping.
Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng
2014-01-01
A large number of data is needed by the computation of the objective Bayesian network, but the data is hard to get in actual computation. The calculation method of Bayesian network was improved in this paper, and the fuzzy-precise Bayesian network was obtained. Then, the fuzzy-precise Bayesian network was used to reason Bayesian network model when the data is limited. The security of passengers during shipping is affected by various factors, and it is hard to predict and control. The index system that has the impact on the passenger safety during shipping was established on basis of the multifield coupling theory in this paper. Meanwhile, the fuzzy-precise Bayesian network was applied to monitor the security of passengers in the shipping process. The model was applied to monitor the passenger safety during shipping of a shipping company in Hainan, and the effectiveness of this model was examined. This research work provides guidance for guaranteeing security of passengers during shipping. PMID:25254227
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
NASA Astrophysics Data System (ADS)
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Varughese, Eunice A.; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer S; Fout, G. Shay; Furlong, Edward T.; Kolpin, Dana W.; Glassmeyer, Susan T.; Keely, Scott P
2017-01-01
incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.
A local approach for focussed Bayesian fusion
NASA Astrophysics Data System (ADS)
Sander, Jennifer; Heizmann, Michael; Goussev, Igor; Beyerer, Jürgen
2009-04-01
Local Bayesian fusion approaches aim to reduce high storage and computational costs of Bayesian fusion which is separated from fixed modeling assumptions. Using the small world formalism, we argue why this proceeding is conform with Bayesian theory. Then, we concentrate on the realization of local Bayesian fusion by focussing the fusion process solely on local regions that are task relevant with a high probability. The resulting local models correspond then to restricted versions of the original one. In a previous publication, we used bounds for the probability of misleading evidence to show the validity of the pre-evaluation of task specific knowledge and prior information which we perform to build local models. In this paper, we prove the validity of this proceeding using information theoretic arguments. For additional efficiency, local Bayesian fusion can be realized in a distributed manner. Here, several local Bayesian fusion tasks are evaluated and unified after the actual fusion process. For the practical realization of distributed local Bayesian fusion, software agents are predestinated. There is a natural analogy between the resulting agent based architecture and criminal investigations in real life. We show how this analogy can be used to improve the efficiency of distributed local Bayesian fusion additionally. Using a landscape model, we present an experimental study of distributed local Bayesian fusion in the field of reconnaissance, which highlights its high potential.
A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction
Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo
2017-01-01
There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241
A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo
2017-06-07
There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning, and, by approximating the probability distributions through optimization, it tends to be faster than Markov Chain Monte Carlo methods. For this reason, in this paper, we propose a new genomic variational Bayes version of the Bayesian genomic model with G×E using half-t priors on each standard deviation (SD) term to guarantee highly noninformative and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and the variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new proposed variational Bayes approximation, and compared its predictions and implementation time with a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.
Bayesian functional integral method for inferring continuous data from discrete measurements.
Heuett, William J; Miller, Bernard V; Racette, Susan B; Holloszy, John O; Chow, Carson C; Periwal, Vipul
2012-02-08
Inference of the insulin secretion rate (ISR) from C-peptide measurements as a quantification of pancreatic β-cell function is clinically important in diseases related to reduced insulin sensitivity and insulin action. ISR derived from C-peptide concentration is an example of nonparametric Bayesian model selection where a proposed ISR time-course is considered to be a "model". An inferred value of inaccessible continuous variables from discrete observable data is often problematic in biology and medicine, because it is a priori unclear how robust the inference is to the deletion of data points, and a closely related question, how much smoothness or continuity the data actually support. Predictions weighted by the posterior distribution can be cast as functional integrals as used in statistical field theory. Functional integrals are generally difficult to evaluate, especially for nonanalytic constraints such as positivity of the estimated parameters. We propose a computationally tractable method that uses the exact solution of an associated likelihood function as a prior probability distribution for a Markov-chain Monte Carlo evaluation of the posterior for the full model. As a concrete application of our method, we calculate the ISR from actual clinical C-peptide measurements in human subjects with varying degrees of insulin sensitivity. Our method demonstrates the feasibility of functional integral Bayesian model selection as a practical method for such data-driven inference, allowing the data to determine the smoothing timescale and the width of the prior probability distribution on the space of models. In particular, our model comparison method determines the discrete time-step for interpolation of the unobservable continuous variable that is supported by the data. Attempts to go to finer discrete time-steps lead to less likely models. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Bayesian Models for Astrophysical Data Using R, JAGS, Python, and Stan
NASA Astrophysics Data System (ADS)
Hilbe, Joseph M.; de Souza, Rafael S.; Ishida, Emille E. O.
2017-05-01
This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benetti, Micol; Alcaniz, Jailson S.; Landau, Susana J., E-mail: micolbenetti@on.br, E-mail: slandau@df.uba.ar, E-mail: alcaniz@on.br
The hypothesis of the self-induced collapse of the inflaton wave function was proposed as responsible for the emergence of inhomogeneity and anisotropy at all scales. This proposal was studied within an almost de Sitter space-time approximation for the background, which led to a perfect scale-invariant power spectrum, and also for a quasi-de Sitter background, which allows to distinguish departures from the standard approach due to the inclusion of the collapse hypothesis. In this work we perform a Bayesian model comparison for two different choices of the self-induced collapse in a full quasi-de Sitter expansion scenario. In particular, we analyze themore » possibility of detecting the imprint of these collapse schemes at low multipoles of the anisotropy temperature power spectrum of the Cosmic Microwave Background (CMB) using the most recent data provided by the Planck Collaboration. Our results show that one of the two collapse schemes analyzed provides the same Bayesian evidence of the minimal standard cosmological model ΛCDM, while the other scenario is weakly disfavoured with respect to the standard cosmology.« less
Assessing global vegetation activity using spatio-temporal Bayesian modelling
NASA Astrophysics Data System (ADS)
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneous in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be contributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered being relatively homogeneous in terms of large-scale climate and biomes, still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompasses the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics, at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support our hypothesis. That is, the change of vegetation in space and time may be better understood when modelling vegetation change as both a dynamic and multivariate process. Therefore, future research will focus on a multivariate dynamical spatio-temporal modelling approach. This ongoing research is performed within the context of the project "Global impacts of hydrological and climatic extremes on vegetation" (project acronym: SAT-EX) which is part of the Belgian research programme for Earth Observation Stereo III.
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837
ERIC Educational Resources Information Center
Levy, Roy
2014-01-01
Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…
Comparison of sampling techniques for Bayesian parameter estimation
NASA Astrophysics Data System (ADS)
Allison, Rupert; Dunkley, Joanna
2014-02-01
The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hasting sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
NASA Astrophysics Data System (ADS)
Rath, V.; Wolf, A.; Bücker, H. M.
2006-10-01
Inverse methods are useful tools not only for deriving estimates of unknown parameters of the subsurface, but also for appraisal of the thus obtained models. While not being neither the most general nor the most efficient methods, Bayesian inversion based on the calculation of the Jacobian of a given forward model can be used to evaluate many quantities useful in this process. The calculation of the Jacobian, however, is computationally expensive and, if done by divided differences, prone to truncation error. Here, automatic differentiation can be used to produce derivative code by source transformation of an existing forward model. We describe this process for a coupled fluid flow and heat transport finite difference code, which is used in a Bayesian inverse scheme to estimate thermal and hydraulic properties and boundary conditions form measured hydraulic potentials and temperatures. The resulting derivative code was validated by comparison to simple analytical solutions and divided differences. Synthetic examples from different flow regimes demonstrate the use of the inverse scheme, and its behaviour in different configurations.
NASA Astrophysics Data System (ADS)
Schmit, C. J.; Pritchard, J. R.
2018-03-01
Next generation radio experiments such as LOFAR, HERA, and SKA are expected to probe the Epoch of Reionization (EoR) and claim a first direct detection of the cosmic 21cm signal within the next decade. Data volumes will be enormous and can thus potentially revolutionize our understanding of the early Universe and galaxy formation. However, numerical modelling of the EoR can be prohibitively expensive for Bayesian parameter inference and how to optimally extract information from incoming data is currently unclear. Emulation techniques for fast model evaluations have recently been proposed as a way to bypass costly simulations. We consider the use of artificial neural networks as a blind emulation technique. We study the impact of training duration and training set size on the quality of the network prediction and the resulting best-fitting values of a parameter search. A direct comparison is drawn between our emulation technique and an equivalent analysis using 21CMMC. We find good predictive capabilities of our network using training sets of as low as 100 model evaluations, which is within the capabilities of fully numerical radiative transfer codes.
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM☆
López, J.D.; Litvak, V.; Espinosa, J.J.; Friston, K.; Barnes, G.R.
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy—an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. PMID:24041874
Johnston, Iain G; Burgstaller, Joerg P; Havlicek, Vitezslav; Kolbe, Thomas; Rülicke, Thomas; Brem, Gottfried; Poulton, Jo; Jones, Nick S
2015-01-01
Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian computation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, efficacy of clinical sampling strategies, and effects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. DOI: http://dx.doi.org/10.7554/eLife.07464.001 PMID:26035426
Molitor, John
2012-03-01
Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.
Norris, Dennis
2006-04-01
This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers. ((c) 2006 APA, all rights reserved).
Bayesian denoising in digital radiography: a comparison in the dental field.
Frosio, I; Olivieri, C; Lucchese, M; Borghese, N A; Boccacci, P
2013-01-01
We compared two Bayesian denoising algorithms for digital radiographs, based on Total Variation regularization and wavelet decomposition. The comparison was performed on simulated radiographs with different photon counts and frequency content and on real dental radiographs. Four different quality indices were considered to quantify the quality of the filtered radiographs. The experimental results suggested that Total Variation is more suited to preserve fine anatomical details, whereas wavelets produce images of higher quality at global scale; they also highlighted the need for more reliable image quality indices. Copyright © 2012 Elsevier Ltd. All rights reserved.
Bayesian flood forecasting methods: A review
NASA Astrophysics Data System (ADS)
Han, Shasha; Coulibaly, Paulin
2017-08-01
Over the past few decades, floods have been seen as one of the most common and largely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides suitable theoretical structure, empirically validated models and reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review on Bayesian forecasting approaches applied in flood forecasting from 1999 till now. The review starts with an overview of fundamentals of BFS and recent advances in BFS, followed with BFS application in river stage forecasting and real-time flood forecasting, then move to a critical analysis by evaluating advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses the future research direction in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation, it considers all sources of uncertainties and produces a predictive distribution of the river stage, river discharge or runoff, thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome limitations of single model or fixed model weight and effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvements. Future research in the context of Bayesian flood forecasting should be on assimilation of various sources of newly available information and improvement of predictive performance assessment methods.
Bayesian modeling of flexible cognitive control
Jiang, Jiefeng; Heller, Katherine; Egner, Tobias
2014-01-01
“Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.
Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world's deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014.
Bayesian statistics in medicine: a 25 year review.
Ashby, Deborah
2006-11-15
This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.
Lee, Eun Gyung; Kim, Seung Won; Feigley, Charles E.; Harper, Martin
2015-01-01
This study introduces two semi-quantitative methods, Structured Subjective Assessment (SSA) and Control of Substances Hazardous to Health (COSHH) Essentials, in conjunction with two-dimensional Monte Carlo simulations for determining prior probabilities. Prior distribution using expert judgment was included for comparison. Practical applications of the proposed methods were demonstrated using personal exposure measurements of isoamyl acetate in an electronics manufacturing facility and of isopropanol in a printing shop. Applicability of these methods in real workplaces was discussed based on the advantages and disadvantages of each method. Although these methods could not be completely independent of expert judgments, this study demonstrated a methodological improvement in the estimation of the prior distribution for the Bayesian decision analysis tool. The proposed methods provide a logical basis for the decision process by considering determinants of worker exposure. PMID:23252451
Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.
Deshwar, Amit G; Vembu, Shankar; Morris, Quaid
2015-01-01
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data.
Hu, Bo; Ji, Yuan; Xu, Yaomin; Ting, Angela H
2013-05-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach.
Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor. PMID:25089832
Bayesian evidence and predictivity of the inflationary paradigm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gubitosi, Giulia; Lagos, Macarena; Magueijo, João
In this paper we consider the issue of paradigm evaluation by applying Bayes' theorem along the following nested hierarchy of progressively more complex structures: i) parameter estimation (within a model), ii) model selection and comparison (within a paradigm), iii) paradigm evaluation. In such a hierarchy the Bayesian evidence works both as the posterior's normalization at a given level and as the likelihood function at the next level up. Whilst raising no objections to the standard application of the procedure at the two lowest levels, we argue that it should receive a considerable modification when evaluating paradigms, when testability and fittingmore » data are equally important. By considering toy models we illustrate how models and paradigms that are difficult to falsify are always favoured by the Bayes factor. We argue that the evidence for a paradigm should not only be high for a given dataset, but exceptional with respect to what it would have been, had the data been different. With this motivation we propose a measure which we term predictivity , as well as a prior to be incorporated into the Bayesian framework, penalising unpredictivity as much as not fitting data. We apply this measure to inflation seen as a whole, and to a scenario where a specific inflationary model is hypothetically deemed as the only one viable as a result of information alien to cosmology (e.g. Solar System gravity experiments, or particle physics input). We conclude that cosmic inflation is currently hard to falsify, but that this could change were external/additional information to cosmology to select one of its many models. We also compare this state of affairs to bimetric varying speed of light cosmology.« less
NASA Astrophysics Data System (ADS)
Gogu, C.; Haftka, R.; LeRiche, R.; Molimard, J.; Vautrin, A.; Sankar, B.
2008-11-01
The basic formulation of the least squares method, based on the L2 norm of the misfit, is still widely used today for identifying elastic material properties from experimental data. An alternative statistical approach is the Bayesian method. We seek here situations with significant difference between the material properties found by the two methods. For a simple three bar truss example we illustrate three such situations in which the Bayesian approach leads to more accurate results: different magnitude of the measurements, different uncertainty in the measurements and correlation among measurements. When all three effects add up, the Bayesian approach can have a large advantage. We then compared the two methods for identification of elastic constants from plate vibration natural frequencies.
An introduction to Bayesian statistics in health psychology.
Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske
2017-09-01
The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.
Houseman, E Andres; Virji, M Abbas
2017-08-01
Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
2017-09-01
efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to...Bayesian hierarchical modeling, Markov chain Monte Carlo methods , Metropolis algorithm, machine learning, atmospheric prediction 15. NUMBER OF PAGES...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components
Bayesian Learning and the Psychology of Rule Induction
ERIC Educational Resources Information Center
Endress, Ansgar D.
2013-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to…
Properties of the Bayesian Knowledge Tracing Model
ERIC Educational Resources Information Center
van de Sande, Brett
2013-01-01
Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…
Bayesian Analysis of Longitudinal Data Using Growth Curve Models
ERIC Educational Resources Information Center
Zhang, Zhiyong; Hamagami, Fumiaki; Wang, Lijuan Lijuan; Nesselroade, John R.; Grimm, Kevin J.
2007-01-01
Bayesian methods for analyzing longitudinal data in social and behavioral research are recommended for their ability to incorporate prior information in estimating simple and complex models. We first summarize the basics of Bayesian methods before presenting an empirical example in which we fit a latent basis growth curve model to achievement data…
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Testing students’ e-learning via Facebook through Bayesian structural equation modeling
Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
LeBlanc, Mallory; Allen, Joseph G; Herrick, Robert F; Stewart, James H
2018-03-01
The Advanced Reach Tool V1.5 (ART) is a mathematical model for occupational exposures conceptually based on, but implemented differently than, the "classic" Near Field/Far Field (NF/FF) exposure model. The NF/FF model conceptualizes two distinct exposure "zones"; the near field, within approximately 1m of the breathing zone, and the far field, consisting of the rest of the room in which the exposure occurs. ART has been reported to provide "realistic and reasonable worst case" estimates of the exposure distribution. In this study, benzene exposure during the use of a metal parts washer was modeled using ART V1.5, and compared to actual measured workers samples and to NF/FF model results from three previous studies. Next, the exposure concentrations expected to be exceeded 25%, 10% and 5% of the time for the exposure scenario were calculated using ART. Lastly, ART exposure estimates were compared with and without Bayesian adjustment. The modeled parts washing benzene exposure scenario included distinct tasks, e.g. spraying, brushing, rinsing and soaking/drying. Because ART can directly incorporate specific types of tasks that are part of the exposure scenario, the present analysis identified each task's determinants of exposure and performance time, thus extending the work of the previous three studies where the process of parts washing was modeled as one event. The ART 50th percentile exposure estimate for benzene (0.425ppm) more closely approximated the reported measured mean value of 0.50ppm than the NF/FF model estimates of 0.33ppm, 0.070ppm or 0.2ppm obtained from other modeling studies of this exposure scenario. The ART model with the Bayesian analysis provided the closest estimate to the measured value (0.50ppm). ART (with Bayesian adjustment) was then used to assess the 75th, the 90th and 95th percentile exposures, predicting that on randomly selected days during this parts washing exposure scenario, 25% of the benzene exposures would be above 0.70ppm; 10% above 0.95ppm; and 5% above 1.15ppm. These exposure estimates at the three different percentiles of the ART exposure distribution refer to the modeled exposure scenario not a specific workplace or worker. This study provides a detailed comparison of modeling tools currently available to occupational hygienists and other exposure assessors. Possible applications are considered. Copyright © 2017 Elsevier GmbH. All rights reserved.
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Evaluation of uncertainty in the adjustment of fundamental constants
NASA Astrophysics Data System (ADS)
Bodnar, Olha; Elster, Clemens; Fischer, Joachim; Possolo, Antonio; Toman, Blaza
2016-02-01
Combining multiple measurement results for the same quantity is an important task in metrology and in many other areas. Examples include the determination of fundamental constants, the calculation of reference values in interlaboratory comparisons, or the meta-analysis of clinical studies. However, neither the GUM nor its supplements give any guidance for this task. Various approaches are applied such as weighted least-squares in conjunction with the Birge ratio or random effects models. While the former approach, which is based on a location-scale model, is particularly popular in metrology, the latter represents a standard tool used in statistics for meta-analysis. We investigate the reliability and robustness of the location-scale model and the random effects model with particular focus on resulting coverage or credible intervals. The interval estimates are obtained by adopting a Bayesian point of view in conjunction with a non-informative prior that is determined by a currently favored principle for selecting non-informative priors. Both approaches are compared by applying them to simulated data as well as to data for the Planck constant and the Newtonian constant of gravitation. Our results suggest that the proposed Bayesian inference based on the random effects model is more reliable and less sensitive to model misspecifications than the approach based on the location-scale model.
The discounting model selector: Statistical software for delay discounting applications.
Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A
2017-05-01
Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executed approximate Bayesian model selection methods from user-supplied temporal discounting data and computed the effective delay 50 (ED50) from the best performing model. Software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). The results of independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to that of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature in addition to the data from the source paper. Comparisons of model selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-sourced software are discussed and a review of possible expansions of this software are provided. © 2017 Society for the Experimental Analysis of Behavior.
Espino-Hernandez, Gabriela; Gustafson, Paul; Burstyn, Igor
2011-05-14
In epidemiological studies explanatory variables are frequently subject to measurement error. The aim of this paper is to develop a Bayesian method to correct for measurement error in multiple continuous exposures in individually matched case-control studies. This is a topic that has not been widely investigated. The new method is illustrated using data from an individually matched case-control study of the association between thyroid hormone levels during pregnancy and exposure to perfluorinated acids. The objective of the motivating study was to examine the risk of maternal hypothyroxinemia due to exposure to three perfluorinated acids measured on a continuous scale. Results from the proposed method are compared with those obtained from a naive analysis. Using a Bayesian approach, the developed method considers a classical measurement error model for the exposures, as well as the conditional logistic regression likelihood as the disease model, together with a random-effect exposure model. Proper and diffuse prior distributions are assigned, and results from a quality control experiment are used to estimate the perfluorinated acids' measurement error variability. As a result, posterior distributions and 95% credible intervals of the odds ratios are computed. A sensitivity analysis of method's performance in this particular application with different measurement error variability was performed. The proposed Bayesian method to correct for measurement error is feasible and can be implemented using statistical software. For the study on perfluorinated acids, a comparison of the inferences which are corrected for measurement error to those which ignore it indicates that little adjustment is manifested for the level of measurement error actually exhibited in the exposures. Nevertheless, a sensitivity analysis shows that more substantial adjustments arise if larger measurement errors are assumed. In individually matched case-control studies, the use of conditional logistic regression likelihood as a disease model in the presence of measurement error in multiple continuous exposures can be justified by having a random-effect exposure model. The proposed method can be successfully implemented in WinBUGS to correct individually matched case-control studies for several mismeasured continuous exposures under a classical measurement error model.
Reasoning in Reference Games: Individual- vs. Population-Level Probabilistic Modeling
Franke, Michael; Degen, Judith
2016-01-01
Recent advances in probabilistic pragmatics have achieved considerable success in modeling speakers’ and listeners’ pragmatic reasoning as probabilistic inference. However, these models are usually applied to population-level data, and so implicitly suggest a homogeneous population without individual differences. Here we investigate potential individual differences in Theory-of-Mind related depth of pragmatic reasoning in so-called reference games that require drawing ad hoc Quantity implicatures of varying complexity. We show by Bayesian model comparison that a model that assumes a heterogenous population is a better predictor of our data, especially for comprehension. We discuss the implications for the treatment of individual differences in probabilistic models of language use. PMID:27149675
Probing Quark-Gluon-Plasma properties with a Bayesian model-to-data comparison
NASA Astrophysics Data System (ADS)
Cai, Tianji; Bernhard, Jonah; Ke, Weiyao; Bass, Steffen; Duke QCD Group Team
2016-09-01
Experiments at RHIC and LHC study a special state of matter called the Quark Gluon Plasma (QGP), where quarks and gluons roam freely, by colliding relativistic heavy-ions. Given the transitory nature of the QGP, its properties can only be explored by comparing computational models of its formation and evolution to experimental data. The models fall, roughly speaking, under two categories-those solely using relativistic viscous hydrodynamics (pure hydro model) and those that in addition couple to a microscopic Boltzmann transport for the later evolution of the hadronic decay products (hybrid model). Each of these models has multiple parameters that encode the physical properties we want to probe and that need to be calibrated to experimental data, a task which is computationally expensive, but necessary for the knowledge extraction and determination of the models' quality. Our group has developed an analysis technique based on Bayesian Statistics to perform the model calibration and to extract probability distributions for each model parameter. Following the previous work that applies the technique to the hybrid model, we now perform a similar analysis on a pure-hydro model and display the posterior distributions for the same set of model parameters. We also develop a set of criteria to assess the quality of the two models with respect to their ability to describe current experimental data. Funded by Duke University Goldman Sachs Research Fellowship.
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2013-09-04
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Kärkkäinen, Hanni P.; Sillanpää, Mikko J.
2013-01-01
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
A Bayesian alternative for multi-objective ecohydrological model specification
NASA Astrophysics Data System (ADS)
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori
2018-01-01
Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior distributions in such approaches.
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(zi |D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
Testing Models of Stellar Structure and Evolution I. Comparison with Detached Eclipsing Binaries
NASA Astrophysics Data System (ADS)
del Burgo, C.; Allende Prieto, C.
2018-05-01
We present the results of an analysis aimed at testing the accuracy and precision of the PARSEC v1.2S library of stellar evolution models, combined with a Bayesian approach, to infer stellar parameters. We mainly employ the online DEBCat catalogue by Southworth, a compilation of detached eclipsing binary systems with published measurements of masses and radii to ˜ 2 per cent precision. We select a sample of 318 binary components, with masses between 0.10 and 14.5 solar units, and distances between 1.3 pc and ˜ 8 kpc for Galactic objects and ˜ 44-68 kpc for the extragalactic ones. The Bayesian analysis applied takes on input effective temperature, radius, and [Fe/H], and their uncertainties, returning theoretical predictions for other stellar parameters. From the comparison with dynamical masses, we conclude inferred masses are precisely derived for stars on the main-sequence and in the core-helium-burning phase, with respective uncertainties of 4 per cent and 7 per cent, on average. Subgiants and red giants masses are predicted within 14 per cent, and early asymptotic giant branch stars within 24 per cent. These results are helpful to further improve the models, in particular for advanced evolutionary stages for which our understanding is limited. We obtain distances and ages for the binary systems and compare them, whenever possible, with precise literature estimates, finding excellent agreement. We discuss evolutionary effects and the challenges associated with the inference of stellar ages from evolutionary models. We also provide useful polynomial fittings to theoretical zero-age main-sequence relations.
Comparison of different uncertainty techniques in urban stormwater quantity and quality modelling.
Dotto, Cintia B S; Mannina, Giorgio; Kleidorfer, Manfred; Vezzaro, Luca; Henrichs, Malte; McCarthy, David T; Freni, Gabriele; Rauch, Wolfgang; Deletic, Ana
2012-05-15
Urban drainage models are important tools used by both practitioners and scientists in the field of stormwater management. These models are often conceptual and usually require calibration using local datasets. The quantification of the uncertainty associated with the models is a must, although it is rarely practiced. The International Working Group on Data and Models, which works under the IWA/IAHR Joint Committee on Urban Drainage, has been working on the development of a framework for defining and assessing uncertainties in the field of urban drainage modelling. A part of that work is the assessment and comparison of different techniques generally used in the uncertainty assessment of the parameters of water models. This paper compares a number of these techniques: the Generalized Likelihood Uncertainty Estimation (GLUE), the Shuffled Complex Evolution Metropolis algorithm (SCEM-UA), an approach based on a multi-objective auto-calibration (a multialgorithm, genetically adaptive multi-objective method, AMALGAM) and a Bayesian approach based on a simplified Markov Chain Monte Carlo method (implemented in the software MICA). To allow a meaningful comparison among the different uncertainty techniques, common criteria have been set for the likelihood formulation, defining the number of simulations, and the measure of uncertainty bounds. Moreover, all the uncertainty techniques were implemented for the same case study, in which the same stormwater quantity and quality model was used alongside the same dataset. The comparison results for a well-posed rainfall/runoff model showed that the four methods provide similar probability distributions of model parameters, and model prediction intervals. For ill-posed water quality model the differences between the results were much wider; and the paper provides the specific advantages and disadvantages of each method. In relation to computational efficiency (i.e. number of iterations required to generate the probability distribution of parameters), it was found that SCEM-UA and AMALGAM produce results quicker than GLUE in terms of required number of simulations. However, GLUE requires the lowest modelling skills and is easy to implement. All non-Bayesian methods have problems with the way they accept behavioural parameter sets, e.g. GLUE, SCEM-UA and AMALGAM have subjective acceptance thresholds, while MICA has usually problem with its hypothesis on normality of residuals. It is concluded that modellers should select the method which is most suitable for the system they are modelling (e.g. complexity of the model's structure including the number of parameters), their skill/knowledge level, the available information, and the purpose of their study. Copyright © 2012 Elsevier Ltd. All rights reserved.
Accurate Biomass Estimation via Bayesian Adaptive Sampling
NASA Technical Reports Server (NTRS)
Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay
2005-01-01
The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) Characterization of MISR Rahman model parameters conditioned upon MODIS landcover. c) Rigorous non-parametric Bayesian approach to analytic mixture model determination. d) Unique U.S. asset for science product validation and verification.
A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study.
Kaplan, David; Chen, Jianshen
2012-07-01
A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for three methods of implementation: propensity score stratification, weighting, and optimal full matching. Three simulation studies and one case study are presented to elaborate the proposed two-step Bayesian propensity score approach. Results of the simulation studies reveal that greater precision in the propensity score equation yields better recovery of the frequentist-based treatment effect. A slight advantage is shown for the Bayesian approach in small samples. Results also reveal that greater precision around the wrong treatment effect can lead to seriously distorted results. However, greater precision around the correct treatment effect parameter yields quite good results, with slight improvement seen with greater precision in the propensity score equation. A comparison of coverage rates for the conventional frequentist approach and proposed Bayesian approach is also provided. The case study reveals that credible intervals are wider than frequentist confidence intervals when priors are non-informative.
Application of the quantum spin glass theory to image restoration.
Inoue, J I
2001-04-01
Quantum fluctuation is introduced into the Markov random-field model for image restoration in the context of a Bayesian approach. We investigate the dependence of the quantum fluctuation on the quality of a black and white image restoration by making use of statistical mechanics. We find that the maximum posterior marginal (MPM) estimate based on the quantum fluctuation gives a fine restoration in comparison with the maximum a posteriori estimate or the thermal fluctuation based MPM estimate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karagiannis, Georgios; Lin, Guang
2014-02-15
Generalized polynomial chaos (gPC) expansions allow the representation of the solution of a stochastic system as a series of polynomial terms. The number of gPC terms increases dramatically with the dimension of the random input variables. When the number of the gPC terms is larger than that of the available samples, a scenario that often occurs if the evaluations of the system are expensive, the evaluation of the gPC expansion can be inaccurate due to over-fitting. We propose a fully Bayesian approach that allows for global recovery of the stochastic solution, both in spacial and random domains, by coupling Bayesianmore » model uncertainty and regularization regression methods. It allows the evaluation of the PC coefficients on a grid of spacial points via (1) Bayesian model average or (2) medial probability model, and their construction as functions on the spacial domain via spline interpolation. The former accounts the model uncertainty and provides Bayes-optimal predictions; while the latter, additionally, provides a sparse representation of the solution by evaluating the expansion on a subset of dominating gPC bases when represented as a gPC expansion. Moreover, the method quantifies the importance of the gPC bases through inclusion probabilities. We design an MCMC sampler that evaluates all the unknown quantities without the need of ad-hoc techniques. The proposed method is suitable for, but not restricted to, problems whose stochastic solution is sparse at the stochastic level with respect to the gPC bases while the deterministic solver involved is expensive. We demonstrate the good performance of the proposed method and make comparisons with others on 1D, 14D and 40D in random space elliptic stochastic partial differential equations.« less
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
SECURITY CLASSIFICATION OF: The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model , is able to model the rate of occurrence of...which adds specificity to the model and can make nonlinear data more manageable. Early results show that the 1. REPORT DATE (DD-MM-YYYY) 4. TITLE
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors
Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world’s deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014. PMID:28257437
Zador, Zsolt; Sperrin, Matthew; King, Andrew T
2016-01-01
Traumatic brain injury remains a global health problem. Understanding the relative importance of outcome predictors helps optimize our treatment strategies by informing assessment protocols, clinical decisions and trial designs. In this study we establish importance ranking for outcome predictors based on receiver operating indices to identify key predictors of outcome and create simple predictive models. We then explore the associations between key outcome predictors using Bayesian networks to gain further insight into predictor importance. We analyzed the corticosteroid randomization after significant head injury (CRASH) trial database of 10008 patients and included patients for whom demographics, injury characteristics, computer tomography (CT) findings and Glasgow Outcome Scale (GCS) were recorded (total of 13 predictors, which would be available to clinicians within a few hours following the injury in 6945 patients). Predictions of clinical outcome (death or severe disability at 6 months) were performed using logistic regression models with 5-fold cross validation. Predictive performance was measured using standardized partial area (pAUC) under the receiver operating curve (ROC) and we used Delong test for comparisons. Variable importance ranking was based on pAUC targeted at specificity (pAUCSP) and sensitivity (pAUCSE) intervals of 90-100%. Probabilistic associations were depicted using Bayesian networks. Complete AUC analysis showed very good predictive power (AUC = 0.8237, 95% CI: 0.8138-0.8336) for the complete model. Specificity focused importance ranking highlighted age, pupillary, motor responses, obliteration of basal cisterns/3rd ventricle and midline shift. Interestingly when targeting model sensitivity, the highest-ranking variables were age, severe extracranial injury, verbal response, hematoma on CT and motor response. Simplified models, which included only these key predictors, had similar performance (pAUCSP = 0.6523, 95% CI: 0.6402-0.6641 and pAUCSE = 0.6332, 95% CI: 0.62-0.6477) compared to the complete models (pAUCSP = 0.6664, 95% CI: 0.6543-0.679, pAUCSE = 0.6436, 95% CI: 0.6289-0.6585, de Long p value 0.1165 and 0.3448 respectively). Bayesian networks showed the predictors that did not feature in the simplified models were associated with those that did. We demonstrate that importance based variable selection allows simplified predictive models to be created while maintaining prediction accuracy. Variable selection targeting specificity confirmed key components of clinical assessment in TBI whereas sensitivity based ranking suggested extracranial injury as one of the important predictors. These results help refine our approach to head injury assessment, decision-making and outcome prediction targeted at model sensitivity and specificity. Bayesian networks proved to be a comprehensive tool for depicting probabilistic associations for key predictors giving insight into why the simplified model has maintained accuracy.
NASA Astrophysics Data System (ADS)
Dawson, A.; Trachsel, M.; Goring, S. J.; Paciorek, C. J.; McLachlan, J. S.; Jackson, S. T.; Williams, J. W.
2017-12-01
Pollen records have been extensively used to reconstruct past changes in vegetation and study the underlying processes. However, developing the statistical techniques needed to accurately represent both data and process uncertainties is a formidable challenge. Recent advances in paleoecoinformatics (e.g. the Neotoma Paleoecology Database and the European Pollen Database), Bayesian age-depth models, and process-based pollen-vegetation models, and Bayesian hierarchical modeling have pushed paleovegetation reconstructions forward to a point where multiple sources of uncertainty can be incorporated into reconstructions, which in turn enables new hypotheses to be asked and more rigorous integration of paleovegetation data with earth system models and terrestrial ecosystem models. Several kinds of pollen-vegetation models have been developed, notably LOVE/REVEALS, STEPPS, and classical transfer functions such as the modern analog technique. LOVE/REVEALS has been adopted as the standard method for the LandCover6k effort to develop quantitative reconstructions of land cover for the Holocene, while STEPPS has been developed recently as part of the PalEON project and applied to reconstruct with uncertainty shifts in forest composition in New England and the upper Midwest during the late Holocene. Each PVM has different assumptions and structure and uses different input data, but few comparisons among approaches yet exist. Here, we present new reconstructions of land cover change in northern North America during the Holocene based on LOVE/REVEALS and data drawn from the Neotoma database and compare STEPPS-based reconstructions to those from LOVE/REVEALS. These parallel developments with LOVE/REVEALS provide an opportunity to compare and contrast models, and to begin to generate continental scale reconstructions, with explicit uncertainties, that can provide a base for interdisciplinary research within the biogeosciences. We show how STEPPS provides an important benchmark for past land-cover reconstruction, and how the LandCover 6k effort in North America advances our understanding of the past by allowing cross-continent comparisons using standardized methods and quantifying the impact of humans in the early Anthropocene.
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.; Ye, M.; Liang, F.
2016-12-01
Due to simplification and/or misrepresentation of the real aquifer system, numerical groundwater flow and solute transport models are usually subject to model structural error. During model calibration, the hydrogeological parameters may be overly adjusted to compensate for unknown structural error. This may result in biased predictions when models are used to forecast aquifer response to new forcing. In this study, we extend a fully Bayesian method [Xu and Valocchi, 2015] to calibrate a real-world, regional groundwater flow model. The method uses a data-driven error model to describe model structural error and jointly infers model parameters and structural error. In this study, Bayesian inference is facilitated using high performance computing and fast surrogate models. The surrogate models are constructed using machine learning techniques to emulate the response simulated by the computationally expensive groundwater model. We demonstrate in the real-world case study that explicitly accounting for model structural error yields parameter posterior distributions that are substantially different from those derived by the classical Bayesian calibration that does not account for model structural error. In addition, the Bayesian with error model method gives significantly more accurate prediction along with reasonable credible intervals.
The Bayesian Revolution Approaches Psychological Development
ERIC Educational Resources Information Center
Shultz, Thomas R.
2007-01-01
This commentary reviews five articles that apply Bayesian ideas to psychological development, some with psychology experiments, some with computational modeling, and some with both experiments and modeling. The reviewed work extends the current Bayesian revolution into tasks often studied in children, such as causal learning and word learning, and…
Reducing uncertainty in Climate Response Time Scale by Bayesian Analysis of the 8.2 ka event
NASA Astrophysics Data System (ADS)
Lorenz, A.; Held, H.; Bauer, E.; Schneider von Deimling, T.
2009-04-01
We analyze the possibility of uncertainty reduction in Climate Response Time Scale by utilizing Greenland ice-core data that contain the 8.2 ka event within a Bayesian model-data intercomparison with the Earth system model of intermediate complexity, CLIMBER-2.3. Within a stochastic version of the model it has been possible to mimic the 8.2 ka event within a plausible experimental setting and with relatively good accuracy considering the timing of the event in comparison to other modeling exercises [1]. The simulation of the centennial cold event is effectively determined by the oceanic cooling rate which depends largely on the ocean diffusivity described by diffusion coefficients of relatively wide uncertainty ranges. The idea now is to discriminate between the different values of diffusivities according to their likelihood to rightly represent the duration of the 8.2 ka event and thus to exploit the paleo data to constrain uncertainty in model parameters in analogue to [2]. Implementing this inverse Bayesian Analysis with this model the technical difficulty arises to establish the related likelihood numerically in addition to the uncertain model parameters: While mainstream uncertainty analyses can assume a quasi-Gaussian shape of likelihood, with weather fluctuating around a long term mean, the 8.2 ka event as a highly nonlinear effect precludes such an a priori assumption. As a result of this study [3] the Bayesian Analysis showed a reduction of uncertainty in vertical ocean diffusivity parameters of factor 2 compared to prior knowledge. This learning effect on the model parameters is propagated to other model outputs of interest; e.g. the inverse ocean heat capacity, which is important for the dominant time scale of climate response to anthropogenic forcing which, in combination with climate sensitivity, strongly influences the climate systems reaction for the near- and medium-term future. 1 References [1] E. Bauer, A. Ganopolski, M. Montoya: Simulation of the cold climate event 8200 years ago by meltwater outburst from lake Agassiz. Paleoceanography 19:PA3014, (2004) [2] T. Schneider von Deimling, H. Held, A. Ganopolski, S. Rahmstorf, Climate sensitivity estimated from ensemble simulations of glacial climates, Climate Dynamics 27, 149-163, DOI 10.1007/s00382-006-0126-8 (2006). [3] A. Lorenz, Diploma Thesis, U Potsdam (2007).
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical fundament and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martínez-García, Eric E.; González-Lópezlira, Rosa A.; Bruzual A, Gustavo
2017-01-20
Stellar masses of galaxies are frequently obtained by fitting stellar population synthesis models to galaxy photometry or spectra. The state of the art method resolves spatial structures within a galaxy to assess the total stellar mass content. In comparison to unresolved studies, resolved methods yield, on average, higher fractions of stellar mass for galaxies. In this work we improve the current method in order to mitigate a bias related to the resolved spatial distribution derived for the mass. The bias consists in an apparent filamentary mass distribution and a spatial coincidence between mass structures and dust lanes near spiral arms.more » The improved method is based on iterative Bayesian marginalization, through a new algorithm we have named Bayesian Successive Priors (BSP). We have applied BSP to M51 and to a pilot sample of 90 spiral galaxies from the Ohio State University Bright Spiral Galaxy Survey. By quantitatively comparing both methods, we find that the average fraction of stellar mass missed by unresolved studies is only half what previously thought. In contrast with the previous method, the output BSP mass maps bear a better resemblance to near-infrared images.« less
Incorporating approximation error in surrogate based Bayesian inversion
NASA Astrophysics Data System (ADS)
Zhang, J.; Zeng, L.; Li, W.; Wu, L.
2015-12-01
There are increasing interests in applying surrogates for inverse Bayesian modeling to reduce repetitive evaluations of original model. In this way, the computational cost is expected to be saved. However, the approximation error of surrogate model is usually overlooked. This is partly because that it is difficult to evaluate the approximation error for many surrogates. Previous studies have shown that, the direct combination of surrogates and Bayesian methods (e.g., Markov Chain Monte Carlo, MCMC) may lead to biased estimations when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner. However, the computational cost is still high since a relatively large number of original model simulations are required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. Gaussian process (GP) is chosen to construct the surrogate for its convenience in approximation error evaluation. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is well incorporated into Bayesian framework, promising results can be obtained even when the surrogate is directly used, and no further original model simulations are required.
A Bayesian Model of the Memory Colour Effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.
A Bayesian Model of the Memory Colour Effect
Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects. PMID:29760874
Using Bayesian Networks to Improve Knowledge Assessment
ERIC Educational Resources Information Center
Millan, Eva; Descalco, Luis; Castillo, Gladys; Oliveira, Paula; Diogo, Sandra
2013-01-01
In this paper, we describe the integration and evaluation of an existing generic Bayesian student model (GBSM) into an existing computerized testing system within the Mathematics Education Project (PmatE--Projecto Matematica Ensino) of the University of Aveiro. This generic Bayesian student model had been previously evaluated with simulated…
Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations
ERIC Educational Resources Information Center
Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon
2018-01-01
To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data
Hu, Bo; Xu, Yaomin
2013-01-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach. PMID:23710259
Ahn, Woo-Young; Haines, Nathaniel; Zhang, Lei
2017-01-01
Reinforcement learning and decision-making (RLDM) provide a quantitative framework and computational theories with which we can disentangle psychiatric conditions into the basic dimensions of neurocognitive functioning. RLDM offer a novel approach to assessing and potentially diagnosing psychiatric patients, and there is growing enthusiasm for both RLDM and computational psychiatry among clinical researchers. Such a framework can also provide insights into the brain substrates of particular RLDM processes, as exemplified by model-based analysis of data from functional magnetic resonance imaging (fMRI) or electroencephalography (EEG). However, researchers often find the approach too technical and have difficulty adopting it for their research. Thus, a critical need remains to develop a user-friendly tool for the wide dissemination of computational psychiatric methods. We introduce an R package called hBayesDM (hierarchical Bayesian modeling of Decision-Making tasks), which offers computational modeling of an array of RLDM tasks and social exchange games. The hBayesDM package offers state-of-the-art hierarchical Bayesian modeling, in which both individual and group parameters (i.e., posterior distributions) are estimated simultaneously in a mutually constraining fashion. At the same time, the package is extremely user-friendly: users can perform computational modeling, output visualization, and Bayesian model comparisons, each with a single line of coding. Users can also extract the trial-by-trial latent variables (e.g., prediction errors) required for model-based fMRI/EEG. With the hBayesDM package, we anticipate that anyone with minimal knowledge of programming can take advantage of cutting-edge computational-modeling approaches to investigate the underlying processes of and interactions between multiple decision-making (e.g., goal-directed, habitual, and Pavlovian) systems. In this way, we expect that the hBayesDM package will contribute to the dissemination of advanced modeling approaches and enable a wide range of researchers to easily perform computational psychiatric research within different populations. PMID:29601060
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.
Greenhouse Gas Source Attribution: Measurements Modeling and Uncertainty Quantification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhen; Safta, Cosmin; Sargsyan, Khachik
2014-09-01
In this project we have developed atmospheric measurement capabilities and a suite of atmospheric modeling and analysis tools that are well suited for verifying emissions of green- house gases (GHGs) on an urban-through-regional scale. We have for the first time applied the Community Multiscale Air Quality (CMAQ) model to simulate atmospheric CO 2 . This will allow for the examination of regional-scale transport and distribution of CO 2 along with air pollutants traditionally studied using CMAQ at relatively high spatial and temporal resolution with the goal of leveraging emissions verification efforts for both air quality and climate. We have developedmore » a bias-enhanced Bayesian inference approach that can remedy the well-known problem of transport model errors in atmospheric CO 2 inversions. We have tested the approach using data and model outputs from the TransCom3 global CO 2 inversion comparison project. We have also performed two prototyping studies on inversion approaches in the generalized convection-diffusion context. One of these studies employed Polynomial Chaos Expansion to accelerate the evaluation of a regional transport model and enable efficient Markov Chain Monte Carlo sampling of the posterior for Bayesian inference. The other approach uses de- terministic inversion of a convection-diffusion-reaction system in the presence of uncertainty. These approaches should, in principle, be applicable to realistic atmospheric problems with moderate adaptation. We outline a regional greenhouse gas source inference system that integrates (1) two ap- proaches of atmospheric dispersion simulation and (2) a class of Bayesian inference and un- certainty quantification algorithms. We use two different and complementary approaches to simulate atmospheric dispersion. Specifically, we use a Eulerian chemical transport model CMAQ and a Lagrangian Particle Dispersion Model - FLEXPART-WRF. These two models share the same WRF assimilated meteorology fields, making it possible to perform a hybrid simulation, in which the Eulerian model (CMAQ) can be used to compute the initial condi- tion needed by the Lagrangian model, while the source-receptor relationships for a large state vector can be efficiently computed using the Lagrangian model in its backward mode. In ad- dition, CMAQ has a complete treatment of atmospheric chemistry of a suite of traditional air pollutants, many of which could help attribute GHGs from different sources. The inference of emissions sources using atmospheric observations is cast as a Bayesian model calibration problem, which is solved using a variety of Bayesian techniques, such as the bias-enhanced Bayesian inference algorithm, which accounts for the intrinsic model deficiency, Polynomial Chaos Expansion to accelerate model evaluation and Markov Chain Monte Carlo sampling, and Karhunen-Lo %60 eve (KL) Expansion to reduce the dimensionality of the state space. We have established an atmospheric measurement site in Livermore, CA and are collect- ing continuous measurements of CO 2 , CH 4 and other species that are typically co-emitted with these GHGs. Measurements of co-emitted species can assist in attributing the GHGs to different emissions sectors. Automatic calibrations using traceable standards are performed routinely for the gas-phase measurements. We are also collecting standard meteorological data at the Livermore site as well as planetary boundary height measurements using a ceilometer. The location of the measurement site is well suited to sample air transported between the San Francisco Bay area and the California Central Valley.« less
Two Approaches to Calibration in Metrology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campanelli, Mark
2014-04-01
Inferring mathematical relationships with quantified uncertainty from measurement data is common to computational science and metrology. Sufficient knowledge of measurement process noise enables Bayesian inference. Otherwise, an alternative approach is required, here termed compartmentalized inference, because collection of uncertain data and model inference occur independently. Bayesian parameterized model inference is compared to a Bayesian-compatible compartmentalized approach for ISO-GUM compliant calibration problems in renewable energy metrology. In either approach, model evidence can help reduce model discrepancy.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
NASA Astrophysics Data System (ADS)
Validi, AbdoulAhad
2014-03-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.
Bayesian Normalization Model for Label-Free Quantitative Analysis by LC-MS
Nezami Ranjbar, Mohammad R.; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.
2016-01-01
We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are not caused by biological differences but experimental bias. There are different sources of bias including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC) with parameters considered as a linear mixed effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement. PMID:26357332
ERIC Educational Resources Information Center
Hsieh, Chueh-An; Maier, Kimberly S.
2009-01-01
The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…
Verveer, P. J; Gemkow, M. J; Jovin, T. M
1999-01-01
We have compared different image restoration approaches for fluorescence microscopy. The most widely used algorithms were classified with a Bayesian theory according to the assumed noise model and the type of regularization imposed. We considered both Gaussian and Poisson models for the noise in combination with Tikhonov regularization, entropy regularization, Good's roughness and without regularization (maximum likelihood estimation). Simulations of fluorescence confocal imaging were used to examine the different noise models and regularization approaches using the mean squared error criterion. The assumption of a Gaussian noise model yielded only slightly higher errors than the Poisson model. Good's roughness was the best choice for the regularization. Furthermore, we compared simulated confocal and wide-field data. In general, restored confocal data are superior to restored wide-field data, but given sufficient higher signal level for the wide-field data the restoration result may rival confocal data in quality. Finally, a visual comparison of experimental confocal and wide-field data is presented.
Brown, Alexandra R; Gajewski, Byron J; Aaronson, Lauren S; Mudaranthakam, Dinesh Pal; Hunt, Suzanne L; Berry, Scott M; Quintana, Melanie; Pasnoor, Mamatha; Dimachkie, Mazen M; Jawdat, Omar; Herbelin, Laura; Barohn, Richard J
2016-08-31
In the last few decades, the number of trials using Bayesian methods has grown rapidly. Publications prior to 1990 included only three clinical trials that used Bayesian methods, but that number quickly jumped to 19 in the 1990s and to 99 from 2000 to 2012. While this literature provides many examples of Bayesian Adaptive Designs (BAD), none of the papers that are available walks the reader through the detailed process of conducting a BAD. This paper fills that gap by describing the BAD process used for one comparative effectiveness trial (Patient Assisted Intervention for Neuropathy: Comparison of Treatment in Real Life Situations) that can be generalized for use by others. A BAD was chosen with efficiency in mind. Response-adaptive randomization allows the potential for substantially smaller sample sizes, and can provide faster conclusions about which treatment or treatments are most effective. An Internet-based electronic data capture tool, which features a randomization module, facilitated data capture across study sites and an in-house computation software program was developed to implement the response-adaptive randomization. A process for adapting randomization with minimal interruption to study sites was developed. A new randomization table can be generated quickly and can be seamlessly integrated in the data capture tool with minimal interruption to study sites. This manuscript is the first to detail the technical process used to evaluate a multisite comparative effectiveness trial using adaptive randomization. An important opportunity for the application of Bayesian trials is in comparative effectiveness trials. The specific case study presented in this paper can be used as a model for conducting future clinical trials using a combination of statistical software and a web-based application. ClinicalTrials.gov Identifier: NCT02260388 , registered on 6 October 2014.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypothesis as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
NASA Astrophysics Data System (ADS)
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8. 9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better inferred erosion rate resolution, and including more uncertain parameters in the MCMC inversion.
Dynamic Bayesian network modeling for longitudinal brain morphometry
Chen, Rong; Resnick, Susan M; Davatzikos, Christos; Herskovits, Edward H
2011-01-01
Identifying interactions among brain regions from structural magnetic-resonance images presents one of the major challenges in computational neuroanatomy. We propose a Bayesian data-mining approach to the detection of longitudinal morphological changes in the human brain. Our method uses a dynamic Bayesian network to represent evolving inter-regional dependencies. The major advantage of dynamic Bayesian network modeling is that it can represent complicated interactions among temporal processes. We validated our approach by analyzing a simulated atrophy study, and found that this approach requires only a small number of samples to detect the ground-truth temporal model. We further applied dynamic Bayesian network modeling to a longitudinal study of normal aging and mild cognitive impairment — the Baltimore Longitudinal Study of Aging. We found that interactions among regional volume-change rates for the mild cognitive impairment group are different from those for the normal-aging group. PMID:21963916
Hollenbeak, Christopher S
2005-10-15
While risk-adjusted outcomes are often used to compare the performance of hospitals and physicians, the most appropriate functional form for the risk adjustment process is not always obvious for continuous outcomes such as costs. Semi-log models are used most often to correct skewness in cost data, but there has been limited research to determine whether the log transformation is sufficient or whether another transformation is more appropriate. This study explores the most appropriate functional form for risk-adjusting the cost of coronary artery bypass graft (CABG) surgery. Data included patients undergoing CABG surgery at four hospitals in the midwest and were fit to a Box-Cox model with random coefficients (BCRC) using Markov chain Monte Carlo methods. Marginal likelihoods and Bayes factors were computed to perform model comparison of alternative model specifications. Rankings of hospital performance were created from the simulation output and the rankings produced by Bayesian estimates were compared to rankings produced by standard models fit using classical methods. Results suggest that, for these data, the most appropriate functional form is not logarithmic, but corresponds to a Box-Cox transformation of -1. Furthermore, Bayes factors overwhelmingly rejected the natural log transformation. However, the hospital ranking induced by the BCRC model was not different from the ranking produced by maximum likelihood estimates of either the linear or semi-log model. Copyright (c) 2005 John Wiley & Sons, Ltd.
J-Plus: Morphological Classification Of Compact And Extended Sources By Pdf Analysis
NASA Astrophysics Data System (ADS)
López-Sanjuan, C.; Vázquez-Ramió, H.; Varela, J.; Spinoso, D.; Cristóbal-Hornillos, D.; Viironen, K.; Muniesa, D.; J-PLUS Collaboration
2017-10-01
We present a morphological classification of J-PLUS EDR sources into compact (i.e. stars) and extended (i.e. galaxies). Such classification is based on the Bayesian modelling of the concentration distribution, including observational errors and magnitude + sky position priors. We provide the star / galaxy probability of each source computed from the gri images. The comparison with the SDSS number counts support our classification up to r 21. The 31.7 deg² analised comprises 150k stars and 101k galaxies.
Classifying emotion in Twitter using Bayesian network
NASA Astrophysics Data System (ADS)
Surya Asriadie, Muhammad; Syahrul Mubarok, Mohamad; Adiwijaya
2018-03-01
Language is used to express not only facts, but also emotions. Emotions are noticeable from behavior up to the social media statuses written by a person. Analysis of emotions in a text is done in a variety of media such as Twitter. This paper studies classification of emotions on twitter using Bayesian network because of its ability to model uncertainty and relationships between features. The result is two models based on Bayesian network which are Full Bayesian Network (FBN) and Bayesian Network with Mood Indicator (BNM). FBN is a massive Bayesian network where each word is treated as a node. The study shows the method used to train FBN is not very effective to create the best model and performs worse compared to Naive Bayes. F1-score for FBN is 53.71%, while for Naive Bayes is 54.07%. BNM is proposed as an alternative method which is based on the improvement of Multinomial Naive Bayes and has much lower computational complexity compared to FBN. Even though it’s not better compared to FBN, the resulting model successfully improves the performance of Multinomial Naive Bayes. F1-Score for Multinomial Naive Bayes model is 51.49%, while for BNM is 52.14%.
A Bayesian analysis of the 2016 Pedernales (Ecuador) earthquake rupture process
NASA Astrophysics Data System (ADS)
Gombert, B.; Duputel, Z.; Jolivet, R.; Rivera, L. A.; Simons, M.; Jiang, J.; Liang, C.; Fielding, E. J.
2017-12-01
The 2016 Mw = 7.8 Pedernales earthquake is the largest event to strike Ecuador since 1979. Long period W-phase and Global CMT solutions suggest that slip is not perpendicular to the trench axis, in agreement with the convergence obliquity of the Ecuadorian subduction. In this study, we propose a new co-seismic kinematic slip model obtained from the joint inversion of multiple observations in an unregularized and fully Bayesian framework. We use a comprehensive static dataset composed of several InSAR scenes, GPS static offsets, and tsunami waveforms from two nearby DART stations. The kinematic component of the rupture process is constrained by an extensive network of High-Rate GPS and accelerometers. Our solution includes the ensemble of all plausible models that are consistent with our prior information and fit the available observations within data and prediction uncertainties. We analyse the source process in light of the historical seismicity, in particular the Mw = 7.8 1942 earthquake for which the rupture extent overlaps with the 2016 event. In addition, we conduct a probabilistic comparison of co-seismic slip with a stochastic interseismic coupling model obtained from GPS data, putting a light on the processes at play within the Ecuadorian subduction margin.
Algorithmic procedures for Bayesian MEG/EEG source reconstruction in SPM.
López, J D; Litvak, V; Espinosa, J J; Friston, K; Barnes, G R
2014-01-01
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (Minimum Norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy-an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardise the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational Free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm. © 2013. Published by Elsevier Inc. All rights reserved.
Mohammed, Seid; Asfaw, Zeytu G
2018-01-01
The term malnutrition generally refers to both under-nutrition and over-nutrition, but this study uses the term to refer solely to a deficiency of nutrition. In Ethiopia, child malnutrition is one of the most serious public health problem and the highest in the world. The purpose of the present study was to identify the high risk factors of malnutrition and test different statistical models for childhood malnutrition and, thereafter weighing the preferable model through model comparison criteria. Bayesian Gaussian regression model was used to analyze the effect of selected socioeconomic, demographic, health and environmental covariates on malnutrition under five years old child's. Inference was made using Bayesian approach based on Markov Chain Monte Carlo (MCMC) simulation techniques in BayesX. The study found that the variables such as sex of a child, preceding birth interval, age of the child, father's education level, source of water, mother's body mass index, head of household sex, mother's age at birth, wealth index, birth order, diarrhea, child's size at birth and duration of breast feeding showed significant effects on children's malnutrition in Ethiopia. The age of child, mother's age at birth and mother's body mass index could also be important factors with a non linear effect for the child's malnutrition in Ethiopia. Thus, the present study emphasizes a special care on variables such as sex of child, preceding birth interval, father's education level, source of water, sex of head of household, wealth index, birth order, diarrhea, child's size at birth, duration of breast feeding, age of child, mother's age at birth and mother's body mass index to combat childhood malnutrition in developing countries.
A Bayesian Supertree Model for Genome-Wide Species Tree Reconstruction
De Oliveira Martins, Leonardo; Mallo, Diego; Posada, David
2016-01-01
Current phylogenomic data sets highlight the need for species tree methods able to deal with several sources of gene tree/species tree incongruence. At the same time, we need to make most use of all available data. Most species tree methods deal with single processes of phylogenetic discordance, namely, gene duplication and loss, incomplete lineage sorting (ILS) or horizontal gene transfer. In this manuscript, we address the problem of species tree inference from multilocus, genome-wide data sets regardless of the presence of gene duplication and loss and ILS therefore without the need to identify orthologs or to use a single individual per species. We do this by extending the idea of Maximum Likelihood (ML) supertrees to a hierarchical Bayesian model where several sources of gene tree/species tree disagreement can be accounted for in a modular manner. We implemented this model in a computer program called guenomu whose inputs are posterior distributions of unrooted gene tree topologies for multiple gene families, and whose output is the posterior distribution of rooted species tree topologies. We conducted extensive simulations to evaluate the performance of our approach in comparison with other species tree approaches able to deal with more than one leaf from the same species. Our method ranked best under simulated data sets, in spite of ignoring branch lengths, and performed well on empirical data, as well as being fast enough to analyze relatively large data sets. Our Bayesian supertree method was also very successful in obtaining better estimates of gene trees, by reducing the uncertainty in their distributions. In addition, our results show that under complex simulation scenarios, gene tree parsimony is also a competitive approach once we consider its speed, in contrast to more sophisticated models. PMID:25281847
Bayesian network models for error detection in radiotherapy plans
NASA Astrophysics Data System (ADS)
Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
Additive Genetic Variability and the Bayesian Alphabet
Gianola, Daniel; de los Campos, Gustavo; Hill, William G.; Manfredi, Eduardo; Fernando, Rohan
2009-01-01
The use of all available molecular markers in statistical models for prediction of quantitative traits has led to what could be termed a genomic-assisted selection paradigm in animal and plant breeding. This article provides a critical review of some theoretical and statistical concepts in the context of genomic-assisted genetic evaluation of animals and crops. First, relationships between the (Bayesian) variance of marker effects in some regression models and additive genetic variance are examined under standard assumptions. Second, the connection between marker genotypes and resemblance between relatives is explored, and linkages between a marker-based model and the infinitesimal model are reviewed. Third, issues associated with the use of Bayesian models for marker-assisted selection, with a focus on the role of the priors, are examined from a theoretical angle. The sensitivity of a Bayesian specification that has been proposed (called “Bayes A”) with respect to priors is illustrated with a simulation. Methods that can solve potential shortcomings of some of these Bayesian regression procedures are discussed briefly. PMID:19620397
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-07-01
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.
2016-12-01
We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We will provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multi-variate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian-networks make predictions that propagate errors, conduct inference via Bayes rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gage the strength of tests while applications to coastal resource management are aimed at decision-support, where the probabilities of desired ecosystems outcomes are evaluated. The range of Bayesian-network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We will discuss the progression of our experiments, which has included testing whether the Bayesian-network approach can be implemented and is appropriate for addressing basic and applied scientific problems and evaluating the hindcast and forecast skill of these implementations. We will present and discuss calibration/validation tests that are used to assess the robustness of Bayesian-network models and we will compare these results to tests of other models. This will demonstrate how Bayesian networks are used to extract new insights about coastal morphologic behavior, assess impacts to societal and ecological systems, and communicate probabilistic predictions to decision makers.
Comparison of dark energy models: A perspective from the latest observational data
NASA Astrophysics Data System (ADS)
Li, Miao; Li, Xiaodong; Zhang, Xin
2010-09-01
We compare some popular dark energy models under the assumption of a flat universe by using the latest observational data including the type Ia supernovae Constitution compilation, the baryon acoustic oscillation measurement from the Sloan Digital Sky Survey, the cosmic microwave background measurement given by the seven-year Wilkinson Microwave Anisotropy Probe observations and the determination of H 0 from the Hubble Space Telescope. Model comparison statistics such as the Bayesian and Akaike information criteria are applied to assess the worth of the models. These statistics favor models that give a good fit with fewer parameters. Based on this analysis, we find that the simplest cosmological constant model that has only one free parameter is still preferred by the current data. For other dynamical dark energy models, we find that some of them, such as the α dark energy, constant w, generalized Chaplygin gas, Chevalliear-Polarski-Linder parametrization, and holographic dark energy models, can provide good fits to the current data, and three of them, namely, the Ricci dark energy, agegraphic dark energy, and Dvali-Gabadadze-Porrati models, are clearly disfavored by the data.
A comment on priors for Bayesian occupancy models.
Northrup, Joseph M; Gerber, Brian D
2018-01-01
Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are "uninformative" or "vague", such priors can easily be unintentionally highly informative. Here we report on how the specification of a "vague" normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marzouk, Youssef
Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computional expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesianmore » inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale {\\em sequential} data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.« less
Bayesian estimation inherent in a Mexican-hat-type neural network
NASA Astrophysics Data System (ADS)
Takiyama, Ken
2016-05-01
Brain functions, such as perception, motor control and learning, and decision making, have been explained based on a Bayesian framework, i.e., to decrease the effects of noise inherent in the human nervous system or external environment, our brain integrates sensory and a priori information in a Bayesian optimal manner. However, it remains unclear how Bayesian computations are implemented in the brain. Herein, I address this issue by analyzing a Mexican-hat-type neural network, which was used as a model of the visual cortex, motor cortex, and prefrontal cortex. I analytically demonstrate that the dynamics of an order parameter in the model corresponds exactly to a variational inference of a linear Gaussian state-space model, a Bayesian estimation, when the strength of recurrent synaptic connectivity is appropriately stronger than that of an external stimulus, a plausible condition in the brain. This exact correspondence can reveal the relationship between the parameters in the Bayesian estimation and those in the neural network, providing insight for understanding brain functions.
Bayesian Modeling of Exposure and Airflow Using Two-Zone Models
Zhang, Yufen; Banerjee, Sudipto; Yang, Rui; Lungu, Claudiu; Ramachandran, Gurumurthy
2009-01-01
Mathematical modeling is being increasingly used as a means for assessing occupational exposures. However, predicting exposure in real settings is constrained by lack of quantitative knowledge of exposure determinants. Validation of models in occupational settings is, therefore, a challenge. Not only do the model parameters need to be known, the models also need to predict the output with some degree of accuracy. In this paper, a Bayesian statistical framework is used for estimating model parameters and exposure concentrations for a two-zone model. The model predicts concentrations in a zone near the source and far away from the source as functions of the toluene generation rate, air ventilation rate through the chamber, and the airflow between near and far fields. The framework combines prior or expert information on the physical model along with the observed data. The framework is applied to simulated data as well as data obtained from the experiments conducted in a chamber. Toluene vapors are generated from a source under different conditions of airflow direction, the presence of a mannequin, and simulated body heat of the mannequin. The Bayesian framework accounts for uncertainty in measurement as well as in the unknown rate of airflow between the near and far fields. The results show that estimates of the interzonal airflow are always close to the estimated equilibrium solutions, which implies that the method works efficiently. The predictions of near-field concentration for both the simulated and real data show nice concordance with the true values, indicating that the two-zone model assumptions agree with the reality to a large extent and the model is suitable for predicting the contaminant concentration. Comparison of the estimated model and its margin of error with the experimental data thus enables validation of the physical model assumptions. The approach illustrates how exposure models and information on model parameters together with the knowledge of uncertainty and variability in these quantities can be used to not only provide better estimates of model outputs but also model parameters. PMID:19403840
2016-10-01
and implementation of embedded, adaptive feedback and performance assessment. The investigators also initiated work designing a Bayesian Belief ...training; Teamwork; Adaptive performance; Leadership; Simulation; Modeling; Bayesian belief networks (BBN) 16. SECURITY CLASSIFICATION OF: 17. LIMITATION...Trauma teams Team training Teamwork Adaptability Adaptive performance Leadership Simulation Modeling Bayesian belief networks (BBN) 6
ERIC Educational Resources Information Center
West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.
2010-01-01
A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…
Nonparametric Bayesian Modeling for Automated Database Schema Matching
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferragut, Erik M; Laska, Jason A
2015-01-01
The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
Development of dynamic Bayesian models for web application test management
NASA Astrophysics Data System (ADS)
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations for seismic sources and structures in the Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery
Ekins, Sean; Reynolds, Robert C.; Kim, Hiyun; Koo, Mi-Sun; Ekonomidis, Marilyn; Talaue, Meliza; Paget, Steve D.; Woolhiser, Lisa K.; Lenaerts, Anne J.; Bunin, Barry A.; Connell, Nancy; Freundlich, Joel S.
2013-01-01
SUMMARY Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data, to experimentally validate virtual screening approach employing Bayesian models built with bioactivity information (single-event model) as well as bioactivity and cytotoxicity information (dual-event model). We virtually screen a commercial library and experimentally confirm actives with hit rates exceeding typical HTS results by 1-2 orders of magnitude. The first dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery. PMID:23521795
Bayesian data analysis for newcomers.
Kruschke, John K; Liddell, Torrin M
2018-02-01
This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
Carvajal, Guido; Roser, David J; Sisson, Scott A; Keegan, Alexandra; Khan, Stuart J
2015-11-15
Risk management for wastewater treatment and reuse have led to growing interest in understanding and optimising pathogen reduction during biological treatment processes. However, modelling pathogen reduction is often limited by poor characterization of the relationships between variables and incomplete knowledge of removal mechanisms. The aim of this paper was to assess the applicability of Bayesian belief network models to represent associations between pathogen reduction, and operating conditions and monitoring parameters and predict AS performance. Naïve Bayes and semi-naïve Bayes networks were constructed from an activated sludge dataset including operating and monitoring parameters, and removal efficiencies for two pathogens (native Giardia lamblia and seeded Cryptosporidium parvum) and five native microbial indicators (F-RNA bacteriophage, Clostridium perfringens, Escherichia coli, coliforms and enterococci). First we defined the Bayesian network structures for the two pathogen log10 reduction values (LRVs) class nodes discretized into two states (< and ≥ 1 LRV) using two different learning algorithms. Eight metrics, such as Prediction Accuracy (PA) and Area Under the receiver operating Curve (AUC), provided a comparison of model prediction performance, certainty and goodness of fit. This comparison was used to select the optimum models. The optimum Tree Augmented naïve models predicted removal efficiency with high AUC when all system parameters were used simultaneously (AUCs for C. parvum and G. lamblia LRVs of 0.95 and 0.87 respectively). However, metrics for individual system parameters showed only the C. parvum model was reliable. By contrast individual parameters for G. lamblia LRV prediction typically obtained low AUC scores (AUC < 0.81). Useful predictors for C. parvum LRV included solids retention time, turbidity and total coliform LRV. The methodology developed appears applicable for predicting pathogen removal efficiency in water treatment systems generally. Copyright © 2015 Elsevier Ltd. All rights reserved.
Evaluation of calibration efficacy under different levels of uncertainty
Heo, Yeonsook; Graziano, Diane J.; Guzowski, Leah; ...
2014-06-10
This study examines how calibration performs under different levels of uncertainty in model input data. It specifically assesses the efficacy of Bayesian calibration to enhance the reliability of EnergyPlus model predictions. A Bayesian approach can be used to update uncertain values of parameters, given measured energy-use data, and to quantify the associated uncertainty.We assess the efficacy of Bayesian calibration under a controlled virtual-reality setup, which enables rigorous validation of the accuracy of calibration results in terms of both calibrated parameter values and model predictions. Case studies demonstrate the performance of Bayesian calibration of base models developed from audit data withmore » differing levels of detail in building design, usage, and operation.« less
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism As a Process of Bayesian Inference
Campbell, John O.
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an “experiment” in the external world environment, and the results of that “experiment” or the “surprise” entailed by predicted and actual outcomes of the “experiment.” Minimization of free energy implies that the implicit measure of “surprise” experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
A Bayesian hierarchical diffusion model decomposition of performance in Approach–Avoidance Tasks
Krypotos, Angelos-Miltiadis; Beckers, Tom; Kindt, Merel; Wagenmakers, Eric-Jan
2015-01-01
Common methods for analysing response time (RT) tasks, frequently used across different disciplines of psychology, suffer from a number of limitations such as the failure to directly measure the underlying latent processes of interest and the inability to take into account the uncertainty associated with each individual's point estimate of performance. Here, we discuss a Bayesian hierarchical diffusion model and apply it to RT data. This model allows researchers to decompose performance into meaningful psychological processes and to account optimally for individual differences and commonalities, even with relatively sparse data. We highlight the advantages of the Bayesian hierarchical diffusion model decomposition by applying it to performance on Approach–Avoidance Tasks, widely used in the emotion and psychopathology literature. Model fits for two experimental data-sets demonstrate that the model performs well. The Bayesian hierarchical diffusion model overcomes important limitations of current analysis procedures and provides deeper insight in latent psychological processes of interest. PMID:25491372
Shen, Yanna; Cooper, Gregory F
2012-09-01
This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Tutorial: Asteroseismic Data Analysis with DIAMONDS
NASA Astrophysics Data System (ADS)
Corsaro, Enrico
Since the advent of the space-based photometric missions such as CoRoT and NASA's Kepler, asteroseismology has acquired a central role in our understanding about stellar physics. The Kepler spacecraft, especially, is still releasing excellent photometric observations that contain a large amount of information not yet investigated. For exploiting the full potential of these data, sophisticated and robust analysis tools are now essential, so that further constraining of stellar structure and evolutionary models can be obtained. In addition, extracting detailed asteroseismic properties for many stars can yield new insights on their correlations to fundamental stellar properties and dynamics. After a brief introduction to the Bayesian notion of probability, I describe the code Diamonds for Bayesian parameter estimation and model comparison by means of the nested sampling Monte Carlo (NSMC) algorithm. NSMC constitutes an efficient and powerful method, in replacement to standard Markov chain Monte Carlo, very suitable for high-dimensional and multimodal problems that are typical of detailed asteroseismic analyses, such as the fitting and mode identification of individual oscillation modes in stars (known as peak-bagging). Diamonds is able to provide robust results for statistical inferences involving tens of individual oscillation modes, while at the same time preserving a considerable computational efficiency for identifying the solution. In the tutorial, I will present the fitting of the stellar background signal and the peak-bagging analysis of the oscillation modes in a red-giant star, providing an example to use Bayesian evidence for assessing the peak significance of the fitted oscillation peaks.
Pressman, Alice R; Avins, Andrew L; Hubbard, Alan; Satariano, William A
2011-07-01
There is a paucity of literature comparing Bayesian analytic techniques with traditional approaches for analyzing clinical trials using real trial data. We compared Bayesian and frequentist group sequential methods using data from two published clinical trials. We chose two widely accepted frequentist rules, O'Brien-Fleming and Lan-DeMets, and conjugate Bayesian priors. Using the nonparametric bootstrap, we estimated a sampling distribution of stopping times for each method. Because current practice dictates the preservation of an experiment-wise false positive rate (Type I error), we approximated these error rates for our Bayesian and frequentist analyses with the posterior probability of detecting an effect in a simulated null sample. Thus for the data-generated distribution represented by these trials, we were able to compare the relative performance of these techniques. No final outcomes differed from those of the original trials. However, the timing of trial termination differed substantially by method and varied by trial. For one trial, group sequential designs of either type dictated early stopping of the study. In the other, stopping times were dependent upon the choice of spending function and prior distribution. Results indicate that trialists ought to consider Bayesian methods in addition to traditional approaches for analysis of clinical trials. Though findings from this small sample did not demonstrate either method to consistently outperform the other, they did suggest the need to replicate these comparisons using data from varied clinical trials in order to determine the conditions under which the different methods would be most efficient. Copyright © 2011 Elsevier Inc. All rights reserved.
Pressman, Alice R.; Avins, Andrew L.; Hubbard, Alan; Satariano, William A.
2014-01-01
Background There is a paucity of literature comparing Bayesian analytic techniques with traditional approaches for analyzing clinical trials using real trial data. Methods We compared Bayesian and frequentist group sequential methods using data from two published clinical trials. We chose two widely accepted frequentist rules, O'Brien–Fleming and Lan–DeMets, and conjugate Bayesian priors. Using the nonparametric bootstrap, we estimated a sampling distribution of stopping times for each method. Because current practice dictates the preservation of an experiment-wise false positive rate (Type I error), we approximated these error rates for our Bayesian and frequentist analyses with the posterior probability of detecting an effect in a simulated null sample. Thus for the data-generated distribution represented by these trials, we were able to compare the relative performance of these techniques. Results No final outcomes differed from those of the original trials. However, the timing of trial termination differed substantially by method and varied by trial. For one trial, group sequential designs of either type dictated early stopping of the study. In the other, stopping times were dependent upon the choice of spending function and prior distribution. Conclusions Results indicate that trialists ought to consider Bayesian methods in addition to traditional approaches for analysis of clinical trials. Though findings from this small sample did not demonstrate either method to consistently outperform the other, they did suggest the need to replicate these comparisons using data from varied clinical trials in order to determine the conditions under which the different methods would be most efficient. PMID:21453792
Hadwin, Paul J; Peterson, Sean D
2017-04-01
The Bayesian framework for parameter inference provides a basis from which subject-specific reduced-order vocal fold models can be generated. Previously, it has been shown that a particle filter technique is capable of producing estimates and associated credibility intervals of time-varying reduced-order vocal fold model parameters. However, the particle filter approach is difficult to implement and has a high computational cost, which can be barriers to clinical adoption. This work presents an alternative estimation strategy based upon Kalman filtering aimed at reducing the computational cost of subject-specific model development. The robustness of this approach to Gaussian and non-Gaussian noise is discussed. The extended Kalman filter (EKF) approach is found to perform very well in comparison with the particle filter technique at dramatically lower computational cost. Based upon the test cases explored, the EKF is comparable in terms of accuracy to the particle filter technique when greater than 6000 particles are employed; if less particles are employed, the EKF actually performs better. For comparable levels of accuracy, the solution time is reduced by 2 orders of magnitude when employing the EKF. By virtue of the approximations used in the EKF, however, the credibility intervals tend to be slightly underpredicted.
Predicting the geographical distribution of two invasive termite species from occurrence data.
Tonini, Francesco; Divino, Fabio; Lasinio, Giovanna Jona; Hochmair, Hartwig H; Scheffrahn, Rudolf H
2014-10-01
Predicting the potential habitat of species under both current and future climate change scenarios is crucial for monitoring invasive species and understanding a species' response to different environmental conditions. Frequently, the only data available on a species is the location of its occurrence (presence-only data). Using occurrence records only, two models were used to predict the geographical distribution of two destructive invasive termite species, Coptotermes gestroi (Wasmann) and Coptotermes formosanus Shiraki. The first model uses a Bayesian linear logistic regression approach adjusted for presence-only data while the second one is the widely used maximum entropy approach (Maxent). Results show that the predicted distributions of both C. gestroi and C. formosanus are strongly linked to urban development. The impact of future scenarios such as climate warming and population growth on the biotic distribution of both termite species was also assessed. Future climate warming seems to affect their projected probability of presence to a lesser extent than population growth. The Bayesian logistic approach outperformed Maxent consistently in all models according to evaluation criteria such as model sensitivity and ecological realism. The importance of further studies for an explicit treatment of residual spatial autocorrelation and a more comprehensive comparison between both statistical approaches is suggested.
Application of Dynamic naïve Bayesian classifier to comprehensive drought assessment
NASA Astrophysics Data System (ADS)
Park, D. H.; Lee, J. Y.; Lee, J. H.; KIm, T. W.
2017-12-01
Drought monitoring has already been extensively studied due to the widespread impacts and complex causes of drought. The most important component of drought monitoring is to estimate the characteristics and extent of drought by quantitatively measuring the characteristics of drought. Drought assessment considering different aspects of the complicated drought condition and uncertainty of drought index is great significance in accurate drought monitoring. This study used the dynamic Naïve Bayesian Classifier (DNBC) which is an extension of the Hidden Markov Model (HMM), to model and classify drought by using various drought indices for integrated drought assessment. To provide a stable model for combined use of multiple drought indices, this study employed the DNBC to perform multi-index drought assessment by aggregating the effect of different type of drought and considering the inherent uncertainty. Drought classification was performed by the DNBC using several drought indices: Standardized Precipitation Index (SPI), Streamflow Drought Index (SDI), and Normalized Vegetation Supply Water Index (NVSWI)) that reflect meteorological, hydrological, and agricultural drought characteristics. Overall results showed that in comparison unidirectional (SPI, SDI, and NVSWI) or multivariate (Composite Drought Index, CDI) drought assessment, the proposed DNBC was able to synthetically classify of drought considering uncertainty. Model provided method for comprehensive drought assessment with combined use of different drought indices.
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
Comparison of two integration methods for dynamic causal modeling of electrophysiological data.
Lemaréchal, Jean-Didier; George, Nathalie; David, Olivier
2018-06-01
Dynamic causal modeling (DCM) is a methodological approach to study effective connectivity among brain regions. Based on a set of observations and a biophysical model of brain interactions, DCM uses a Bayesian framework to estimate the posterior distribution of the free parameters of the model (e.g. modulation of connectivity) and infer architectural properties of the most plausible model (i.e. model selection). When modeling electrophysiological event-related responses, the estimation of the model relies on the integration of the system of delay differential equations (DDEs) that describe the dynamics of the system. In this technical note, we compared two numerical schemes for the integration of DDEs. The first, and standard, scheme approximates the DDEs (more precisely, the state of the system, with respect to conduction delays among brain regions) using ordinary differential equations (ODEs) and solves it with a fixed step size. The second scheme uses a dedicated DDEs solver with adaptive step sizes to control error, making it theoretically more accurate. To highlight the effects of the approximation used by the first integration scheme in regard to parameter estimation and Bayesian model selection, we performed simulations of local field potentials using first, a simple model comprising 2 regions and second, a more complex model comprising 6 regions. In these simulations, the second integration scheme served as the standard to which the first one was compared. Then, the performances of the two integration schemes were directly compared by fitting a public mismatch negativity EEG dataset with different models. The simulations revealed that the use of the standard DCM integration scheme was acceptable for Bayesian model selection but underestimated the connectivity parameters and did not allow an accurate estimation of conduction delays. Fitting to empirical data showed that the models systematically obtained an increased accuracy when using the second integration scheme. We conclude that inference on connectivity strength and delay based on DCM for EEG/MEG requires an accurate integration scheme. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.
2015-12-01
Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
ERIC Educational Resources Information Center
Rindskopf, David
2012-01-01
Muthen and Asparouhov (2012) made a strong case for the advantages of Bayesian methodology in factor analysis and structural equation models. I show additional extensions and adaptations of their methods and show how non-Bayesians can take advantage of many (though not all) of these advantages by using interval restrictions on parameters. By…
ERIC Educational Resources Information Center
Marcoulides, Katerina M.
2018-01-01
This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
ERIC Educational Resources Information Center
Sebro, Negusse Yohannes; Goshu, Ayele Taye
2017-01-01
This study aims to explore Bayesian multilevel modeling to investigate variations of average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. Bayesian method is used to estimate the fixed and random effects. Input and…
Bayesian estimation of differential transcript usage from RNA-seq data.
Papastamoulis, Panagiotis; Rattray, Magnus
2017-11-27
Next generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU) and targets changes in the relative within gene expression of a transcript. The contribution of this paper is to: (a) extend the use of cjBitSeq to the DTU context, a previously introduced Bayesian model which is originally designed for identifying changes in overall expression levels and (b) propose a Bayesian version of DRIMSeq, a frequentist model for inferring DTU. cjBitSeq is a read based model and performs fully Bayesian inference by MCMC sampling on the space of latent state of each transcript per gene. BayesDRIMSeq is a count based model and estimates the Bayes Factor of a DTU model against a null model using Laplace's approximation. The proposed models are benchmarked against the existing ones using a recent independent simulation study as well as a real RNA-seq dataset. Our results suggest that the Bayesian methods exhibit similar performance with DRIMSeq in terms of precision/recall but offer better calibration of False Discovery Rate.
Development of uncertainty-based work injury model using Bayesian structural equation modelling.
Chatterjee, Snehamoy
2014-01-01
This paper proposed a Bayesian method-based structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distribution for factor loading parameters and structural parameters of SEM. In the first approach, the prior distributions were considered as a fixed distribution function with specific parameter values, whereas, in the second approach, prior distributions of the parameters were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayesian rule. The Markov Chain Monte Carlo sampling in the form Gibbs sampling was applied for sampling from the posterior distribution. The results revealed that all coefficients of structural and measurement model parameters are statistically significant in experts' opinion-based priors, whereas, two coefficients are not statistically significant when fixed prior-based distributions are applied. The error statistics reveals that Bayesian structural model provides reasonably good fit of work injury with high coefficient of determination (0.91) and less mean squared error as compared to traditional SEM.
BUMPER: the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction
NASA Astrophysics Data System (ADS)
Holden, Phil; Birks, John; Brooks, Steve; Bush, Mark; Hwang, Grace; Matthews-Bird, Frazer; Valencia, Bryan; van Woesik, Robert
2017-04-01
We describe the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. The principal motivation for a Bayesian approach is that the palaeoenvironment is treated probabilistically, and can be updated as additional data become available. Bayesian approaches therefore provide a reconstruction-specific quantification of the uncertainty in the data and in the model parameters. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring 2 seconds to build a 100-taxon model from a 100-site training-set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training-sets under ideal assumptions. We then use these to demonstrate both the general applicability of the model and the sensitivity of reconstructions to the characteristics of the training-set, considering assemblage richness, taxon tolerances, and the number of training sites. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. In all of these applications an identically configured model is used, the only change being the input files that provide the training-set environment and taxon-count data.
Shah, A A; Xing, W W; Triantafyllidis, V
2017-04-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
Xing, W. W.; Triantafyllidis, V.
2017-01-01
In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327
NASA Astrophysics Data System (ADS)
Park, K.-R.; Kim, K.-h.; Kwak, S.; Svensson, J.; Lee, J.; Ghim, Y.-c.
2017-11-01
Feasibility study of direct spectra measurements of Thomson scattered photons for fusion-grade plasmas is performed based on a forward model of the KSTAR Thomson scattering system. Expected spectra in the forward model are calculated based on Selden function including the relativistic polarization correction. Noise in the signal is modeled with photon noise and Gaussian electrical noise. Electron temperature and density are inferred using Bayesian probability theory. Based on bias error, full width at half maximum and entropy of posterior distributions, spectral measurements are found to be feasible. Comparisons between spectrometer-based and polychromator-based Thomson scattering systems are performed with varying quantum efficiency and electrical noise levels.
Application of bayesian networks to real-time flood risk estimation
NASA Astrophysics Data System (ADS)
Garrote, L.; Molina, M.; Blasco, G.
2003-04-01
This paper presents the application of a computational paradigm taken from the field of artificial intelligence - the bayesian network - to model the behaviour of hydrologic basins during floods. The final goal of this research is to develop representation techniques for hydrologic simulation models in order to define, develop and validate a mechanism, supported by a software environment, oriented to build decision models for the prediction and management of river floods in real time. The emphasis is placed on providing decision makers with tools to incorporate their knowledge of basin behaviour, usually formulated in terms of rainfall-runoff models, in the process of real-time decision making during floods. A rainfall-runoff model is only a step in the process of decision making. If a reliable rainfall forecast is available and the rainfall-runoff model is well calibrated, decisions can be based mainly on model results. However, in most practical situations, uncertainties in rainfall forecasts or model performance have to be incorporated in the decision process. The computation paradigm adopted for the simulation of hydrologic processes is the bayesian network. A bayesian network is a directed acyclic graph that represents causal influences between linked variables. Under this representation, uncertain qualitative variables are related through causal relations quantified with conditional probabilities. The solution algorithm allows the computation of the expected probability distribution of unknown variables conditioned to the observations. An approach to represent hydrologic processes by bayesian networks with temporal and spatial extensions is presented in this paper, together with a methodology for the development of bayesian models using results produced by deterministic hydrologic simulation models
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.
Bayesian Modeling of a Human MMORPG Player
NASA Astrophysics Data System (ADS)
Synnaeve, Gabriel; Bessière, Pierre
2011-03-01
This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and to select which target in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.
A comment on priors for Bayesian occupancy models
Gerber, Brian D.
2018-01-01
Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are “uninformative” or “vague”, such priors can easily be unintentionally highly informative. Here we report on how the specification of a “vague” normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts. PMID:29481554
Accounting for uncertainty in health economic decision models by using model averaging.
Jackson, Christopher H; Thompson, Simon G; Sharples, Linda D
2009-04-01
Health economic decision models are subject to considerable uncertainty, much of which arises from choices between several plausible model structures, e.g. choices of covariates in a regression model. Such structural uncertainty is rarely accounted for formally in decision models but can be addressed by model averaging. We discuss the most common methods of averaging models and the principles underlying them. We apply them to a comparison of two surgical techniques for repairing abdominal aortic aneurysms. In model averaging, competing models are usually either weighted by using an asymptotically consistent model assessment criterion, such as the Bayesian information criterion, or a measure of predictive ability, such as Akaike's information criterion. We argue that the predictive approach is more suitable when modelling the complex underlying processes of interest in health economics, such as individual disease progression and response to treatment.
The Bayesian boom: good thing or bad?
Hahn, Ulrike
2014-01-01
A series of high-profile critiques of Bayesian models of cognition have recently sparked controversy. These critiques question the contribution of rational, normative considerations in the study of cognition. The present article takes central claims from these critiques and evaluates them in light of specific models. Closer consideration of actual examples of Bayesian treatments of different cognitive phenomena allows one to defuse these critiques showing that they cannot be sustained across the diversity of applications of the Bayesian framework for cognitive modeling. More generally, there is nothing in the Bayesian framework that would inherently give rise to the deficits that these critiques perceive, suggesting they have been framed at the wrong level of generality. At the same time, the examples are used to demonstrate the different ways in which consideration of rationality uniquely benefits both theory and practice in the study of cognition. PMID:25152738
A comprehensive probabilistic analysis model of oil pipelines network based on Bayesian network
NASA Astrophysics Data System (ADS)
Zhang, C.; Qin, T. X.; Jiang, B.; Huang, C.
2018-02-01
Oil pipelines network is one of the most important facilities of energy transportation. But oil pipelines network accident may result in serious disasters. Some analysis models for these accidents have been established mainly based on three methods, including event-tree, accident simulation and Bayesian network. Among these methods, Bayesian network is suitable for probabilistic analysis. But not all the important influencing factors are considered and the deployment rule of the factors has not been established. This paper proposed a probabilistic analysis model of oil pipelines network based on Bayesian network. Most of the important influencing factors, including the key environment condition and emergency response are considered in this model. Moreover, the paper also introduces a deployment rule for these factors. The model can be used in probabilistic analysis and sensitive analysis of oil pipelines network accident.
Hippert, Henrique S; Taylor, James W
2010-04-01
Artificial neural networks have frequently been proposed for electricity load forecasting because of their capabilities for the nonlinear modelling of large multivariate data sets. Modelling with neural networks is not an easy task though; two of the main challenges are defining the appropriate level of model complexity, and choosing the input variables. This paper evaluates techniques for automatic neural network modelling within a Bayesian framework, as applied to six samples containing daily load and weather data for four different countries. We analyse input selection as carried out by the Bayesian 'automatic relevance determination', and the usefulness of the Bayesian 'evidence' for the selection of the best structure (in terms of number of neurones), as compared to methods based on cross-validation. Copyright 2009 Elsevier Ltd. All rights reserved.
Toribo, S.G.; Gray, B.R.; Liang, S.
2011-01-01
The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
Bayesian analysis of CCDM models
NASA Astrophysics Data System (ADS)
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3αH0 model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
Refining value-at-risk estimates using a Bayesian Markov-switching GJR-GARCH copula-EVT model.
Sampid, Marius Galabe; Hasim, Haslifah M; Dai, Hongsheng
2018-01-01
In this paper, we propose a model for forecasting Value-at-Risk (VaR) using a Bayesian Markov-switching GJR-GARCH(1,1) model with skewed Student's-t innovation, copula functions and extreme value theory. A Bayesian Markov-switching GJR-GARCH(1,1) model that identifies non-constant volatility over time and allows the GARCH parameters to vary over time following a Markov process, is combined with copula functions and EVT to formulate the Bayesian Markov-switching GJR-GARCH(1,1) copula-EVT VaR model, which is then used to forecast the level of risk on financial asset returns. We further propose a new method for threshold selection in EVT analysis, which we term the hybrid method. Empirical and back-testing results show that the proposed VaR models capture VaR reasonably well in periods of calm and in periods of crisis.
Comparing interval estimates for small sample ordinal CFA models
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Comparing interval estimates for small sample ordinal CFA models.
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.
Bayesian demography 250 years after Bayes
Bijak, Jakub; Bryant, John
2016-01-01
Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow for including additional (prior) information next to the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889
A Bayesian approach to meta-analysis of plant pathology studies.
Mila, A L; Ngugi, H K
2011-01-01
Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard) which was evaluated only in seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods, and allow for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.
A nanomaterial release model for waste shredding using a Bayesian belief network
NASA Astrophysics Data System (ADS)
Shandilya, Neeraj; Ligthart, Tom; van Voorde, Imelda; Stahlmecke, Burkhard; Clavaguera, Simon; Philippot, Cecile; Ding, Yaobo; Goede, Henk
2018-02-01
The shredding of waste of electrical and electronic equipment (WEEE) and other products, incorporated with nanomaterials, can lead to a substantial release of nanomaterials. Considering the uncertainty, complexity, and scarcity of experimental data on release, we present the development of a Bayesian belief network (BBN) model. This baseline model aims to give a first prediction of the release of nanomaterials (excluding nanofibers) during their mechanical shredding. With a focus on the description of the model development methodology, we characterize nanomaterial release in terms of number, size, mass, and composition of released particles. Through a sensitivity analysis of the model, we find the material-specific parameters like affinity of nanomaterials to the matrix of the composite and their state of dispersion inside the matrix to reduce the nanomaterial release up to 50%. The shredder-specific parameters like number of shafts in a shredder and input and output size of the material for shredding could minimize it up to 98%. The comparison with two experimental test cases shows promising outcome on the prediction capacity of the model. As additional experimental data on nanomaterial release becomes available, the model is able to further adapt and update risk forecasts. When adapting the model with additional expert beliefs, experts should be selected using criteria, e.g., substantial contribution to nanomaterial and/or particulate matter release-related scientific literature, the capacity and willingness to contribute to further development of the BBN model, and openness to accepting deviating opinions. [Figure not available: see fulltext.
NASA Astrophysics Data System (ADS)
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Li, Yan; Li, Xiang; Ma, Weiya; Dong, Zigang
2014-08-12
The epidermal growth factor receptor (EGFR) is aberrantly activated in various cancer cells and an important target for cancer treatment. Deep understanding of EGFR conformational changes between the active and inactive states is of pharmaceutical interest. Here we present a strategy combining multiply targeted molecular dynamics simulations, unbiased molecular dynamics simulations, and Bayesian clustering to investigate transition pathways during the activation/inactivation process of EGFR kinase domain. Two distinct pathways between the active and inactive forms are designed, explored, and compared. Based on Bayesian clustering and rough two-dimensional free energy surfaces, the energy-favorable pathway is recognized, though DFG-flip happens in both pathways. In addition, another pathway with different intermediate states appears in our simulations. Comparison of distinct pathways also indicates that disruption of the Lys745-Glu762 interaction is critically important in DFG-flip while movement of the A-loop significantly facilitates the conformational change. Our simulations yield new insights into EGFR conformational transitions. Moreover, our results verify that this approach is valid and efficient in sampling of protein conformational changes and comparison of distinct pathways.
A Bayesian Assessment of Seismic Semi-Periodicity Forecasts
NASA Astrophysics Data System (ADS)
Nava, F.; Quinteros, C.; Glowacka, E.; Frez, J.
2016-01-01
Among the schemes for earthquake forecasting, the search for semi-periodicity during large earthquakes in a given seismogenic region plays an important role. When considering earthquake forecasts based on semi-periodic sequence identification, the Bayesian formalism is a useful tool for: (1) assessing how well a given earthquake satisfies a previously made forecast; (2) re-evaluating the semi-periodic sequence probability; and (3) testing other prior estimations of the sequence probability. A comparison of Bayesian estimates with updated estimates of semi-periodic sequences that incorporate new data not used in the original estimates shows extremely good agreement, indicating that: (1) the probability that a semi-periodic sequence is not due to chance is an appropriate estimate for the prior sequence probability estimate; and (2) the Bayesian formalism does a very good job of estimating corrected semi-periodicity probabilities, using slightly less data than that used for updated estimates. The Bayesian approach is exemplified explicitly by its application to the Parkfield semi-periodic forecast, and results are given for its application to other forecasts in Japan and Venezuela.
Li, Baoyue; Lingsma, Hester F; Steyerberg, Ewout W; Lesaffre, Emmanuel
2011-05-23
Logistic random effects models are a popular tool to analyze multilevel also called hierarchical data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR, Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC.Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain.
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Tenenbaum, Joshua B.
2011-01-01
Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…
Variations on Bayesian Prediction and Inference
2016-05-09
inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Posterior Predictive Model Checking in Bayesian Networks
ERIC Educational Resources Information Center
Crawford, Aaron
2014-01-01
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
Metrics for evaluating performance and uncertainty of Bayesian network models
Bruce G. Marcot
2012-01-01
This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.
Assessment of parametric uncertainty for groundwater reactive transport modeling,
Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun
2014-01-01
The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.
Liu, Feng; Walters, Stephen J; Julious, Steven A
2017-10-02
It is important to quantify the dose response for a drug in phase 2a clinical trials so the optimal doses can then be selected for subsequent late phase trials. In a phase 2a clinical trial of new lead drug being developed for the treatment of rheumatoid arthritis (RA), a U-shaped dose response curve was observed. In the light of this result further research was undertaken to design an efficient phase 2a proof of concept (PoC) trial for a follow-on compound using the lessons learnt from the lead compound. The planned analysis for the Phase 2a trial for GSK123456 was a Bayesian Emax model which assumes the dose-response relationship follows a monotonic sigmoid "S" shaped curve. This model was found to be suboptimal to model the U-shaped dose response observed in the data from this trial and alternatives approaches were needed to be considered for the next compound for which a Normal dynamic linear model (NDLM) is proposed. This paper compares the statistical properties of the Bayesian Emax model and NDLM model and both models are evaluated using simulation in the context of adaptive Phase 2a PoC design under a variety of assumed dose response curves: linear, Emax model, U-shaped model, and flat response. It is shown that the NDLM method is flexible and can handle a wide variety of dose-responses, including monotonic and non-monotonic relationships. In comparison to the NDLM model the Emax model excelled with higher probability of selecting ED90 and smaller average sample size, when the true dose response followed Emax like curve. In addition, the type I error, probability of incorrectly concluding a drug may work when it does not, is inflated with the Bayesian NDLM model in all scenarios which would represent a development risk to pharmaceutical company. The bias, which is the difference between the estimated effect from the Emax and NDLM models and the simulated value, is comparable if the true dose response follows a placebo like curve, an Emax like curve, or log linear shape curve under fixed dose allocation, no adaptive allocation, half adaptive and adaptive scenarios. The bias though is significantly increased for the Emax model if the true dose response follows a U-shaped curve. In most cases the Bayesian Emax model works effectively and efficiently, with low bias and good probability of success in case of monotonic dose response. However, if there is a belief that the dose response could be non-monotonic then the NDLM is the superior model to assess the dose response.
Prospective evaluation of a Bayesian model to predict organizational change.
Molfenter, Todd; Gustafson, Dave; Kilo, Chuck; Bhattacharya, Abhik; Olsson, Jesper
2005-01-01
This research examines a subjective Bayesian model's ability to predict organizational change outcomes and sustainability of those outcomes for project teams participating in a multi-organizational improvement collaborative.
Aerosol-type retrieval and uncertainty quantification from OMI data
NASA Astrophysics Data System (ADS)
Kauppi, Anu; Kolmonen, Pekka; Laine, Marko; Tamminen, Johanna
2017-11-01
We discuss uncertainty quantification for aerosol-type selection in satellite-based atmospheric aerosol retrieval. The retrieval procedure uses precalculated aerosol microphysical models stored in look-up tables (LUTs) and top-of-atmosphere (TOA) spectral reflectance measurements to solve the aerosol characteristics. The forward model approximations cause systematic differences between the modelled and observed reflectance. Acknowledging this model discrepancy as a source of uncertainty allows us to produce more realistic uncertainty estimates and assists the selection of the most appropriate LUTs for each individual retrieval.This paper focuses on the aerosol microphysical model selection and characterisation of uncertainty in the retrieved aerosol type and aerosol optical depth (AOD). The concept of model evidence is used as a tool for model comparison. The method is based on Bayesian inference approach, in which all uncertainties are described as a posterior probability distribution. When there is no single best-matching aerosol microphysical model, we use a statistical technique based on Bayesian model averaging to combine AOD posterior probability densities of the best-fitting models to obtain an averaged AOD estimate. We also determine the shared evidence of the best-matching models of a certain main aerosol type in order to quantify how plausible it is that it represents the underlying atmospheric aerosol conditions.The developed method is applied to Ozone Monitoring Instrument (OMI) measurements using a multiwavelength approach for retrieving the aerosol type and AOD estimate with uncertainty quantification for cloud-free over-land pixels. Several larger pixel set areas were studied in order to investigate the robustness of the developed method. We evaluated the retrieved AOD by comparison with ground-based measurements at example sites. We found that the uncertainty of AOD expressed by posterior probability distribution reflects the difficulty in model selection. The posterior probability distribution can provide a comprehensive characterisation of the uncertainty in this kind of problem for aerosol-type selection. As a result, the proposed method can account for the model error and also include the model selection uncertainty in the total uncertainty budget.
Wind power application research on the fusion of the determination and ensemble prediction
NASA Astrophysics Data System (ADS)
Lan, Shi; Lina, Xu; Yuzhu, Hao
2017-07-01
The fused product of wind speed for the wind farm is designed through the use of wind speed products of ensemble prediction from the European Centre for Medium-Range Weather Forecasts (ECMWF) and professional numerical model products on wind power based on Mesoscale Model5 (MM5) and Beijing Rapid Update Cycle (BJ-RUC), which are suitable for short-term wind power forecasting and electric dispatch. The single-valued forecast is formed by calculating the different ensemble statistics of the Bayesian probabilistic forecasting representing the uncertainty of ECMWF ensemble prediction. Using autoregressive integrated moving average (ARIMA) model to improve the time resolution of the single-valued forecast, and based on the Bayesian model averaging (BMA) and the deterministic numerical model prediction, the optimal wind speed forecasting curve and the confidence interval are provided. The result shows that the fusion forecast has made obvious improvement to the accuracy relative to the existing numerical forecasting products. Compared with the 0-24 h existing deterministic forecast in the validation period, the mean absolute error (MAE) is decreased by 24.3 % and the correlation coefficient (R) is increased by 12.5 %. In comparison with the ECMWF ensemble forecast, the MAE is reduced by 11.7 %, and R is increased 14.5 %. Additionally, MAE did not increase with the prolongation of the forecast ahead.
Comparison of Nurse Staffing Measurements in Staffing-Outcomes Research.
Park, Shin Hye; Blegen, Mary A; Spetz, Joanne; Chapman, Susan A; De Groot, Holly A
2015-01-01
Investigators have used a variety of operational definitions of nursing hours of care in measuring nurse staffing for health services research. However, little is known about which approach is best for nurse staffing measurement. To examine whether various nursing hours measures yield different model estimations when predicting patient outcomes and to determine the best method to measure nurse staffing based on the model estimations. We analyzed data from the University HealthSystem Consortium for 2005. The sample comprised 208 hospital-quarter observations from 54 hospitals, representing information on 971 adult-care units and about 1 million inpatient discharges. We compared regression models using different combinations of staffing measures based on productive/nonproductive and direct-care/indirect-care hours. Akaike Information Criterion and Bayesian Information Criterion were used in the assessment of staffing measure performance. The models that included the staffing measure calculated from productive hours by direct-care providers were best, in general. However, the Akaike Information Criterion and Bayesian Information Criterion differences between models were small, indicating that distinguishing nonproductive and indirect-care hours from productive direct-care hours does not substantially affect the approximation of the relationship between nurse staffing and patient outcomes. This study is the first to explicitly evaluate various measures of nurse staffing. Productive hours by direct-care providers are the strongest measure related to patient outcomes and thus should be preferred in research on nurse staffing and patient outcomes.
Bayesian uncertainty quantification in linear models for diffusion MRI.
Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans
2018-03-29
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
Matthews, Luke J.; Tehrani, Jamie J.; Jordan, Fiona M.; Collard, Mark; Nunn, Charles L.
2011-01-01
Background Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. Methods We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. Results For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. Conclusions The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture. PMID:21559083
A FAST BAYESIAN METHOD FOR UPDATING AND FORECASTING HOURLY OZONE LEVELS
A Bayesian hierarchical space-time model is proposed by combining information from real-time ambient AIRNow air monitoring data, and output from a computer simulation model known as the Community Multi-scale Air Quality (Eta-CMAQ) forecast model. A model validation analysis shows...
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
Bayesian methods for characterizing unknown parameters of material models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Gilet, Estelle; Diard, Julien; Bessière, Pierre
2011-01-01
In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Being a model of both perception and action processes, the purpose of this model is to study the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments. PMID:21674043
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.
Duarte, Belmiro P M; Wong, Weng Kee
2015-08-01
This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach
Duarte, Belmiro P. M.; Wong, Weng Kee
2014-01-01
Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159
A systematic review of Bayesian articles in psychology: The last 25 years.
van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah
2017-06-01
Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Eckstein, Miguel P; Mack, Stephen C; Liston, Dorion B; Bogush, Lisa; Menzel, Randolf; Krauzlis, Richard J
2013-06-07
Visual attention is commonly studied by using visuo-spatial cues indicating probable locations of a target and assessing the effect of the validity of the cue on perceptual performance and its neural correlates. Here, we adapt a cueing task to measure spatial cueing effects on the decisions of honeybees and compare their behavior to that of humans and monkeys in a similarly structured two-alternative forced-choice perceptual task. Unlike the typical cueing paradigm in which the stimulus strength remains unchanged within a block of trials, for the monkey and human studies we randomized the contrast of the signal to simulate more real world conditions in which the organism is uncertain about the strength of the signal. A Bayesian ideal observer that weights sensory evidence from cued and uncued locations based on the cue validity to maximize overall performance is used as a benchmark of comparison against the three animals and other suboptimal models: probability matching, ignore the cue, always follow the cue, and an additive bias/single decision threshold model. We find that the cueing effect is pervasive across all three species but is smaller in size than that shown by the Bayesian ideal observer. Humans show a larger cueing effect than monkeys and bees show the smallest effect. The cueing effect and overall performance of the honeybees allows rejection of the models in which the bees are ignoring the cue, following the cue and disregarding stimuli to be discriminated, or adopting a probability matching strategy. Stimulus strength uncertainty also reduces the theoretically predicted variation in cueing effect with stimulus strength of an optimal Bayesian observer and diminishes the size of the cueing effect when stimulus strength is low. A more biologically plausible model that includes an additive bias to the sensory response from the cued location, although not mathematically equivalent to the optimal observer for the case stimulus strength uncertainty, can approximate the benefits of the more computationally complex optimal Bayesian model. We discuss the implications of our findings on the field's common conceptualization of covert visual attention in the cueing task and what aspects, if any, might be unique to humans. Copyright © 2013 Elsevier Ltd. All rights reserved.
Safta, C.; Ricciuto, Daniel M.; Sargsyan, Khachik; ...
2015-07-01
In this paper we propose a probabilistic framework for an uncertainty quantification (UQ) study of a carbon cycle model and focus on the comparison between steady-state and transient simulation setups. A global sensitivity analysis (GSA) study indicates the parameters and parameter couplings that are important at different times of the year for quantities of interest (QoIs) obtained with the data assimilation linked ecosystem carbon (DALEC) model. We then employ a Bayesian approach and a statistical model error term to calibrate the parameters of DALEC using net ecosystem exchange (NEE) observations at the Harvard Forest site. The calibration results are employedmore » in the second part of the paper to assess the predictive skill of the model via posterior predictive checks.« less
Prior Design for Dependent Dirichlet Processes: An Application to Marathon Modeling
F. Pradier, Melanie; J. R. Ruiz, Francisco; Perez-Cruz, Fernando
2016-01-01
This paper presents a novel application of Bayesian nonparametrics (BNP) for marathon data modeling. We make use of two well-known BNP priors, the single-p dependent Dirichlet process and the hierarchical Dirichlet process, in order to address two different problems. First, we study the impact of age, gender and environment on the runners’ performance. We derive a fair grading method that allows direct comparison of runners regardless of their age and gender. Unlike current grading systems, our approach is based not only on top world records, but on the performances of all runners. The presented methodology for comparison of densities can be adopted in many other applications straightforwardly, providing an interesting perspective to build dependent Dirichlet processes. Second, we analyze the running patterns of the marathoners in time, obtaining information that can be valuable for training purposes. We also show that these running patterns can be used to predict finishing time given intermediate interval measurements. We apply our models to New York City, Boston and London marathons. PMID:26821155
Baker, Robert L; Leong, Wen Fung; An, Nan; Brock, Marcus T; Rubin, Matthew J; Welch, Stephen; Weinig, Cynthia
2018-02-01
We develop Bayesian function-valued trait models that mathematically isolate genetic mechanisms underlying leaf growth trajectories by factoring out genotype-specific differences in photosynthesis. Remote sensing data can be used instead of leaf-level physiological measurements. Characterizing the genetic basis of traits that vary during ontogeny and affect plant performance is a major goal in evolutionary biology and agronomy. Describing genetic programs that specifically regulate morphological traits can be complicated by genotypic differences in physiological traits. We describe the growth trajectories of leaves using novel Bayesian function-valued trait (FVT) modeling approaches in Brassica rapa recombinant inbred lines raised in heterogeneous field settings. While frequentist approaches estimate parameter values by treating each experimental replicate discretely, Bayesian models can utilize information in the global dataset, potentially leading to more robust trait estimation. We illustrate this principle by estimating growth asymptotes in the face of missing data and comparing heritabilities of growth trajectory parameters estimated by Bayesian and frequentist approaches. Using pseudo-Bayes factors, we compare the performance of an initial Bayesian logistic growth model and a model that incorporates carbon assimilation (A max ) as a cofactor, thus statistically accounting for genotypic differences in carbon resources. We further evaluate two remotely sensed spectroradiometric indices, photochemical reflectance (pri2) and MERIS Terrestrial Chlorophyll Index (mtci) as covariates in lieu of A max , because these two indices were genetically correlated with A max across years and treatments yet allow much higher throughput compared to direct leaf-level gas-exchange measurements. For leaf lengths in uncrowded settings, including A max improves model fit over the initial model. The mtci and pri2 indices also outperform direct A max measurements. Of particular importance for evolutionary biologists and plant breeders, hierarchical Bayesian models estimating FVT parameters improve heritabilities compared to frequentist approaches.
Bayesian Estimation of the Logistic Positive Exponent IRT Model
ERIC Educational Resources Information Center
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
The Misidentified Identifiability Problem of Bayesian Knowledge Tracing
ERIC Educational Resources Information Center
Doroudi, Shayan; Brunskill, Emma
2017-01-01
In this paper, we investigate two purported problems with Bayesian Knowledge Tracing (BKT), a popular statistical model of student learning: "identifiability" and "semantic model degeneracy." In 2007, Beck and Chang stated that BKT is susceptible to an "identifiability problem"--various models with different…
Common quandaries and their practical solutions in Bayesian network modeling
Bruce G. Marcot
2017-01-01
Use and popularity of Bayesian network (BN) modeling has greatly expanded in recent years, but many common problems remain. Here, I summarize key problems in BN model construction and interpretation,along with suggested practical solutions. Problems in BN model construction include parameterizing probability values, variable definition, complex network structures,...
On the Bayesian Nonparametric Generalization of IRT-Type Models
ERIC Educational Resources Information Center
San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel
2011-01-01
We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
Nowakowska, Marzena
2017-04-01
The development of the Bayesian logistic regression model classifying the road accident severity is discussed. The already exploited informative priors (method of moments, maximum likelihood estimation, and two-stage Bayesian updating), along with the original idea of a Boot prior proposal, are investigated when no expert opinion has been available. In addition, two possible approaches to updating the priors, in the form of unbalanced and balanced training data sets, are presented. The obtained logistic Bayesian models are assessed on the basis of a deviance information criterion (DIC), highest probability density (HPD) intervals, and coefficients of variation estimated for the model parameters. The verification of the model accuracy has been based on sensitivity, specificity and the harmonic mean of sensitivity and specificity, all calculated from a test data set. The models obtained from the balanced training data set have a better classification quality than the ones obtained from the unbalanced training data set. The two-stage Bayesian updating prior model and the Boot prior model, both identified with the use of the balanced training data set, outperform the non-informative, method of moments, and maximum likelihood estimation prior models. It is important to note that one should be careful when interpreting the parameters since different priors can lead to different models. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bayesian learning and the psychology of rule induction
Endress, Ansgar D.
2014-01-01
In recent years, Bayesian learning models have been applied to an increasing variety of domains. While such models have been criticized on theoretical grounds, the underlying assumptions and predictions are rarely made concrete and tested experimentally. Here, I use Frank and Tenenbaum's (2011) Bayesian model of rule-learning as a case study to spell out the underlying assumptions, and to confront them with the empirical results Frank and Tenenbaum (2011) propose to simulate, as well as with novel experiments. While rule-learning is arguably well suited to rational Bayesian approaches, I show that their models are neither psychologically plausible nor ideal observer models. Further, I show that their central assumption is unfounded: humans do not always preferentially learn more specific rules, but, at least in some situations, those rules that happen to be more salient. Even when granting the unsupported assumptions, I show that all of the experiments modeled by Frank and Tenenbaum (2011) either contradict their models, or have a large number of more plausible interpretations. I provide an alternative account of the experimental data based on simple psychological mechanisms, and show that this account both describes the data better, and is easier to falsify. I conclude that, despite the recent surge in Bayesian models of cognitive phenomena, psychological phenomena are best understood by developing and testing psychological theories rather than models that can be fit to virtually any data. PMID:23454791
Archambeau, Cédric; Verleysen, Michel
2007-01-01
A new variational Bayesian learning algorithm for Student-t mixture models is introduced. This algorithm leads to (i) robust density estimation, (ii) robust clustering and (iii) robust automatic model selection. Gaussian mixture models are learning machines which are based on a divide-and-conquer approach. They are commonly used for density estimation and clustering tasks, but are sensitive to outliers. The Student-t distribution has heavier tails than the Gaussian distribution and is therefore less sensitive to any departure of the empirical distribution from Gaussianity. As a consequence, the Student-t distribution is suitable for constructing robust mixture models. In this work, we formalize the Bayesian Student-t mixture model as a latent variable model in a different way from Svensén and Bishop [Svensén, M., & Bishop, C. M. (2005). Robust Bayesian mixture modelling. Neurocomputing, 64, 235-252]. The main difference resides in the fact that it is not necessary to assume a factorized approximation of the posterior distribution on the latent indicator variables and the latent scale variables in order to obtain a tractable solution. Not neglecting the correlations between these unobserved random variables leads to a Bayesian model having an increased robustness. Furthermore, it is expected that the lower bound on the log-evidence is tighter. Based on this bound, the model complexity, i.e. the number of components in the mixture, can be inferred with a higher confidence.
Advances in Bayesian Modeling in Educational Research
ERIC Educational Resources Information Center
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
A Bayesian model for estimating multi-state disease progression.
Shen, Shiwen; Han, Simon X; Petousis, Panayiotis; Weiss, Robert E; Meng, Frank; Bui, Alex A T; Hsu, William
2017-02-01
A growing number of individuals who are considered at high risk of cancer are now routinely undergoing population screening. However, noted harms such as radiation exposure, overdiagnosis, and overtreatment underscore the need for better temporal models that predict who should be screened and at what frequency. The mean sojourn time (MST), an average duration period when a tumor can be detected by imaging but with no observable clinical symptoms, is a critical variable for formulating screening policy. Estimation of MST has been long studied using continuous Markov model (CMM) with Maximum likelihood estimation (MLE). However, a lot of traditional methods assume no observation error of the imaging data, which is unlikely and can bias the estimation of the MST. In addition, the MLE may not be stably estimated when data is sparse. Addressing these shortcomings, we present a probabilistic modeling approach for periodic cancer screening data. We first model the cancer state transition using a three state CMM model, while simultaneously considering observation error. We then jointly estimate the MST and observation error within a Bayesian framework. We also consider the inclusion of covariates to estimate individualized rates of disease progression. Our approach is demonstrated on participants who underwent chest x-ray screening in the National Lung Screening Trial (NLST) and validated using posterior predictive p-values and Pearson's chi-square test. Our model demonstrates more accurate and sensible estimates of MST in comparison to MLE. Copyright © 2016 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Leventhal, Brian C.; Stone, Clement A.
2018-01-01
Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…
Harrigan, George G; Harrison, Jay M
2012-01-01
New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kara G. Eby
2010-08-01
At the Idaho National Laboratory (INL) Cs-137 concentrations above the U.S. Environmental Protection Agency risk-based threshold of 0.23 pCi/g may increase the risk of human mortality due to cancer. As a leader in nuclear research, the INL has been conducting nuclear activities for decades. Elevated anthropogenic radionuclide levels including Cs-137 are a result of atmospheric weapons testing, the Chernobyl accident, and nuclear activities occurring at the INL site. Therefore environmental monitoring and long-term surveillance of Cs-137 is required to evaluate risk. However, due to the large land area involved, frequent and comprehensive monitoring is limited. Developing a spatial model thatmore » predicts Cs-137 concentrations at unsampled locations will enhance the spatial characterization of Cs-137 in surface soils, provide guidance for an efficient monitoring program, and pinpoint areas requiring mitigation strategies. The predictive model presented herein is based on applied geostatistics using a Bayesian analysis of environmental characteristics across the INL site, which provides kriging spatial maps of both Cs-137 estimates and prediction errors. Comparisons are presented of two different kriging methods, showing that the use of secondary information (i.e., environmental characteristics) can provide improved prediction performance in some areas of the INL site.« less
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
Tree Biomass Estimation of Chinese fir (Cunninghamia lanceolata) Based on Bayesian Method
Zhang, Jianguo
2013-01-01
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass. PMID:24278198
Tree biomass estimation of Chinese fir (Cunninghamia lanceolata) based on Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo
2013-01-01
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation W = a(D2H)b was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass.
Use of limited data to construct Bayesian networks for probabilistic risk assessment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Groth, Katrina M.; Swiler, Laura Painton
2013-03-01
Probabilistic Risk Assessment (PRA) is a fundamental part of safety/quality assurance for nuclear power and nuclear weapons. Traditional PRA very effectively models complex hardware system risks using binary probabilistic models. However, traditional PRA models are not flexible enough to accommodate non-binary soft-causal factors, such as digital instrumentation&control, passive components, aging, common cause failure, and human errors. Bayesian Networks offer the opportunity to incorporate these risks into the PRA framework. This report describes the results of an early career LDRD project titled %E2%80%9CUse of Limited Data to Construct Bayesian Networks for Probabilistic Risk Assessment%E2%80%9D. The goal of the work was tomore » establish the capability to develop Bayesian Networks from sparse data, and to demonstrate this capability by producing a data-informed Bayesian Network for use in Human Reliability Analysis (HRA) as part of nuclear power plant Probabilistic Risk Assessment (PRA). This report summarizes the research goal and major products of the research.« less
NASA Astrophysics Data System (ADS)
Maiti, Saumen; Tiwari, Ram Krishna
2010-10-01
A new probabilistic approach based on the concept of Bayesian neural network (BNN) learning theory is proposed for decoding litho-facies boundaries from well-log data. We show that how a multi-layer-perceptron neural network model can be employed in Bayesian framework to classify changes in litho-log successions. The method is then applied to the German Continental Deep Drilling Program (KTB) well-log data for classification and uncertainty estimation in the litho-facies boundaries. In this framework, a posteriori distribution of network parameter is estimated via the principle of Bayesian probabilistic theory, and an objective function is minimized following the scaled conjugate gradient optimization scheme. For the model development, we inflict a suitable criterion, which provides probabilistic information by emulating different combinations of synthetic data. Uncertainty in the relationship between the data and the model space is appropriately taken care by assuming a Gaussian a priori distribution of networks parameters (e.g., synaptic weights and biases). Prior to applying the new method to the real KTB data, we tested the proposed method on synthetic examples to examine the sensitivity of neural network hyperparameters in prediction. Within this framework, we examine stability and efficiency of this new probabilistic approach using different kinds of synthetic data assorted with different level of correlated noise. Our data analysis suggests that the designed network topology based on the Bayesian paradigm is steady up to nearly 40% correlated noise; however, adding more noise (˜50% or more) degrades the results. We perform uncertainty analyses on training, validation, and test data sets with and devoid of intrinsic noise by making the Gaussian approximation of the a posteriori distribution about the peak model. We present a standard deviation error-map at the network output corresponding to the three types of the litho-facies present over the entire litho-section of the KTB. The comparisons of maximum a posteriori geological sections constructed here, based on the maximum a posteriori probability distribution, with the available geological information and the existing geophysical findings suggest that the BNN results reveal some additional finer details in the KTB borehole data at certain depths, which appears to be of some geological significance. We also demonstrate that the proposed BNN approach is superior to the conventional artificial neural network in terms of both avoiding "over-fitting" and aiding uncertainty estimation, which are vital for meaningful interpretation of geophysical records. Our analyses demonstrate that the BNN-based approach renders a robust means for the classification of complex changes in the litho-facies successions and thus could provide a useful guide for understanding the crustal inhomogeneity and the structural discontinuity in many other tectonically complex regions.
Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.
2009-01-01
A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, student-t, slash and the variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on S&P500 index. Bayesian model selection criteria as well as out-of- sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043
Application of a predictive Bayesian model to environmental accounting.
Anex, R P; Englehardt, J D
2001-03-30
Environmental accounting techniques are intended to capture important environmental costs and benefits that are often overlooked in standard accounting practices. Environmental accounting methods themselves often ignore or inadequately represent large but highly uncertain environmental costs and costs conditioned by specific prior events. Use of a predictive Bayesian model is demonstrated for the assessment of such highly uncertain environmental and contingent costs. The predictive Bayesian approach presented generates probability distributions for the quantity of interest (rather than parameters thereof). A spreadsheet implementation of a previously proposed predictive Bayesian model, extended to represent contingent costs, is described and used to evaluate whether a firm should undertake an accelerated phase-out of its PCB containing transformers. Variability and uncertainty (due to lack of information) in transformer accident frequency and severity are assessed simultaneously using a combination of historical accident data, engineering model-based cost estimates, and subjective judgement. Model results are compared using several different risk measures. Use of the model for incorporation of environmental risk management into a company's overall risk management strategy is discussed.
A Bayesian Framework for Human Body Pose Tracking from Depth Image Sequences
Zhu, Youding; Fujimura, Kikuo
2010-01-01
This paper addresses the problem of accurate and robust tracking of 3D human body pose from depth image sequences. Recovering the large number of degrees of freedom in human body movements from a depth image sequence is challenging due to the need to resolve the depth ambiguity caused by self-occlusions and the difficulty to recover from tracking failure. Human body poses could be estimated through model fitting using dense correspondences between depth data and an articulated human model (local optimization method). Although it usually achieves a high accuracy due to dense correspondences, it may fail to recover from tracking failure. Alternately, human pose may be reconstructed by detecting and tracking human body anatomical landmarks (key-points) based on low-level depth image analysis. While this method (key-point based method) is robust and recovers from tracking failure, its pose estimation accuracy depends solely on image-based localization accuracy of key-points. To address these limitations, we present a flexible Bayesian framework for integrating pose estimation results obtained by methods based on key-points and local optimization. Experimental results are shown and performance comparison is presented to demonstrate the effectiveness of the proposed approach. PMID:22399933
Bayesian modelling of uncertainties of Monte Carlo radiative-transfer simulations
NASA Astrophysics Data System (ADS)
Beaujean, Frederik; Eggers, Hans C.; Kerzendorf, Wolfgang E.
2018-04-01
One of the big challenges in astrophysics is the comparison of complex simulations to observations. As many codes do not directly generate observables (e.g. hydrodynamic simulations), the last step in the modelling process is often a radiative-transfer treatment. For this step, the community relies increasingly on Monte Carlo radiative transfer due to the ease of implementation and scalability with computing power. We show how to estimate the statistical uncertainty given the output of just a single radiative-transfer simulation in which the number of photon packets follows a Poisson distribution and the weight (e.g. energy or luminosity) of a single packet may follow an arbitrary distribution. Our Bayesian approach produces a posterior distribution that is valid for any number of packets in a bin, even zero packets, and is easy to implement in practice. Our analytic results for large number of packets show that we generalise existing methods that are valid only in limiting cases. The statistical problem considered here appears in identical form in a wide range of Monte Carlo simulations including particle physics and importance sampling. It is particularly powerful in extracting information when the available data are sparse or quantities are small.
Bayesian modelling of uncertainties of Monte Carlo radiative-transfer simulations
NASA Astrophysics Data System (ADS)
Beaujean, Frederik; Eggers, Hans C.; Kerzendorf, Wolfgang E.
2018-07-01
One of the big challenges in astrophysics is the comparison of complex simulations to observations. As many codes do not directly generate observables (e.g. hydrodynamic simulations), the last step in the modelling process is often a radiative-transfer treatment. For this step, the community relies increasingly on Monte Carlo radiative transfer due to the ease of implementation and scalability with computing power. We consider simulations in which the number of photon packets is Poisson distributed, while the weight assigned to a single photon packet follows any distribution of choice. We show how to estimate the statistical uncertainty of the sum of weights in each bin from the output of a single radiative-transfer simulation. Our Bayesian approach produces a posterior distribution that is valid for any number of packets in a bin, even zero packets, and is easy to implement in practice. Our analytic results for large number of packets show that we generalize existing methods that are valid only in limiting cases. The statistical problem considered here appears in identical form in a wide range of Monte Carlo simulations including particle physics and importance sampling. It is particularly powerful in extracting information when the available data are sparse or quantities are small.
Accounting for uncertainty in health economic decision models by using model averaging
Jackson, Christopher H; Thompson, Simon G; Sharples, Linda D
2009-01-01
Health economic decision models are subject to considerable uncertainty, much of which arises from choices between several plausible model structures, e.g. choices of covariates in a regression model. Such structural uncertainty is rarely accounted for formally in decision models but can be addressed by model averaging. We discuss the most common methods of averaging models and the principles underlying them. We apply them to a comparison of two surgical techniques for repairing abdominal aortic aneurysms. In model averaging, competing models are usually either weighted by using an asymptotically consistent model assessment criterion, such as the Bayesian information criterion, or a measure of predictive ability, such as Akaike's information criterion. We argue that the predictive approach is more suitable when modelling the complex underlying processes of interest in health economics, such as individual disease progression and response to treatment. PMID:19381329
NASA Astrophysics Data System (ADS)
Oladyshkin, S.; Schroeder, P.; Class, H.; Nowak, W.
2013-12-01
Predicting underground carbon dioxide (CO2) storage represents a challenging problem in a complex dynamic system. Due to lacking information about reservoir parameters, quantification of uncertainties may become the dominant question in risk assessment. Calibration on past observed data from pilot-scale test injection can improve the predictive power of the involved geological, flow, and transport models. The current work performs history matching to pressure time series from a pilot storage site operated in Europe, maintained during an injection period. Simulation of compressible two-phase flow and transport (CO2/brine) in the considered site is computationally very demanding, requiring about 12 days of CPU time for an individual model run. For that reason, brute-force approaches for calibration are not feasible. In the current work, we explore an advanced framework for history matching based on the arbitrary polynomial chaos expansion (aPC) and strict Bayesian principles. The aPC [1] offers a drastic but accurate stochastic model reduction. Unlike many previous chaos expansions, it can handle arbitrary probability distribution shapes of uncertain parameters, and can therefore handle directly the statistical information appearing during the matching procedure. We capture the dependence of model output on these multipliers with the expansion-based reduced model. In our study we keep the spatial heterogeneity suggested by geophysical methods, but consider uncertainty in the magnitude of permeability trough zone-wise permeability multipliers. Next combined the aPC with Bootstrap filtering (a brute-force but fully accurate Bayesian updating mechanism) in order to perform the matching. In comparison to (Ensemble) Kalman Filters, our method accounts for higher-order statistical moments and for the non-linearity of both the forward model and the inversion, and thus allows a rigorous quantification of calibrated model uncertainty. The usually high computational costs of accurate filtering become very feasible for our suggested aPC-based calibration framework. However, the power of aPC-based Bayesian updating strongly depends on the accuracy of prior information. In the current study, the prior assumptions on the model parameters were not satisfactory and strongly underestimate the reservoir pressure. Thus, the aPC-based response surface used in Bootstrap filtering is fitted to a distant and poorly chosen region within the parameter space. Thanks to the iterative procedure suggested in [2] we overcome this drawback with small computational costs. The iteration successively improves the accuracy of the expansion around the current estimation of the posterior distribution. The final result is a calibrated model of the site that can be used for further studies, with an excellent match to the data. References [1] Oladyshkin S. and Nowak W. Data-driven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliability Engineering and System Safety, 106:179-190, 2012. [2] Oladyshkin S., Class H., Nowak W. Bayesian updating via Bootstrap filtering combined with data-driven polynomial chaos expansions: methodology and application to history matching for carbon dioxide storage in geological formations. Computational Geosciences, 17 (4), 671-687, 2013.
Adam-Poupart, Ariane; Brand, Allan; Fournier, Michel; Jerrett, Michael; Smargiassi, Audrey
2014-09-01
Ambient air ozone (O3) is a pulmonary irritant that has been associated with respiratory health effects including increased lung inflammation and permeability, airway hyperreactivity, respiratory symptoms, and decreased lung function. Estimation of O3 exposure is a complex task because the pollutant exhibits complex spatiotemporal patterns. To refine the quality of exposure estimation, various spatiotemporal methods have been developed worldwide. We sought to compare the accuracy of three spatiotemporal models to predict summer ground-level O3 in Quebec, Canada. We developed a land-use mixed-effects regression (LUR) model based on readily available data (air quality and meteorological monitoring data, road networks information, latitude), a Bayesian maximum entropy (BME) model incorporating both O3 monitoring station data and the land-use mixed model outputs (BME-LUR), and a kriging method model based only on available O3 monitoring station data (BME kriging). We performed leave-one-station-out cross-validation and visually assessed the predictive capability of each model by examining the mean temporal and spatial distributions of the average estimated errors. The BME-LUR was the best predictive model (R2 = 0.653) with the lowest root mean-square error (RMSE ;7.06 ppb), followed by the LUR model (R2 = 0.466, RMSE = 8.747) and the BME kriging model (R2 = 0.414, RMSE = 9.164). Our findings suggest that errors of estimation in the interpolation of O3 concentrations with BME can be greatly reduced by incorporating outputs from a LUR model developed with readily available data.
A Bayesian modification to the Jelinski-Moranda software reliability growth model
NASA Technical Reports Server (NTRS)
Littlewood, B.; Sofer, A.
1983-01-01
The Jelinski-Moranda (JM) model for software reliability was examined. It is suggested that a major reason for the poor results given by this model is the poor performance of the maximum likelihood method (ML) of parameter estimation. A reparameterization and Bayesian analysis, involving a slight modelling change, are proposed. It is shown that this new Bayesian-Jelinski-Moranda model (BJM) is mathematically quite tractable, and several metrics of interest to practitioners are obtained. The BJM and JM models are compared by using several sets of real software failure data collected and in all cases the BJM model gives superior reliability predictions. A change in the assumption which underlay both models to present the debugging process more accurately is discussed.
Bayesian spatiotemporal crash frequency models with mixture components for space-time interactions.
Cheng, Wen; Gill, Gurdiljot Singh; Zhang, Yongping; Cao, Zhong
2018-03-01
The traffic safety research has developed spatiotemporal models to explore the variations in the spatial pattern of crash risk over time. Many studies observed notable benefits associated with the inclusion of spatial and temporal correlation and their interactions. However, the safety literature lacks sufficient research for the comparison of different temporal treatments and their interaction with spatial component. This study developed four spatiotemporal models with varying complexity due to the different temporal treatments such as (I) linear time trend; (II) quadratic time trend; (III) Autoregressive-1 (AR-1); and (IV) time adjacency. Moreover, the study introduced a flexible two-component mixture for the space-time interaction which allows greater flexibility compared to the traditional linear space-time interaction. The mixture component allows the accommodation of global space-time interaction as well as the departures from the overall spatial and temporal risk patterns. This study performed a comprehensive assessment of mixture models based on the diverse criteria pertaining to goodness-of-fit, cross-validation and evaluation based on in-sample data for predictive accuracy of crash estimates. The assessment of model performance in terms of goodness-of-fit clearly established the superiority of the time-adjacency specification which was evidently more complex due to the addition of information borrowed from neighboring years, but this addition of parameters allowed significant advantage at posterior deviance which subsequently benefited overall fit to crash data. The Base models were also developed to study the comparison between the proposed mixture and traditional space-time components for each temporal model. The mixture models consistently outperformed the corresponding Base models due to the advantages of much lower deviance. For cross-validation comparison of predictive accuracy, linear time trend model was adjudged the best as it recorded the highest value of log pseudo marginal likelihood (LPML). Four other evaluation criteria were considered for typical validation using the same data for model development. Under each criterion, observed crash counts were compared with three types of data containing Bayesian estimated, normal predicted, and model replicated ones. The linear model again performed the best in most scenarios except one case of using model replicated data and two cases involving prediction without including random effects. These phenomena indicated the mediocre performance of linear trend when random effects were excluded for evaluation. This might be due to the flexible mixture space-time interaction which can efficiently absorb the residual variability escaping from the predictable part of the model. The comparison of Base and mixture models in terms of prediction accuracy further bolstered the superiority of the mixture models as the mixture ones generated more precise estimated crash counts across all four models, suggesting that the advantages associated with mixture component at model fit were transferable to prediction accuracy. Finally, the residual analysis demonstrated the consistently superior performance of random effect models which validates the importance of incorporating the correlation structures to account for unobserved heterogeneity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
Climatic Models Ensemble-based Mid-21st Century Runoff Projections: A Bayesian Framework
NASA Astrophysics Data System (ADS)
Achieng, K. O.; Zhu, J.
2017-12-01
There are a number of North American Regional Climate Change Assessment Program (NARCCAP) climatic models that have been used to project surface runoff in the mid-21st century. Statistical model selection techniques are often used to select the model that best fits data. However, model selection techniques often lead to different conclusions. In this study, ten models are averaged in Bayesian paradigm to project runoff. Bayesian Model Averaging (BMA) is used to project and identify effect of model uncertainty on future runoff projections. Baseflow separation - a two-digital filter which is also called Eckhardt filter - is used to separate USGS streamflow (total runoff) into two components: baseflow and surface runoff. We use this surface runoff as the a priori runoff when conducting BMA of runoff simulated from the ten RCM models. The primary objective of this study is to evaluate how well RCM multi-model ensembles simulate surface runoff, in a Bayesian framework. Specifically, we investigate and discuss the following questions: How well do ten RCM models ensemble jointly simulate surface runoff by averaging over all the models using BMA, given a priori surface runoff? What are the effects of model uncertainty on surface runoff simulation?
Bayesian state space models for dynamic genetic network construction across multiple tissues.
Liang, Yulan; Kelemen, Arpad
2016-08-01
Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases while estimating the dynamic changes of the temporal correlations and non-stationarity are the keys in this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Monte Carlo Markov Chain and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes.
Number-Knower Levels in Young Children: Insights from Bayesian Modeling
ERIC Educational Resources Information Center
Lee, Michael D.; Sarnecka, Barbara W.
2011-01-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One,…
Uwano, Ikuko; Sasaki, Makoto; Kudo, Kohsuke; Boutelier, Timothé; Kameda, Hiroyuki; Mori, Futoshi; Yamashita, Fumio
2017-01-10
The Bayesian estimation algorithm improves the precision of bolus tracking perfusion imaging. However, this algorithm cannot directly calculate Tmax, the time scale widely used to identify ischemic penumbra, because Tmax is a non-physiological, artificial index that reflects the tracer arrival delay (TD) and other parameters. We calculated Tmax from the TD and mean transit time (MTT) obtained by the Bayesian algorithm and determined its accuracy in comparison with Tmax obtained by singular value decomposition (SVD) algorithms. The TD and MTT maps were generated by the Bayesian algorithm applied to digital phantoms with time-concentration curves that reflected a range of values for various perfusion metrics using a global arterial input function. Tmax was calculated from the TD and MTT using constants obtained by a linear least-squares fit to Tmax obtained from the two SVD algorithms that showed the best benchmarks in a previous study. Correlations between the Tmax values obtained by the Bayesian and SVD methods were examined. The Bayesian algorithm yielded accurate TD and MTT values relative to the true values of the digital phantom. Tmax calculated from the TD and MTT values with the least-squares fit constants showed excellent correlation (Pearson's correlation coefficient = 0.99) and agreement (intraclass correlation coefficient = 0.99) with Tmax obtained from SVD algorithms. Quantitative analyses of Tmax values calculated from Bayesian-estimation algorithm-derived TD and MTT from a digital phantom correlated and agreed well with Tmax values determined using SVD algorithms.
Theory-based Bayesian Models of Inductive Inference
2010-07-19
Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review . http://cocosci.berkeley.edu/tom/papers/randscenes.pdf Page 1...in press). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review . http://cocosci.berkeley.edu/tom
Modeling Error Distributions of Growth Curve Models through Bayesian Methods
ERIC Educational Resources Information Center
Zhang, Zhiyong
2016-01-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is…
Bayesian Semiparametric Structural Equation Models with Latent Variables
ERIC Educational Resources Information Center
Yang, Mingan; Dunson, David B.
2010-01-01
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In…
Dynamic Bayesian Networks for Student Modeling
ERIC Educational Resources Information Center
Kaser, Tanja; Klingler, Severin; Schwing, Alexander G.; Gross, Markus
2017-01-01
Intelligent tutoring systems adapt the curriculum to the needs of the individual student. Therefore, an accurate representation and prediction of student knowledge is essential. Bayesian Knowledge Tracing (BKT) is a popular approach for student modeling. The structure of BKT models, however, makes it impossible to represent the hierarchy and…
Sakr, Sherif; Elshawi, Radwa; Ahmed, Amjad M; Qureshi, Waqas T; Brawner, Clinton A; Keteyian, Steven J; Blaha, Michael J; Al-Mallah, Mouaz H
2017-12-19
Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of this study is to present an evaluation and comparison of how machine learning techniques can be applied on medical records of cardiorespiratory fitness and how the various techniques differ in terms of capabilities of predicting medical outcomes (e.g. mortality). We use data of 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems Between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). In order to handle the imbalanced dataset used, the Synthetic Minority Over-Sampling Technique (SMOTE) is used. Two set of experiments have been conducted with and without the SMOTE sampling technique. On average over different evaluation metrics, SVM Classifier has shown the lowest performance while other models like BN, BC and DT performed better. The RF classifier has shown the best performance (AUC = 0.97) among all models trained using the SMOTE sampling. The results show that various ML techniques can significantly vary in terms of its performance for the different evaluation metrics. It is also not necessarily that the more complex the ML model, the more prediction accuracy can be achieved. The prediction performance of all models trained with SMOTE is much better than the performance of models trained without SMOTE. The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.
Cross-country transferability of multi-variable damage models
NASA Astrophysics Data System (ADS)
Wagenaar, Dennis; Lüdtke, Stefan; Kreibich, Heidi; Bouwer, Laurens
2017-04-01
Flood damage assessment is often done with simple damage curves based only on flood water depth. Additionally, damage models are often transferred in space and time, e.g. from region to region or from one flood event to another. Validation has shown that depth-damage curve estimates are associated with high uncertainties, particularly when applied in regions outside the area where the data for curve development was collected. Recently, progress has been made with multi-variable damage models created with data-mining techniques, i.e. Bayesian Networks and random forest. However, it is still unknown to what extent and under which conditions model transfers are possible and reliable. Model validations in different countries will provide valuable insights into the transferability of multi-variable damage models. In this study we compare multi-variable models developed on basis of flood damage datasets from Germany as well as from The Netherlands. Data from several German floods was collected using computer aided telephone interviews. Data from the 1993 Meuse flood in the Netherlands is available, based on compensations paid by the government. The Bayesian network and random forest based models are applied and validated in both countries on basis of the individual datasets. A major challenge was the harmonization of the variables between both datasets due to factors like differences in variable definitions, and regional and temporal differences in flood hazard and exposure characteristics. Results of model validations and comparisons in both countries are discussed, particularly in respect to encountered challenges and possible solutions for an improvement of model transferability.
A Bayesian context fear learning algorithm/automaton
Krasne, Franklin B.; Cushman, Jesse D.; Fanselow, Michael S.
2015-01-01
Contextual fear conditioning is thought to involve the synaptic plasticity-dependent establishment in hippocampus of representations of to-be-conditioned contexts which can then become associated with USs in the amygdala. A conceptual and computational model of this process is proposed in which contextual attributes are assumed to be sampled serially and randomly during contextual exposures. Given this assumption, moment-to-moment information about such attributes will often be quite different from one exposure to another and, in particular, between exposures during which representations are created, exposures during which conditioning occurs, and during recall sessions. This presents challenges to current conceptual models of hippocampal function. In order to meet these challenges, our model's hippocampus was made to operate in different modes during representation creation and recall, and non-hippocampal machinery was constructed that controlled these hippocampal modes. This machinery uses a comparison between contextual information currently observed and information associated with existing hippocampal representations of familiar contexts to compute the Bayesian Weight of Evidence that the current context is (or is not) a known one, and it uses this value to assess the appropriateness of creation or recall modes. The model predicts a number of known phenomena such as the immediate shock deficit, spurious fear conditioning to contexts that are absent but similar to actually present ones, and modulation of conditioning by pre-familiarization with contexts. It also predicts a number of as yet unknown phenomena. PMID:26074792
Using Bayesian analysis in repeated preclinical in vivo studies for a more effective use of animals.
Walley, Rosalind; Sherington, John; Rastrick, Joe; Detrait, Eric; Hanon, Etienne; Watt, Gillian
2016-05-01
Whilst innovative Bayesian approaches are increasingly used in clinical studies, in the preclinical area Bayesian methods appear to be rarely used in the reporting of pharmacology data. This is particularly surprising in the context of regularly repeated in vivo studies where there is a considerable amount of data from historical control groups, which has potential value. This paper describes our experience with introducing Bayesian analysis for such studies using a Bayesian meta-analytic predictive approach. This leads naturally either to an informative prior for a control group as part of a full Bayesian analysis of the next study or using a predictive distribution to replace a control group entirely. We use quality control charts to illustrate study-to-study variation to the scientists and describe informative priors in terms of their approximate effective numbers of animals. We describe two case studies of animal models: the lipopolysaccharide-induced cytokine release model used in inflammation and the novel object recognition model used to screen cognitive enhancers, both of which show the advantage of a Bayesian approach over the standard frequentist analysis. We conclude that using Bayesian methods in stable repeated in vivo studies can result in a more effective use of animals, either by reducing the total number of animals used or by increasing the precision of key treatment differences. This will lead to clearer results and supports the "3Rs initiative" to Refine, Reduce and Replace animals in research. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Minunno, Francesco; Peltoniemi, Mikko; Launiainen, Samuli; Mäkelä, Annikki
2014-05-01
Biogeochemical models quantify the material and energy flux exchanges between biosphere, atmosphere and soil, however there is still considerable uncertainty underpinning model structure and parametrization. The increasing availability of data from of multiple sources provides useful information for model calibration and validation at different space and time scales. We calibrated a simplified ecosystem process model PRELES to data from multiple sites. In this work we had the following objective: to compare a multi-site calibration and site-specific calibrations, in order to test if PRELES is a model of general applicability, and to test how well one parameterization can predict ecosystem fluxes. Model calibration and evaluation were carried out by the means of the Bayesian method; Bayesian calibration (BC) and Bayesian model comparison (BMC) were used to quantify the uncertainty in model parameters and model structure. Evapotranspiration (ET) and gross primary production (GPP) measurements collected in 9 sites of Finland and Sweden were used in the study; half dataset was used for model calibrations and half for the comparative analyses. 10 BCs were performed; the model was independently calibrated for each of the nine sites (site-specific calibrations) and a multi-site calibration was achieved using the data from all the sites in one BC. Then 9 BMCs were carried out, one for each site, using output from the multi-site and the site-specific versions of PRELES. Similar estimates were obtained for the parameters at which model outputs are most sensitive. Not surprisingly, the joint posterior distribution achieved through the multi-site calibration was characterized by lower uncertainty, because more data were involved in the calibration process. No significant differences were encountered in the prediction of the multi-site and site-specific versions of PRELES, and after BMC, we concluded that the model can be reliably used at regional scale to simulate carbon and water fluxes of Boreal forests. Despite being a simple model, PRELES provided good estimates of GPP and ET; only for one site PRELES multi-site version underestimated water fluxes. Our study implies convergence of GPP and water processes in boreal zone to the extent that their plausible prediction is possible with a simple model using global parameterization.
Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel
2012-09-25
Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.
2012-01-01
Background Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363
Bayesian survival analysis in clinical trials: What methods are used in practice?
Brard, Caroline; Le Teuff, Gwénaël; Le Deley, Marie-Cécile; Hampson, Lisa V
2017-02-01
Background Bayesian statistics are an appealing alternative to the traditional frequentist approach to designing, analysing, and reporting of clinical trials, especially in rare diseases. Time-to-event endpoints are widely used in many medical fields. There are additional complexities to designing Bayesian survival trials which arise from the need to specify a model for the survival distribution. The objective of this article was to critically review the use and reporting of Bayesian methods in survival trials. Methods A systematic review of clinical trials using Bayesian survival analyses was performed through PubMed and Web of Science databases. This was complemented by a full text search of the online repositories of pre-selected journals. Cost-effectiveness, dose-finding studies, meta-analyses, and methodological papers using clinical trials were excluded. Results In total, 28 articles met the inclusion criteria, 25 were original reports of clinical trials and 3 were re-analyses of a clinical trial. Most trials were in oncology (n = 25), were randomised controlled (n = 21) phase III trials (n = 13), and half considered a rare disease (n = 13). Bayesian approaches were used for monitoring in 14 trials and for the final analysis only in 14 trials. In the latter case, Bayesian survival analyses were used for the primary analysis in four cases, for the secondary analysis in seven cases, and for the trial re-analysis in three cases. Overall, 12 articles reported fitting Bayesian regression models (semi-parametric, n = 3; parametric, n = 9). Prior distributions were often incompletely reported: 20 articles did not define the prior distribution used for the parameter of interest. Over half of the trials used only non-informative priors for monitoring and the final analysis (n = 12) when it was specified. Indeed, no articles fitting Bayesian regression models placed informative priors on the parameter of interest. The prior for the treatment effect was based on historical data in only four trials. Decision rules were pre-defined in eight cases when trials used Bayesian monitoring, and in only one case when trials adopted a Bayesian approach to the final analysis. Conclusion Few trials implemented a Bayesian survival analysis and few incorporated external data into priors. There is scope to improve the quality of reporting of Bayesian methods in survival trials. Extension of the Consolidated Standards of Reporting Trials statement for reporting Bayesian clinical trials is recommended.
Wijeysundera, Duminda N; Austin, Peter C; Hux, Janet E; Beattie, W Scott; Laupacis, Andreas
2009-01-01
Randomized trials generally use "frequentist" statistics based on P-values and 95% confidence intervals. Frequentist methods have limitations that might be overcome, in part, by Bayesian inference. To illustrate these advantages, we re-analyzed randomized trials published in four general medical journals during 2004. We used Medline to identify randomized superiority trials with two parallel arms, individual-level randomization and dichotomous or time-to-event primary outcomes. Studies with P<0.05 in favor of the intervention were deemed "positive"; otherwise, they were "negative." We used several prior distributions and exact conjugate analyses to calculate Bayesian posterior probabilities for clinically relevant effects. Of 88 included studies, 39 were positive using a frequentist analysis. Although the Bayesian posterior probabilities of any benefit (relative risk or hazard ratio<1) were high in positive studies, these probabilities were lower and variable for larger benefits. The positive studies had only moderate probabilities for exceeding the effects that were assumed for calculating the sample size. By comparison, there were moderate probabilities of any benefit in negative studies. Bayesian and frequentist analyses complement each other when interpreting the results of randomized trials. Future reports of randomized trials should include both.
Benchmarking for Bayesian Reinforcement Learning
Ernst, Damien; Couëtoux, Adrien
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
Benchmarking for Bayesian Reinforcement Learning.
Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael
2016-01-01
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
A Bayesian Approach for Analyzing Longitudinal Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum
2011-01-01
This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…
Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17
ERIC Educational Resources Information Center
Sinharay, Sandip
2004-01-01
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ng, B
This survey gives an overview of popular generative models used in the modeling of stochastic temporal systems. In particular, this survey is organized into two parts. The first part discusses the discrete-time representations of dynamic Bayesian networks and dynamic relational probabilistic models, while the second part discusses the continuous-time representation of continuous-time Bayesian networks.
Bayesian Estimation of the DINA Model with Gibbs Sampling
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2015-01-01
A Bayesian model formulation of the deterministic inputs, noisy "and" gate (DINA) model is presented. Gibbs sampling is employed to simulate from the joint posterior distribution of item guessing and slipping parameters, subject attribute parameters, and latent class probabilities. The procedure extends concepts in Béguin and Glas,…
Semisupervised learning using Bayesian interpretation: application to LS-SVM.
Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain
2011-04-01
Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
Using Bayesian belief networks in adaptive management.
J.B. Nyberg; B.G. Marcot; R. Sulyma
2006-01-01
Bayesian belief and decision networks are relatively new modeling methods that are especially well suited to adaptive-management applications, but they appear not to have been widely used in adaptive management to date. Bayesian belief networks (BBNs) can serve many purposes for practioners of adaptive management, from illustrating system relations conceptually to...
Prajapati, Kalp Bhusan; Singh, Rajesh
2018-05-10
In present study batch tests were performed to investigate the enhancement in methane production under bio-electrolysis anaerobic co-digestion of sewage sludge and food waste. The bio-electrolysis reactor system (B-EL) yield more methane 148.5 ml/g COD in comparison to reactor system without bio-electrolysis (B-CONT) 125.1 ml/g COD. Whereas bio-electrolysis reactor system (C-EL) Iron Scraps amended yield lesser methane (51.2 ml/g COD) in comparison to control bio-electrolysis reactor system without Iron scraps (C-CONT - 114.4 ml/g COD). Richard and Exponential model were best fitted for cumulative methane production and biogas production rates respectively as revealed modelling study. The best model fit for the different reactors was compared by Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC). The bioelectrolysis process seems to be an emerging technology with lesser the loss in cellulase specific activity with increasing temperature from 50 to 80 °C. Copyright © 2018 Elsevier Ltd. All rights reserved.
Gu, Weidong; Medalla, Felicita; Hoekstra, Robert M
2018-02-01
The National Antimicrobial Resistance Monitoring System (NARMS) at the Centers for Disease Control and Prevention tracks resistance among Salmonella infections. The annual number of Salmonella isolates of a particular serotype from states may be small, making direct estimation of resistance proportions unreliable. We developed a Bayesian hierarchical model to improve estimation by borrowing strength from relevant sampling units. We illustrate the models with different specifications of spatio-temporal interaction using 2004-2013 NARMS data for ceftriaxone-resistant Salmonella serotype Heidelberg. Our results show that Bayesian estimates of resistance proportions were smoother than observed values, and the difference between predicted and observed proportions was inversely related to the number of submitted isolates. The model with interaction allowed for tracking of annual changes in resistance proportions at the state level. We demonstrated that Bayesian hierarchical models provide a useful tool to examine spatio-temporal patterns of small sample size such as those found in NARMS. Published by Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
Hayes, Brett K; Heit, Evan
2018-05-01
Inductive reasoning entails using existing knowledge to make predictions about novel cases. The first part of this review summarizes key inductive phenomena and critically evaluates theories of induction. We highlight recent theoretical advances, with a special emphasis on the structured statistical approach, the importance of sampling assumptions in Bayesian models, and connectionist modeling. A number of new research directions in this field are identified including comparisons of inductive and deductive reasoning, the identification of common core processes in induction and memory tasks and induction involving category uncertainty. The implications of induction research for areas as diverse as complex decision-making and fear generalization are discussed. This article is categorized under: Psychology > Reasoning and Decision Making Psychology > Learning. © 2017 Wiley Periodicals, Inc.
Mental health assessment: Inference, explanation, and coherence.
Thagard, Paul; Larocque, Laurette
2018-06-01
Mental health professionals such as psychiatrists and psychotherapists assess their patients by identifying disorders that explain their symptoms. This assessment requires an inference to the best explanation that compares different disorders with respect to how well they explain the available evidence. Such comparisons are captured by the theory of explanatory coherence that states 7 principles for evaluating competing hypotheses in the light of evidence. The computational model ECHO shows how explanatory coherence can be efficiently computed. We show the applicability of explanatory coherence to mental health assessment by modelling a case of psychiatric interviewing and a case of psychotherapeutic evaluation. We argue that this approach is more plausible than Bayesian inference and hermeneutic interpretation. © 2018 John Wiley & Sons, Ltd.
Bayesian parameter estimation for the Wnt pathway: an infinite mixture models approach.
Koutroumpas, Konstantinos; Ballarini, Paolo; Votsi, Irene; Cournède, Paul-Henry
2016-09-01
Likelihood-free methods, like Approximate Bayesian Computation (ABC), have been extensively used in model-based statistical inference with intractable likelihood functions. When combined with Sequential Monte Carlo (SMC) algorithms they constitute a powerful approach for parameter estimation and model selection of mathematical models of complex biological systems. A crucial step in the ABC-SMC algorithms, significantly affecting their performance, is the propagation of a set of parameter vectors through a sequence of intermediate distributions using Markov kernels. In this article, we employ Dirichlet process mixtures (DPMs) to design optimal transition kernels and we present an ABC-SMC algorithm with DPM kernels. We illustrate the use of the proposed methodology using real data for the canonical Wnt signaling pathway. A multi-compartment model of the pathway is developed and it is compared to an existing model. The results indicate that DPMs are more efficient in the exploration of the parameter space and can significantly improve ABC-SMC performance. In comparison to alternative sampling schemes that are commonly used, the proposed approach can bring potential benefits in the estimation of complex multimodal distributions. The method is used to estimate the parameters and the initial state of two models of the Wnt pathway and it is shown that the multi-compartment model fits better the experimental data. Python scripts for the Dirichlet Process Gaussian Mixture model and the Gibbs sampler are available at https://sites.google.com/site/kkoutroumpas/software konstantinos.koutroumpas@ecp.fr. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A fresh look at the Last Glacial Maximum using Paleoclimate Data Assimilation
NASA Astrophysics Data System (ADS)
Malevich, S. B.; Tierney, J. E.; Hakim, G. J.; Tardif, R.
2017-12-01
Quantifying climate conditions during the Last Glacial Maximum ( 21ka) can help us to understand climate responses to forcing and climate states that are poorly represented in the instrumental record. Paleoclimate proxies may be used to estimate these climate conditions, but proxies are sparsely distributed and possess uncertainties from environmental and biogeochemical processes. Alternatively, climate model simulations provide a full-field view, but may predict unrealistic climate states or states not faithful to proxy records. Here, we use data assimilation - combining climate proxy records with a theoretical understanding from climate models - to produce field reconstructions of the LGM that leverage the information from both data and models. To date, data assimilation has mainly been used to produce reconstructions of climate fields through the last millennium. We expand this approach in order to produce a climate fields for the Last Glacial Maximum using an ensemble Kalman filter assimilation. Ensemble samples were formed from output from multiple models including CCSM3, CESM2.1, and HadCM3. These model simulations are combined with marine sediment proxies for upper ocean temperature (TEX86, UK'37, Mg/Ca and δ18O of foraminifera), utilizing forward models based on a newly developed suite of Bayesian proxy system models. We also incorporate age model and radiocarbon reservoir uncertainty into our reconstructions using Bayesian age modeling software. The resulting fields show familiar patterns based on comparison with previous proxy-based reconstructions, but additionally reveal novel patterns of large-scale shifts in ocean-atmosphere dynamics, as the surface temperature data inform upon atmospheric circulation and precipitation patterns.
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring
Carlos Carroll; Devin S. Johnson; Jeffrey R. Dunk; William J. Zielinski
2010-01-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their dataâs spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and...
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies conservation planning. Journal compilation © 2010 Society for Conservation Biology. No claim to original US government works.
Using Bayesian Stable Isotope Mixing Models to Enhance Marine Ecosystem Models
The use of stable isotopes in food web studies has proven to be a valuable tool for ecologists. We investigated the use of Bayesian stable isotope mixing models as constraints for an ecosystem model of a temperate seagrass system on the Atlantic coast of France. δ13C and δ15N i...
ERIC Educational Resources Information Center
Aslan, Burak Galip; Öztürk, Özlem; Inceoglu, Mustafa Murat
2014-01-01
Considering the increasing importance of adaptive approaches in CALL systems, this study implemented a machine learning based student modeling middleware with Bayesian networks. The profiling approach of the student modeling system is based on Felder and Silverman's Learning Styles Model and Felder and Soloman's Index of Learning Styles…
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an apriori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that the change of the true of parameter relatively to another will change the value of standard deviation in an opposite direction. For a perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Borchani, Hanen; Bielza, Concha; Martı Nez-Martı N, Pablo; Larrañaga, Pedro
2012-12-01
Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. Basically, it consists of determining the Markov blanket around each class variable using the HITON algorithm, then specifying the directionality over the MBC subgraphs. Our approach is applied to the prediction problem of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson's patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, Yeast data set, as well as on a real-world Parkinson's disease data set containing 488 patients. The experimental study, including comparison with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Ma, Jun; Yang, Gang; Du, Bo; Zhang, Liangpei
2017-06-01
A new Bayesian method named Poisson Nonnegative Matrix Factorization with Parameter Subspace Clustering Constraint (PNMF-PSCC) has been presented to extract endmembers from Hyperspectral Imagery (HSI). First, the method integrates the liner spectral mixture model with the Bayesian framework and it formulates endmember extraction into a Bayesian inference problem. Second, the Parameter Subspace Clustering Constraint (PSCC) is incorporated into the statistical program to consider the clustering of all pixels in the parameter subspace. The PSCC could enlarge differences among ground objects and helps finding endmembers with smaller spectrum divergences. Meanwhile, the PNMF-PSCC method utilizes the Poisson distribution as the prior knowledge of spectral signals to better explain the quantum nature of light in imaging spectrometer. Third, the optimization problem of PNMF-PSCC is formulated into maximizing the joint density via the Maximum A Posterior (MAP) estimator. The program is finally solved by iteratively optimizing two sub-problems via the Alternating Direction Method of Multipliers (ADMM) framework and the FURTHESTSUM initialization scheme. Five state-of-the art methods are implemented to make comparisons with the performance of PNMF-PSCC on both the synthetic and real HSI datasets. Experimental results show that the PNMF-PSCC outperforms all the five methods in Spectral Angle Distance (SAD) and Root-Mean-Square-Error (RMSE), and especially it could identify good endmembers for ground objects with smaller spectrum divergences.
BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)
We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
Bayesian conditional-independence modeling of the AIDS epidemic in England and Wales
NASA Astrophysics Data System (ADS)
Gilks, Walter R.; De Angelis, Daniela; Day, Nicholas E.
We describe the use of conditional-independence modeling, Bayesian inference and Markov chain Monte Carlo, to model and project the HIV-AIDS epidemic in homosexual/bisexual males in England and Wales. Complexity in this analysis arises through selectively missing data, indirectly observed underlying processes, and measurement error. Our emphasis is on presentation and discussion of the concepts, not on the technicalities of this analysis, which can be found elsewhere [D. De Angelis, W.R. Gilks, N.E. Day, Bayesian projection of the the acquired immune deficiency syndrome epidemic (with discussion), Applied Statistics, in press].
Bayesian estimation of seasonal course of canopy leaf area index from hyperspectral satellite data
NASA Astrophysics Data System (ADS)
Varvia, Petri; Rautiainen, Miina; Seppänen, Aku
2018-03-01
In this paper, Bayesian inversion of a physically-based forest reflectance model is investigated to estimate of boreal forest canopy leaf area index (LAI) from EO-1 Hyperion hyperspectral data. The data consist of multiple forest stands with different species compositions and structures, imaged in three phases of the growing season. The Bayesian estimates of canopy LAI are compared to reference estimates based on a spectral vegetation index. The forest reflectance model contains also other unknown variables in addition to LAI, for example leaf single scattering albedo and understory reflectance. In the Bayesian approach, these variables are estimated simultaneously with LAI. The feasibility and seasonal variation of these estimates is also examined. Credible intervals for the estimates are also calculated and evaluated. The results show that the Bayesian inversion approach is significantly better than using a comparable spectral vegetation index regression.
2011-01-01
Background Logistic random effects models are a popular tool to analyze multilevel also called hierarchical data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR, Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. Results The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. Conclusions On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain. PMID:21605357
Li, Michael; Dushoff, Jonathan; Bolker, Benjamin M
2018-07-01
Simple mechanistic epidemic models are widely used for forecasting and parameter estimation of infectious diseases based on noisy case reporting data. Despite the widespread application of models to emerging infectious diseases, we know little about the comparative performance of standard computational-statistical frameworks in these contexts. Here we build a simple stochastic, discrete-time, discrete-state epidemic model with both process and observation error and use it to characterize the effectiveness of different flavours of Bayesian Markov chain Monte Carlo (MCMC) techniques. We use fits to simulated data, where parameters (and future behaviour) are known, to explore the limitations of different platforms and quantify parameter estimation accuracy, forecasting accuracy, and computational efficiency across combinations of modeling decisions (e.g. discrete vs. continuous latent states, levels of stochasticity) and computational platforms (JAGS, NIMBLE, Stan).
Lim, Cherry; Wannapinij, Prapass; White, Lisa; Day, Nicholas P J; Cooper, Ben S; Peacock, Sharon J; Limmathurotsakul, Direk
2013-01-01
Estimates of the sensitivity and specificity for new diagnostic tests based on evaluation against a known gold standard are imprecise when the accuracy of the gold standard is imperfect. Bayesian latent class models (LCMs) can be helpful under these circumstances, but the necessary analysis requires expertise in computational programming. Here, we describe open-access web-based applications that allow non-experts to apply Bayesian LCMs to their own data sets via a user-friendly interface. Applications for Bayesian LCMs were constructed on a web server using R and WinBUGS programs. The models provided (http://mice.tropmedres.ac) include two Bayesian LCMs: the two-tests in two-population model (Hui and Walter model) and the three-tests in one-population model (Walter and Irwig model). Both models are available with simplified and advanced interfaces. In the former, all settings for Bayesian statistics are fixed as defaults. Users input their data set into a table provided on the webpage. Disease prevalence and accuracy of diagnostic tests are then estimated using the Bayesian LCM, and provided on the web page within a few minutes. With the advanced interfaces, experienced researchers can modify all settings in the models as needed. These settings include correlation among diagnostic test results and prior distributions for all unknown parameters. The web pages provide worked examples with both models using the original data sets presented by Hui and Walter in 1980, and by Walter and Irwig in 1988. We also illustrate the utility of the advanced interface using the Walter and Irwig model on a data set from a recent melioidosis study. The results obtained from the web-based applications were comparable to those published previously. The newly developed web-based applications are open-access and provide an important new resource for researchers worldwide to evaluate new diagnostic tests.
Precise Network Modeling of Systems Genetics Data Using the Bayesian Network Webserver.
Ziebarth, Jesse D; Cui, Yan
2017-01-01
The Bayesian Network Webserver (BNW, http://compbio.uthsc.edu/BNW ) is an integrated platform for Bayesian network modeling of biological datasets. It provides a web-based network modeling environment that seamlessly integrates advanced algorithms for probabilistic causal modeling and reasoning with Bayesian networks. BNW is designed for precise modeling of relatively small networks that contain less than 20 nodes. The structure learning algorithms used by BNW guarantee the discovery of the best (most probable) network structure given the data. To facilitate network modeling across multiple biological levels, BNW provides a very flexible interface that allows users to assign network nodes into different tiers and define the relationships between and within the tiers. This function is particularly useful for modeling systems genetics datasets that often consist of multiscalar heterogeneous genotype-to-phenotype data. BNW enables users to, within seconds or minutes, go from having a simply formatted input file containing a dataset to using a network model to make predictions about the interactions between variables and the potential effects of experimental interventions. In this chapter, we will introduce the functions of BNW and show how to model systems genetics datasets with BNW.
Ahn, Jaeil; Mukherjee, Bhramar; Banerjee, Mousumi; Cooney, Kathleen A.
2011-01-01
Summary The stereotype regression model for categorical outcomes, proposed by Anderson (1984) is nested between the baseline category logits and adjacent category logits model with proportional odds structure. The stereotype model is more parsimonious than the ordinary baseline-category (or multinomial logistic) model due to a product representation of the log odds-ratios in terms of a common parameter corresponding to each predictor and category specific scores. The model could be used for both ordered and unordered outcomes. For ordered outcomes, the stereotype model allows more flexibility than the popular proportional odds model in capturing highly subjective ordinal scaling which does not result from categorization of a single latent variable, but are inherently multidimensional in nature. As pointed out by Greenland (1994), an additional advantage of the stereotype model is that it provides unbiased and valid inference under outcome-stratified sampling as in case-control studies. In addition, for matched case-control studies, the stereotype model is amenable to classical conditional likelihood principle, whereas there is no reduction due to sufficiency under the proportional odds model. In spite of these attractive features, the model has been applied less, as there are issues with maximum likelihood estimation and likelihood based testing approaches due to non-linearity and lack of identifiability of the parameters. We present comprehensive Bayesian inference and model comparison procedure for this class of models as an alternative to the classical frequentist approach. We illustrate our methodology by analyzing data from The Flint Men’s Health Study, a case-control study of prostate cancer in African-American men aged 40 to 79 years. We use clinical staging of prostate cancer in terms of Tumors, Nodes and Metastatsis (TNM) as the categorical response of interest. PMID:19731262
Liu, Peigui; Elshall, Ahmed S.; Ye, Ming; ...
2016-02-05
Evaluating marginal likelihood is the most critical and computationally expensive task, when conducting Bayesian model averaging to quantify parametric and model uncertainties. The evaluation is commonly done by using Laplace approximations to evaluate semianalytical expressions of the marginal likelihood or by using Monte Carlo (MC) methods to evaluate arithmetic or harmonic mean of a joint likelihood function. This study introduces a new MC method, i.e., thermodynamic integration, which has not been attempted in environmental modeling. Instead of using samples only from prior parameter space (as in arithmetic mean evaluation) or posterior parameter space (as in harmonic mean evaluation), the thermodynamicmore » integration method uses samples generated gradually from the prior to posterior parameter space. This is done through a path sampling that conducts Markov chain Monte Carlo simulation with different power coefficient values applied to the joint likelihood function. The thermodynamic integration method is evaluated using three analytical functions by comparing the method with two variants of the Laplace approximation method and three MC methods, including the nested sampling method that is recently introduced into environmental modeling. The thermodynamic integration method outperforms the other methods in terms of their accuracy, convergence, and consistency. The thermodynamic integration method is also applied to a synthetic case of groundwater modeling with four alternative models. The application shows that model probabilities obtained using the thermodynamic integration method improves predictive performance of Bayesian model averaging. As a result, the thermodynamic integration method is mathematically rigorous, and its MC implementation is computationally general for a wide range of environmental problems.« less
NASA Astrophysics Data System (ADS)
Liu, L.; Du, L.; Liao, Y.
2017-12-01
Based on the ensemble hindcast dataset of CSM1.1m by NCC, CMA, Bayesian merging models and a two-step statistical model are developed and employed to predict monthly grid/station precipitation in the Huaihe River China during summer at the lead-time of 1 to 3 months. The hindcast datasets span a period of 1991 to 2014. The skill of the two models is evaluated using area under the ROC curve (AUC) in a leave-one-out cross-validation framework, and is compared to the skill of CSM1.1m. CSM1.1m has highest skill for summer precipitation from April while lowest from May, and has highest skill for precipitation in June but lowest for precipitation in July. Compared with raw outputs of climate models, some schemes of the two approaches have higher skill for the prediction from March and May, but almost schemes have lower skill for prediction from April. Compared to two-step approach, one sampling scheme of Bayesian merging approach has higher skill for the prediction from March, but has lower skill from May. The results suggest that there is potential to apply the two statistical models for monthly precipitation forecast in summer from March and from May over Huaihe River basin, but is potential to apply CSM1.1m forecast from April. Finally, the summer runoff during 1991 to 2014 is simulated based on one hydrological model using the climate hindcast of CSM1.1m and the two statistical models.
Fragment virtual screening based on Bayesian categorization for discovering novel VEGFR-2 scaffolds.
Zhang, Yanmin; Jiao, Yu; Xiong, Xiao; Liu, Haichun; Ran, Ting; Xu, Jinxing; Lu, Shuai; Xu, Anyang; Pan, Jing; Qiao, Xin; Shi, Zhihao; Lu, Tao; Chen, Yadong
2015-11-01
The discovery of novel scaffolds against a specific target has long been one of the most significant but challengeable goals in discovering lead compounds. A scaffold that binds in important regions of the active pocket is more favorable as a starting point because scaffolds generally possess greater optimization possibilities. However, due to the lack of sufficient chemical space diversity of the databases and the ineffectiveness of the screening methods, it still remains a great challenge to discover novel active scaffolds. Since the strengths and weaknesses of both fragment-based drug design and traditional virtual screening (VS), we proposed a fragment VS concept based on Bayesian categorization for the discovery of novel scaffolds. This work investigated the proposal through an application on VEGFR-2 target. Firstly, scaffold and structural diversity of chemical space for 10 compound databases were explicitly evaluated. Simultaneously, a robust Bayesian classification model was constructed for screening not only compound databases but also their corresponding fragment databases. Although analysis of the scaffold diversity demonstrated a very unevenly distribution of scaffolds over molecules, results showed that our Bayesian model behaved better in screening fragments than molecules. Through a literature retrospective research, several generated fragments with relatively high Bayesian scores indeed exhibit VEGFR-2 biological activity, which strongly proved the effectiveness of fragment VS based on Bayesian categorization models. This investigation of Bayesian-based fragment VS can further emphasize the necessity for enrichment of compound databases employed in lead discovery by amplifying the diversity of databases with novel structures.