Thomas, D.L.; Johnson, D.; Griffith, B.
2006-01-01
Modeling the probability of use of land units characterized by discrete and continuous measures, we present a Bayesian random-effects model to assess resource selection. This model provides simultaneous estimation of both individual- and population-level selection. Deviance information criterion (DIC), a Bayesian alternative to AIC that is sample-size specific, is used for model selection. Aerial radiolocation data from 76 adult female caribou (Rangifer tarandus) and calf pairs during 1 year on an Arctic coastal plain calving ground were used to illustrate models and assess population-level selection of landscape attributes, as well as individual heterogeneity of selection. Landscape attributes included elevation, NDVI (a measure of forage greenness), and land cover-type classification. Results from the first of a 2-stage model-selection procedure indicated that there is substantial heterogeneity among cow-calf pairs with respect to selection of the landscape attributes. In the second stage, selection of models with heterogeneity included indicated that at the population level, NDVI and land cover class were significant attributes for selection of different landscapes by pairs on the calving ground. Population-level selection coefficients indicate that the pairs generally select landscapes with higher levels of NDVI, but the relationship is quadratic. The highest rate of selection occurs at values of NDVI less than the maximum observed. Results for land cover-class selection coefficients indicate that wet sedge, moist sedge, herbaceous tussock tundra, and shrub tussock tundra are selected at approximately the same rate, while alpine and sparsely vegetated landscapes are selected at a lower rate. Furthermore, the variability in selection by individual caribou for moist sedge and sparsely vegetated landscapes is large relative to the variability in selection of other land cover types. The example analysis illustrates that, while sometimes computationally intense, a
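The DIC mentioned in this abstract is commonly computed from posterior draws as DIC = Dbar + pD, where Dbar is the posterior mean deviance and pD = Dbar - D(posterior mean) is the effective number of parameters. The sketch below illustrates that arithmetic on a toy normal-mean model with known variance; it is not the caribou resource-selection model, and the data, the flat prior, and the function names `deviance` and `dic` are assumptions made for illustration.

```python
import math
import random
import statistics


def deviance(mu, data, sigma=1.0):
    """Return -2 * log-likelihood of a normal model with known sigma."""
    n = len(data)
    ss = sum((y - mu) ** 2 for y in data)
    return n * math.log(2 * math.pi * sigma ** 2) + ss / sigma ** 2


def dic(samples, data, sigma=1.0):
    """Return (DIC, pD) computed from posterior draws of mu."""
    d_bar = statistics.fmean([deviance(m, data, sigma) for m in samples])
    d_hat = deviance(statistics.fmean(samples), data, sigma)
    p_d = d_bar - d_hat  # effective number of parameters
    return d_bar + p_d, p_d


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]  # simulated toy data

# Assumption: flat prior on mu, so the posterior is Normal(mean(data), sigma^2 / n)
# and can be sampled directly instead of running MCMC.
post_mean = statistics.fmean(data)
post_sd = 1.0 / math.sqrt(len(data))
samples = [rng.gauss(post_mean, post_sd) for _ in range(5000)]

dic_value, p_d = dic(samples, data)
```

Because this toy model has a single free parameter, pD should come out close to 1; candidate models would be compared by their DIC values, with smaller values preferred.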
Gracia, Enrique; López-Quílez, Antonio; Marco, Miriam; Lladosa, Silvia; Lila, Marisol
2014-01-01
This paper uses spatial data of cases of intimate partner violence against women (IPVAW) to examine neighborhood-level influences on small-area variations in IPVAW risk in a police district of the city of Valencia (Spain). To analyze area variations in IPVAW risk and its association with neighborhood-level explanatory variables we use a Bayesian spatial random-effects modeling approach, as well as disease mapping methods to represent risk probabilities in each area. Analyses show that IPVAW cases are more likely in areas of high immigrant concentration, high public disorder and crime, and high physical disorder. Results also show a spatial component indicating remaining variability attributable to spatially structured random effects. Bayesian spatial modeling offers a new perspective to identify IPVAW high and low risk areas, and provides a new avenue for the design of better-informed prevention and intervention strategies. PMID:24413701
Chan, Jennifer S K
2016-05-01
Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, this type of dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes as well as the drop-out indicator on each occasion are logit linear in some covariates and outcomes. This model, which adopts a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user-friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Results show that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally, the effect of drop-out on parameter estimates is evaluated through simulation studies. PMID:26467236
Random effects and shrinkage estimation in capture-recapture models
Royle, J. Andrew; Link, W.A.
2002-01-01
We discuss the analysis of random effects in capture-recapture models, and outline Bayesian and frequentist approaches to their analysis. Under a normal model, random effects estimators derived from Bayesian or frequentist considerations have a common form as shrinkage estimators. We discuss some of the difficulties of analysing random effects using traditional methods, and argue that a Bayesian formulation provides a rigorous framework for dealing with these difficulties. In capture-recapture models, random effects may provide a parsimonious compromise between constant and completely time-dependent models for the parameters (e.g. survival probability). We consider application of random effects to band-recovery models, although the principles apply to more general situations, such as Cormack-Jolly-Seber models. We illustrate these ideas using a commonly analysed band recovery data set.
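The shrinkage form referred to above can be illustrated with the standard normal-normal result: if a raw estimate y_i has sampling variance s_i^2 and the true values are distributed as Normal(mu, tau^2), the random-effects estimate is mu + w_i * (y_i - mu) with w_i = tau^2 / (tau^2 + s_i^2). The sketch below is a minimal illustration, not the authors' band-recovery analysis; mu and tau^2 are treated as known (in practice they are estimated), and the numbers and the function name `shrink` are hypothetical.

```python
def shrink(estimates, sampling_vars, mu, tau2):
    """Pull each raw estimate toward mu in proportion to its sampling noise."""
    shrunk = []
    for y, s2 in zip(estimates, sampling_vars):
        weight = tau2 / (tau2 + s2)  # weight kept on the raw estimate
        shrunk.append(mu + weight * (y - mu))
    return shrunk


# Hypothetical year-specific estimates (e.g. on a logit scale) and their variances.
raw = [0.2, 0.9, -0.4, 0.5]
svar = [0.04, 0.25, 0.25, 0.01]

# Assumption: overall mean and between-year variance are taken as known here.
shrunk = shrink(raw, svar, mu=0.3, tau2=0.05)
```

Imprecise estimates (large sampling variance) move most of the way toward mu, while precise ones stay near their raw values, which is the compromise between a constant and a fully time-dependent parameterization that the abstract describes.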
Application of Poisson random effect models for highway network screening.
Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer
2014-02-01
In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. Potential for Safety Improvement (PSI) was adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests were conducted to evaluate the performance of different approaches. These tests include the previously developed site consistence test, method consistence test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only temporal random effects, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification. PMID:24269863
2003-01-01
In the case of the mixed linear model the random effects are usually assumed to be normally distributed in both the Bayesian and classical frameworks. In this paper, the Dirichlet process prior was used to provide nonparametric Bayesian estimates for correlated random effects. This goal was achieved by providing a Gibbs sampler algorithm that allows these correlated random effects to have a nonparametric prior distribution. A sampling-based method is illustrated. This method, which is employed by transforming the genetic covariance matrix to an identity matrix so that the random effects are uncorrelated, is an extension of the theory and the results of previous researchers. Also, by using Gibbs sampling and data augmentation, a simulation procedure was derived for estimating the precision parameter M associated with the Dirichlet process prior. All needed conditional posterior distributions are given. To illustrate the application, data from the Elsenburg Dormer sheep stud were analysed. A total of 3325 weaning weight records from the progeny of 101 sires were used. PMID:12633530
ERIC Educational Resources Information Center
Huang, Hung-Yu; Wang, Wen-Chung
2014-01-01
The DINA (deterministic input, noisy, and gate) model has been widely used in cognitive diagnosis tests and in the process of test development. The outcomes known as slip and guess are included in the DINA model function representing the responses to the items. This study aimed to extend the DINA model by using the random-effect approach to allow…
Random-effects models for longitudinal data
Laird, N.M.; Ware, J.H.
1982-12-01
Models for the analysis of longitudinal data must recognize the relationship between serial observations on the same unit. Multivariate models with general covariance structure are often difficult to apply to highly unbalanced data, whereas two-stage random-effects models can be used easily. In two-stage models, the probability distributions for the response vectors of different individuals belong to a single family, but some random-effects parameters vary across individuals, with a distribution specified at the second stage. A general family of models is discussed, which includes both growth models and repeated-measures models as special cases. A unified approach to fitting these models, based on a combination of empirical Bayes and maximum likelihood estimation of model parameters and using the EM algorithm, is discussed. Two examples are taken from a current epidemiological study of the health effects of air pollution.
De la Cruz, Rolando; Meza, Cristian; Arribas-Gil, Ana; Carroll, Raymond J.
2016-01-01
Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals and may help in understanding, for instance, the mechanisms of recovery from a certain disease or the efficacy of a given therapy. When a nonlinear mixed-effects model is used to fit the longitudinal trajectories, the existing estimation strategies based on likelihood approximations have been shown to exhibit some computational efficiency problems (De la Cruz et al., 2011). In this article we consider a Bayesian estimation procedure for the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. The proposed prior structure allows for the implementation of an MCMC sampler. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification. PMID:27274601
ERIC Educational Resources Information Center
Beretvas, S. Natasha; Murphy, Daniel L.
2013-01-01
The authors assessed correct model identification rates of Akaike's information criterion (AIC), corrected criterion (AICC), consistent AIC (CAIC), Hannan and Quinn's information criterion (HQIC), and Bayesian information criterion (BIC) for selecting among cross-classified random effects models. Performance of default values for the 5…
A random effects epidemic-type aftershock sequence model.
Lin, Feng-Chang
2011-04-01
We consider an extension of the temporal epidemic-type aftershock sequence (ETAS) model with random effects as a special case of a well-known doubly stochastic self-exciting point process. The new model arises from a deterministic function that is randomly scaled by a nonnegative random variable, which is unobservable but assumed to follow either positive stable or one-parameter gamma distribution with unit mean. Both random effects models are of interest although the one-parameter gamma random effects model is more popular when modeling associated survival times. Our estimation is based on the maximum likelihood approach with marginalized intensity. The methods are shown to perform well in simulation experiments. When applied to an earthquake sequence on the east coast of Taiwan, the extended model with positive stable random effects provides a better model fit, compared to the original ETAS model and the extended model with one-parameter gamma random effects. PMID:24039322
A Gompertzian model with random effects to cervical cancer growth
NASA Astrophysics Data System (ADS)
Mazlan, Mazma Syahidatul Ayuni; Rosli, Norhayati
2015-05-01
In this paper, a Gompertzian model with random effects is introduced to describe cervical cancer growth. The parameter values of the mathematical model are estimated via maximum likelihood estimation. We apply the 4-stage stochastic Runge-Kutta (SRK4) method to solve the stochastic model numerically. The efficiency of the mathematical model is measured by comparing the simulated results with the clinical data on cervical cancer growth. Low values of the root mean-square error (RMSE) of the Gompertzian model with random effects indicate good fits.
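The fit measure used in this abstract, RMSE between simulated and observed tumour sizes, can be sketched as follows. This is a simplified illustration only: it uses the closed-form solution of the deterministic Gompertz equation dV/dt = r V ln(K / V) rather than the stochastic model and SRK4 scheme of the paper, and the observations, parameter values, and function names are hypothetical.

```python
import math


def gompertz(t, v0, k, r):
    """Closed-form deterministic Gompertz curve starting at v0 with plateau k."""
    return k * math.exp(math.log(v0 / k) * math.exp(-r * t))


def rmse(observed, predicted):
    """Root mean-square error between two equally long sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical measurement times and tumour volumes (arbitrary units).
times = [0, 10, 20, 30, 40]
observed = [1.0, 2.9, 5.6, 8.1, 9.6]

# Assumed parameter values; in the paper they are estimated by maximum likelihood.
predicted = [gompertz(t, v0=1.0, k=12.0, r=0.05) for t in times]
fit_error = rmse(observed, predicted)
```

A smaller `fit_error` means the simulated curve tracks the data more closely, which is the sense in which the abstract reads low RMSE as a good fit.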
Performance of Random Effects Model Estimators under Complex Sampling Designs
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…
The Random-Effect Generalized Rating Scale Model
ERIC Educational Resources Information Center
Wang, Wen-Chung; Wu, Shiu-Lien
2011-01-01
Rating scale items have been widely used in educational and psychological tests. These items require people to make subjective judgments, and these subjective judgments usually involve randomness. To account for this randomness, Wang, Wilson, and Shih proposed the random-effect rating scale model in which the threshold parameters are treated as…
Krivitsky, Pavel N.; Handcock, Mark S.; Raftery, Adrian E.; Hoff, Peter D.
2009-01-01
Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does. PMID:20191087
Identification of dynamical biological systems based on random effects models.
Batista, Levy; Bastogne, Thierry; Djermoune, El-Hadi
2015-01-01
System identification is a data-driven modeling approach increasingly used in biology and biomedicine. In this application context, each assay is always repeated to estimate the response variability. Extending the modeling conclusions to the whole population requires accounting for the inter-individual variability within the modeling procedure. One solution consists in using random effects models, but up to now no similar approach exists in the field of dynamical system identification. In this article, we propose a new solution based on an ARX (Auto Regressive model with eXternal inputs) structure using the EM (Expectation-Maximisation) algorithm for the estimation of the model parameters. Simulations show the relevance of this solution compared with a classical procedure of system identification repeated for each subject. PMID:26736981
Bayesian Model Selection for Group Studies
Stephan, Klaas Enno; Penny, Will D.; Daunizeau, Jean; Moran, Rosalyn J.; Friston, Karl J.
2009-01-01
Bayesian model selection (BMS) is a powerful method for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. BMS has recently found widespread application in neuroimaging, particularly in the context of dynamic causal modelling (DCM). However, so far, combining BMS results from several subjects has relied on simple (fixed effects) metrics, e.g. the group Bayes factor (GBF), that do not account for group heterogeneity or outliers. In this paper, we compare the GBF with two random effects methods for BMS at the between-subject or group level. These methods provide inference on model-space using a classical and Bayesian perspective respectively. First, a classical (frequentist) approach uses the log model evidence as a subject-specific summary statistic. This enables one to use analysis of variance to test for differences in log-evidences over models, relative to inter-subject differences. We then consider the same problem in Bayesian terms and describe a novel hierarchical model, which is optimised to furnish a probability density on the models themselves. This new variational Bayes method rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution which describes the probabilities for all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject as well as the exceedance probability of one model being more likely than any other model. Using empirical and synthetic data, we show that optimising a conditional density of the model probabilities, given the log-evidences for each model over subjects, is more informative and appropriate than both the GBF and frequentist tests of the log-evidences. In particular, we found that the hierarchical Bayesian approach is considerably more robust than either of the other
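The exceedance probability described in this abstract, the probability that one model is more likely than any other given a Dirichlet distribution over model probabilities, can be approximated by Monte Carlo sampling. The sketch below assumes the Dirichlet parameters are already available; the variational Bayes step that estimates them from subject-wise log-evidences is not shown, and the `alpha` values and function names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Estimate, for each model, the probability that it has the largest share."""
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        r = sample_dirichlet(alpha, rng)
        wins[r.index(max(r))] += 1
    return [w / n_samples for w in wins]


# Hypothetical Dirichlet parameters for three competing models.
alpha = [12.0, 4.0, 2.0]

# Expected model probabilities follow directly from the Dirichlet mean.
expected_r = [a / sum(alpha) for a in alpha]
xp = exceedance_probabilities(alpha)
```

Here `expected_r` answers how likely it is that a given model generated the data of a randomly chosen subject, while `xp` answers how confident one can be that a model is the most frequent one in the population.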
Modeling Randomness in Judging Rating Scales with a Random-Effects Rating Scale Model
ERIC Educational Resources Information Center
Wang, Wen-Chung; Wilson, Mark; Shih, Ching-Lin
2006-01-01
This study presents the random-effects rating scale model (RE-RSM) which takes into account randomness in the thresholds over persons by treating them as random-effects and adding a random variable for each threshold in the rating scale model (RSM) (Andrich, 1978). The RE-RSM turns out to be a special case of the multidimensional random…
A Bayesian approach to parameter estimation in HIV dynamical models.
Putter, H; Heisterkamp, S H; Lange, J M A; de Wolf, F
2002-08-15
In the context of a mathematical model describing HIV infection, we discuss a Bayesian modelling approach to a non-linear random effects estimation problem. The model and the data exhibit a number of features that make the use of an ordinary non-linear mixed effects model intractable: (i) the data are from two compartments fitted simultaneously against the implicit numerical solution of a system of ordinary differential equations; (ii) data from one compartment are subject to censoring; (iii) random effects for one variable are assumed to be from a beta distribution. We show how the Bayesian framework can be exploited by incorporating prior knowledge on some of the parameters, and by combining the posterior distributions of the parameters to obtain estimates of quantities of interest that follow from the postulated model. PMID:12210633
Random-effects models for serial observations with binary response
Stiratelli, R.; Laird, N.; Ware, J.H.
1984-12-01
This paper presents a general mixed model for the analysis of serial dichotomous responses provided by a panel of study participants. Each subject's serial responses are assumed to arise from a logistic model, but with regression coefficients that vary between subjects. The logistic regression parameters are assumed to be normally distributed in the population. Inference is based upon maximum likelihood estimation of fixed effects and variance components, and empirical Bayes estimation of random effects. Exact solutions are analytically and computationally infeasible, but an approximation based on the mode of the posterior distribution of the random parameters is proposed, and is implemented by means of the EM algorithm. This approximate method is compared with a simpler two-step method proposed by Korn and Whittemore, using data from a panel study of asthmatics originally described in that paper. One advantage of the estimation strategy described here is the ability to use all of the data, including that from subjects with insufficient data to permit fitting of a separate logistic regression model, as required by the Korn and Whittemore method. However, the new method is computationally intensive.
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. PMID:25404574
Bayesian model reduction and empirical Bayes for group (DCM) studies.
Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter
2016-03-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
SnIPRE: selection inference using a Poisson random effects model.
Eilertson, Kirsten E; Booth, James G; Bustamante, Carlos D
2012-01-01
We present an approach for identifying genes under natural selection using polymorphism and divergence data from synonymous and non-synonymous sites within genes. A generalized linear mixed model is used to model the genome-wide variability among categories of mutations and estimate its functional consequence. We demonstrate how the model's estimated fixed and random effects can be used to identify genes under selection. The parameter estimates from our generalized linear model can be transformed to yield population genetic parameter estimates for quantities including the average selection coefficient for new mutations at a locus, the synonymous and non-synonymous mutation rates, and species divergence times. Furthermore, our approach incorporates stochastic variation due to the evolutionary process and can be fit using standard statistical software. The model is fit in both the empirical Bayes and Bayesian settings using the lme4 package in R, and Markov chain Monte Carlo methods in WinBUGS. Using simulated data we compare our method to existing approaches for detecting genes under selection: the McDonald-Kreitman test, and two versions of the Poisson random field based method MKprf. Overall, we find our method universally outperforms existing methods for detecting genes subject to selection using polymorphism and divergence data. PMID:23236270
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
Bayesian kinematic earthquake source models
NASA Astrophysics Data System (ADS)
Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.
2009-12-01
Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield one solution which satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high
Random effects coefficient of determination for mixed and meta-analysis models.
Demidenko, Eugene; Sargent, James; Onega, Tracy
2012-01-01
The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, [Formula: see text], that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If [Formula: see text] is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. A value of [Formula: see text] away from 0 indicates evidence of variance reduction in support of the mixed model. If the random effects coefficient of determination is close to 1, the variance of random effects is very large and random effects turn into free fixed effects; the model can then be estimated using the dummy variable approach. We derive explicit formulas for [Formula: see text] in three special cases: the random intercept model, the growth curve model, and meta-analysis model. Theoretical results are illustrated with three mixed model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for combination of 13 studies on tuberculosis vaccine. PMID:23750070
Frequentist tests for Bayesian models
NASA Astrophysics Data System (ADS)
Lucy, L. B.
2016-04-01
Analogues of the frequentist chi-square and F tests are proposed for testing goodness-of-fit and consistency for Bayesian models. Simple examples exhibit these tests' detection of inconsistency between consecutive experiments with identical parameters, when the first experiment provides the prior for the second. In a related analysis, a quantitative measure is derived for judging the degree of tension between two different experiments with partially overlapping parameter vectors.
A likelihood reformulation method in non-normal random effects models.
Liu, Lei; Yu, Zhangsheng
2008-07-20
In this paper, we propose a practical computational method to obtain the maximum likelihood estimates (MLE) for mixed models with non-normal random effects. By simply multiplying and dividing a standard normal density, we reformulate the likelihood conditional on the non-normal random effects to that conditional on the normal random effects. Gaussian quadrature technique, conveniently implemented in SAS Proc NLMIXED, can then be used to carry out the estimation process. Our method substantially reduces computational time, while yielding similar estimates to the probability integral transformation method (J. Comput. Graphical Stat. 2006; 15:39-57). Furthermore, our method can be applied to more general situations, e.g. finite mixture random effects or correlated random effects from Clayton copula. Simulations and applications are presented to illustrate our method. PMID:18038445
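The "multiply and divide by a standard normal density" step can be written out for a single cluster with a scalar random effect. The notation below is an assumption introduced for illustration, not taken from the paper: y is the cluster's data, f the conditional density of the data, g the non-normal random-effect density, and \phi the standard normal density.

```latex
L(\theta)
  = \int f(y \mid b;\theta)\, g(b;\theta)\, db
  = \int \left[ f(y \mid b;\theta)\, \frac{g(b;\theta)}{\phi(b)} \right] \phi(b)\, db .
```

Read this way, the bracketed term plays the role of a conditional likelihood given a standard normal random effect, which is the form that Gaussian quadrature routines for normal random effects expect.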
Flexible Bayesian Human Fecundity Models
Kim, Sungduk; Sundaram, Rajeshwari; Buck Louis, Germaine M.; Pyper, Cecilia
2016-01-01
Human fecundity is an issue of considerable interest for both epidemiological and clinical audiences, and is dependent upon a couple’s biologic capacity for reproduction coupled with behaviors that place a couple at risk for pregnancy. Bayesian hierarchical models have been proposed to better model the conception probabilities by accounting for the acts of intercourse around the day of ovulation, i.e., during the fertile window. These models can be viewed in the framework of a generalized nonlinear model with an exponential link. However, a fixed choice of link function may not always provide the best fit, leading to potentially biased estimates for probability of conception. Motivated by this, we propose a general class of models for fecundity by relaxing the choice of the link function under the generalized nonlinear model framework. We use a sample from the Oxford Conception Study (OCS) to illustrate the utility and fit of this general class of models for estimating human conception. Our findings reinforce the need for attention to be paid to the choice of link function in modeling conception, as it may bias the estimation of conception probabilities. Various properties of the proposed models are examined and a Markov chain Monte Carlo sampling algorithm was developed for implementing the Bayesian computations. The deviance information criterion measure and logarithm of pseudo marginal likelihood are used for guiding the choice of links. The supplemental material section contains technical details of the proof of the theorem stated in the paper, and contains further simulation results and analysis.
A Bayesian nonlinear mixed-effects disease progression model
Kim, Seongho; Jang, Hyejeong; Wu, Dongfeng; Abrams, Judith
2016-01-01
A nonlinear mixed-effects approach is developed for disease progression models that incorporate variation in age in a Bayesian framework. We further generalize the probability model for sensitivity to depend on age at diagnosis, time spent in the preclinical state and sojourn time. The developed models are then applied to the Johns Hopkins Lung Project data and the Health Insurance Plan for Greater New York data using Bayesian Markov chain Monte Carlo and are compared with the estimation method that does not consider random-effects from age. Using the developed models, we obtain not only age-specific individual-level distributions, but also population-level distributions of sensitivity, sojourn time and transition probability. PMID:26798562
Bayesian Networks for Social Modeling
Whitney, Paul D.; White, Amanda M.; Walsh, Stephen J.; Dalton, Angela C.; Brothers, Alan J.
2011-03-28
This paper describes a body of work developed over the past five years. The work addresses the use of Bayesian network (BN) models for representing and predicting social/organizational behaviors. The topics covered include model construction, validation, and use. These topics span the bulk of the lifetime of such a model, beginning with construction, moving to validation and other aspects of model ‘critiquing’, and finally demonstrating how the modeling approach might be used to inform policy analysis. To conclude, we discuss limitations of using BN for this activity and suggest remedies to address those limitations. The primary benefits of using a well-developed computational, mathematical, and statistical modeling structure, such as BN, are 1) the significant computational, theoretical, and capability bases on which to build, and 2) the ability to empirically critique the model and potentially evaluate competing models for a social/behavioral phenomenon.
Standard errors for EM estimates in generalized linear models with random effects.
Friedl, H; Kauermann, G
2000-09-01
A procedure is derived for computing standard errors of EM estimates in generalized linear models with random effects. Quadrature formulas are used to approximate the integrals in the EM algorithm, where two different approaches are pursued, i.e., Gauss-Hermite quadrature in the case of Gaussian random effects and nonparametric maximum likelihood estimation for an unspecified random effect distribution. An approximation of the expected Fisher information matrix is derived from an expansion of the EM estimating equations. This allows for inferential arguments based on EM estimates, as demonstrated by an example and simulations. PMID:10985213
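As an illustration of the Gauss-Hermite step for Gaussian random effects, the sketch below approximates the marginal likelihood of one cluster in a random-intercept logistic model. The model, the function names, and the default number of nodes are assumptions chosen for the example; the EM iterations and the standard-error computation described in the abstract are not reproduced.

```python
import numpy as np

def gh_expectation(g, sigma, n_nodes=20):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    hermgauss returns nodes x_k and weights w_k for the weight exp(-x^2),
    so the substitution b = sqrt(2) * sigma * x gives
    E[g(b)] ~= sum_k w_k * g(sqrt(2) * sigma * x_k) / sqrt(pi).
    """
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    return np.sum(w * g(np.sqrt(2.0) * sigma * x)) / np.sqrt(np.pi)

def cluster_marginal_likelihood(y, eta, sigma, n_nodes=20):
    """Marginal likelihood of binary responses y in one cluster.

    Assumed model: logit P(y_j = 1 | b) = eta_j + b, with b ~ N(0, sigma^2).
    """
    y = np.asarray(y, dtype=float)
    eta = np.asarray(eta, dtype=float)

    def conditional_likelihood(b):
        # b holds all quadrature nodes; rows index observations.
        p = 1.0 / (1.0 + np.exp(-(eta[:, None] + b[None, :])))
        return np.prod(np.where(y[:, None] == 1.0, p, 1.0 - p), axis=0)

    return gh_expectation(conditional_likelihood, sigma, n_nodes)
```

For example, `gh_expectation(lambda b: b**2, 1.5)` recovers the variance 2.25, since the rule is exact for low-degree polynomials.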
Modeling Diagnostic Assessments with Bayesian Networks
ERIC Educational Resources Information Center
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
MOMENT-BASED METHOD FOR RANDOM EFFECTS SELECTION IN LINEAR MIXED MODELS
Ahn, Mihye; Lu, Wenbin
2012-01-01
The selection of random effects in linear mixed models is an important yet challenging problem in practice. We propose a robust and unified framework for automatically selecting random effects and estimating covariance components in linear mixed models. A moment-based loss function is first constructed for estimating the covariance matrix of random effects. Two types of shrinkage penalties, a hard thresholding operator and a new sandwich-type soft-thresholding penalty, are then imposed for sparse estimation and random effects selection. Compared with existing approaches, the new procedure does not require any distributional assumption on the random effects and error terms. We establish the asymptotic properties of the resulting estimator in terms of its consistency in both random effects selection and variance component estimation. Optimization strategies are suggested to tackle the computational challenges involved in estimating the sparse variance-covariance matrix. Furthermore, we extend the procedure to incorporate the selection of fixed effects as well. Numerical results show promising performance of the new approach in selecting both random and fixed effects and, consequently, improving the efficiency of estimating model parameters. Finally, we apply the approach to a data set from the Amsterdam Growth and Health study. PMID:23105913
Cross-Classified Random Effects Models in Institutional Research
ERIC Educational Resources Information Center
Meyers, Laura E.
2012-01-01
Multilevel modeling offers researchers a rich array of tools that can be used for a variety of purposes, such as analyzing specific institutional issues, looking for macro-level trends, and helping to shape and inform educational policy. One of the more complex multilevel modeling tools available to institutional researchers is cross-classified…
Mixed model analysis of censored longitudinal data with flexible random-effects density
Vock, David M.; Davidian, Marie; Tsiatis, Anastasios A.; Muir, Andrew J.
2012-01-01
Mixed models are commonly used to represent longitudinal or repeated measures data. An additional complication arises when the response is censored, for example, due to limits of quantification of the assay used. While Gaussian random effects are routinely assumed, little work has characterized the consequences of misspecifying the random-effects distribution nor has a more flexible distribution been studied for censored longitudinal data. We show that, in general, maximum likelihood estimators will not be consistent when the random-effects density is misspecified, and the effect of misspecification is likely to be greatest when the true random-effects density deviates substantially from normality and the number of noncensored observations on each subject is small. We develop a mixed model framework for censored longitudinal data in which the random effects are represented by the flexible seminonparametric density and show how to obtain estimates in SAS procedure NLMIXED. Simulations show that this approach can lead to reduction in bias and increase in efficiency relative to assuming Gaussian random effects. The methods are demonstrated on data from a study of hepatitis C virus. PMID:21914727
Multilevel models for survival analysis with random effects.
Yau, K K
2001-03-01
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed. PMID:11252624
ERIC Educational Resources Information Center
Hedeker, Donald; And Others
1994-01-01
Proposes random-effects regression model for analysis of clustered data. Suggests model assumes some dependency of within-cluster data. Model adjusts effects for resulting dependency from data clustering. Describes maximum marginal likelihood solution. Discusses available statistical software. Demonstrates model via investigation involving…
A Mixture Proportional Hazards Model with Random Effects for Response Times in Tests
ERIC Educational Resources Information Center
Ranger, Jochen; Kuhn, Jörg-Tobias
2016-01-01
In this article, a new model for test response times is proposed that combines latent class analysis and the proportional hazards model with random effects in a similar vein as the mixture factor model. The model assumes the existence of different latent classes. In each latent class, the response times are distributed according to a…
ERIC Educational Resources Information Center
Clarke, Paul; Crawford, Claire; Steele, Fiona; Vignoles, Anna
2015-01-01
The use of fixed (FE) and random effects (RE) in two-level hierarchical linear regression is discussed in the context of education research. We compare the robustness of FE models with the modelling flexibility and potential efficiency of those from RE models. We argue that the two should be seen as complementary approaches. We then compare both…
Bayesian Calibration of Microsimulation Models.
Rutter, Carolyn M; Miglioretti, Diana L; Savarino, James E
2009-12-01
Microsimulation models that describe disease processes synthesize information from multiple sources and can be used to estimate the effects of screening and treatment on cancer incidence and mortality at a population level. These models are characterized by simulation of individual event histories for an idealized population of interest. Microsimulation models are complex and invariably include parameters that are not well informed by existing data. Therefore, a key component of model development is the choice of parameter values. Microsimulation model parameter values are selected to reproduce expected or known results through the process of model calibration. Calibration may be done by perturbing model parameters one at a time or by using a search algorithm. As an alternative, we propose a Bayesian method to calibrate microsimulation models that uses Markov chain Monte Carlo. We show that this approach converges to the target distribution and use a simulation study to demonstrate its finite-sample performance. Although computationally intensive, this approach has several advantages over previously proposed methods, including the use of statistical criteria to select parameter values, simultaneous calibration of multiple parameters to multiple data sources, incorporation of information via prior distributions, description of parameter identifiability, and the ability to obtain interval estimates of model parameters. We develop a microsimulation model for colorectal cancer and use our proposed method to calibrate model parameters. The microsimulation model provides a good fit to the calibration data. We find evidence that some parameters are identified primarily through prior distributions. Our results underscore the need to incorporate multiple sources of variability (i.e., due to calibration data, unknown parameters, and estimated parameters and predicted values) when calibrating and applying microsimulation models. PMID:20076767
Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items
ERIC Educational Resources Information Center
Baayen, R. H.; Davidson, D. J.; Bates, D. M.
2008-01-01
This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to…
A Flexible Two-Part Random Effects Model for Correlated Medical Costs
Liu, Lei; Cowen, Mark E.; Strawderman, Robert L.; Shih, Ya-Chen T.
2009-01-01
In this paper, we propose a flexible “two-part” random effects model (Olsen and Schafer 2001; Tooze, Grunwald, and Jones 2002) for correlated medical cost data. Typically, medical cost data are right-skewed, involve a substantial proportion of zero values, and may exhibit heteroscedasticity. In many cases, such data are also obtained in hierarchical form, e.g., on patients served by the same physician. The proposed model specification therefore consists of two generalized linear mixed models (GLMM), linked together by correlated random effects. Respectively, and conditionally on the random effects and covariates, we model the odds of cost being positive (Part I) using a GLMM with a logistic link and the mean cost (Part II) given that costs were actually incurred using a generalized gamma regression model with random effects and a scale parameter that is allowed to depend on covariates (cf. Manning, Basu, and Mullahy 2005). The class of generalized gamma distributions is very flexible and includes the lognormal, gamma, inverse gamma and Weibull distributions as special cases. We demonstrate how to carry out estimation using the Gaussian quadrature techniques conveniently implemented in SAS Proc NLMIXED. The proposed model is used to analyze pharmacy cost data on 56,245 adult patients clustered within 239 physicians in a mid-western U.S. managed care organization. PMID:20015560
Sparse Bayesian infinite factor models
Bhattacharya, A.; Dunson, D. B.
2011-01-01
We focus on sparse modelling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases. We use our prior on a parameter-expanded loading matrix to avoid the order dependence typical in factor analysis models and develop an efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loading matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival times from gene expression data. PMID:23049129
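The shrinkage behaviour of a multiplicative gamma process prior can be sketched by drawing one loading matrix from it. The sketch assumes the usual form of this prior: column precisions are cumulative products of gamma variables, combined with local gamma precisions for each element. The hyperparameter values, the finite truncation at k columns, and the function name are assumptions made for illustration; the Gibbs sampler and adaptive truncation from the abstract are not shown.

```python
import numpy as np

def sample_mgp_loadings(p, k, a1=2.0, a2=3.0, nu=3.0, seed=0):
    """Draw a p x k loading matrix from a truncated multiplicative gamma process prior.

    Assumed form:
      delta_1 ~ Gamma(a1, 1), delta_l ~ Gamma(a2, 1) for l >= 2,
      tau_h = prod_{l <= h} delta_l            (column precision),
      phi_jh ~ Gamma(nu / 2, rate = nu / 2)    (local precision),
      loading_jh ~ N(0, 1 / (phi_jh * tau_h)).
    """
    rng = np.random.default_rng(seed)
    delta = np.empty(k)
    delta[0] = rng.gamma(shape=a1, scale=1.0)
    delta[1:] = rng.gamma(shape=a2, scale=1.0, size=k - 1)
    tau = np.cumprod(delta)  # tends to grow with the column index when a2 > 1
    # NumPy's gamma uses a scale parameter, so rate nu/2 becomes scale 2/nu.
    phi = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(p, k))
    sd = 1.0 / np.sqrt(phi * tau)  # tau broadcasts across rows
    loadings = rng.standard_normal((p, k)) * sd
    return loadings, tau
```

Because the column precisions are cumulative products, later columns typically receive larger precisions and therefore loadings closer to zero, which is the "increasingly shrunk" behaviour the abstract refers to.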
Withanage, Niroshan; de Leon, Alexander R; Rudnisky, Christopher J
2015-12-20
We present a model for describing correlated binocular data from reader-based diagnostic studies, where the same group of readers evaluates the presence or absence of certain diseases on binocular organs (e.g., fellow eyes) of patients. Multiple random effects are incorporated to meaningfully delineate various associations in the data including crossed random effects to account for reader-specific variability and to incorporate cross correlations. To overcome the computational complexity involved in the evaluation and maximization of the marginal likelihood, we adopt the data cloning approach, which calculates maximum likelihood estimates under the Bayesian paradigm. The bias and efficiency of the estimates are assessed in two simulation studies. We apply our model to data from a diabetic retinopathy study. PMID:26179660
The Impact of Five Missing Data Treatments on a Cross-Classified Random Effects Model
ERIC Educational Resources Information Center
Hoelzle, Braden R.
2012-01-01
The present study compared the performance of five missing data treatment methods within a Cross-Classified Random Effects Model environment under various levels and patterns of missing data given a specified sample size. Prior research has shown the varying effect of missing data treatment options within the context of numerous statistical…
Estimation of the Nonlinear Random Coefficient Model when Some Random Effects Are Separable
ERIC Educational Resources Information Center
du Toit, Stephen H. C.; Cudeck, Robert
2009-01-01
A method is presented for marginal maximum likelihood estimation of the nonlinear random coefficient model when the response function has some linear parameters. This is done by writing the marginal distribution of the repeated measures as a conditional distribution of the response given the nonlinear random effects. The resulting distribution…
Firm-Related Training Tracks: A Random Effects Ordered Probit Model
ERIC Educational Resources Information Center
Groot, Wim; van den Brink, Henriette Maassen
2003-01-01
A random effects ordered response model of training is estimated to analyze the existence of training tracks and time varying coefficients in training frequency. Two waves of a Dutch panel survey of workers are used covering the period 1992-1996. The amount of training received by workers increased during the period 1994-1996 compared to…
Aguero-Valverde, Jonathan
2013-01-01
In recent years, complex statistical modeling approaches have been proposed to handle the unobserved heterogeneity and the excess of zeros frequently found in crash data, including random effects and zero inflated models. This research compares random effects, zero inflated, and zero inflated random effects models using a full Bayes hierarchical approach. The models are compared not just in terms of goodness-of-fit measures but also in terms of precision of posterior crash frequency estimates since the precision of these estimates is vital for ranking of sites for engineering improvement. Fixed-over-time random effects models are also compared to independent-over-time random effects models. For the crash dataset being analyzed, it was found that once the random effects are included in the zero inflated models, the probability of being in the zero state is drastically reduced, and the zero inflated models degenerate to their non zero inflated counterparts. Also, by fixing the random effects over time, the fit of the models and the precision of the crash frequency estimates are significantly increased. It was found that the rankings produced by the fixed-over-time random effects models are very consistent with one another. In addition, the results show that by fixing the random effects over time, the standard errors of the crash frequency estimates are significantly reduced for the majority of the segments at the top of the ranking. PMID:22633143
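The remark that zero inflated models degenerate to their non zero inflated counterparts when the zero-state probability shrinks can be seen directly in the probability mass function. The sketch below assumes a zero-inflated Poisson count model, which the abstract does not specify; the function and parameter names are illustrative.

```python
import math

def zip_pmf(k, lam, pi_zero):
    """P(Y = k) under a zero-inflated Poisson model (assumed for illustration).

    With probability pi_zero the site is in the "zero state" and Y = 0;
    otherwise Y ~ Poisson(lam).
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson
```

Setting `pi_zero` to 0 returns the ordinary Poisson probabilities, which is the degenerate case described above.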
2012-01-01
Background: Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found inadequate to model the complexity of the time-course data, partly because they ignore the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions: Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enable improved modelling and consequent clustering of time-course data. PMID:23151154
Estimating anatomical trajectories with Bayesian mixed-effects modeling
Ziegler, G.; Penny, W.D.; Ridgway, G.R.; Ourselin, S.; Friston, K.J.
2015-01-01
We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). PMID:26190405
Bayesian Data-Model Fit Assessment for Structural Equation Modeling
ERIC Educational Resources Information Center
Levy, Roy
2011-01-01
Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…
Crash risk analysis for Shanghai urban expressways: A Bayesian semi-parametric modeling approach.
Yu, Rongjie; Wang, Xuesong; Yang, Kui; Abdel-Aty, Mohamed
2016-10-01
Urban expressway systems have developed rapidly in China in recent years and have become a key part of city roadway networks, carrying large traffic volumes at high travel speeds. Along with the increase in traffic volume, traffic safety has become a major issue for Chinese urban expressways because of frequent crashes and the non-recurrent congestion they cause. To unveil crash occurrence mechanisms and further develop Active Traffic Management (ATM) control strategies to improve traffic safety, this study developed disaggregate crash risk analysis models with loop detector traffic data and historical crash data. Bayesian random effects logistic regression models were used because they can account for the unobserved heterogeneity among crashes. However, previous crash risk analysis studies formulated random effects distributions parametrically, assuming that they follow normal distributions. Because little is known about random effects distributions, a subjective parametric specification may be incorrect. To construct more flexible and robust random effects that capture the unobserved heterogeneity, a Bayesian semi-parametric inference technique was introduced to crash risk analysis in this study. Models with both inference techniques were developed for total crashes; the semi-parametric models provided substantially better goodness-of-fit, while the two models yielded consistent coefficient estimates. Bayesian semi-parametric random effects logistic regression models were then developed for weekday peak hour crashes, weekday non-peak hour crashes, and weekend non-peak hour crashes to investigate different crash occurrence scenarios. Significant factors that affect crash risk were revealed and crash mechanisms were inferred. PMID:26847949
An efficient technique for Bayesian modeling of family data using the BUGS software
Bae, Harold T.; Perls, Thomas T.; Sebastiani, Paola
2014-01-01
Linear mixed models have become a popular tool to analyze continuous data from family-based designs by using random effects that model the correlation of subjects from the same family. However, mixed models for family data are challenging to implement with the BUGS (Bayesian inference Using Gibbs Sampling) software because of the high-dimensional covariance matrix of the random effects. This paper describes an efficient parameterization that utilizes the singular value decomposition of the covariance matrix of random effects, includes the BUGS code for such implementation, and extends the parameterization to generalized linear mixed models. The implementation is evaluated using simulated data and an example from a large family-based study is presented with a comparison to other existing methods. PMID:25477899
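The reparameterization idea mentioned in this abstract can be illustrated outside BUGS with a minimal NumPy sketch. It is not the paper's BUGS code: the function name, the 3-member family, and the variance value 2.0 are hypothetical. The sketch only shows why independent standard normal draws, pushed through a linear map built from the singular value decomposition, reproduce a target covariance matrix.

```python
import numpy as np

def random_effects_from_svd(sigma, z):
    """Map independent N(0, 1) draws z to random effects with covariance sigma."""
    # For a symmetric positive semi-definite sigma, the SVD sigma = U diag(d) V^T
    # satisfies U diag(d) U^T = sigma.
    u_mat, d, _ = np.linalg.svd(sigma)
    # Scale column j of U by sqrt(d_j), giving A = U diag(sqrt(d)) with A A^T = sigma.
    a = u_mat * np.sqrt(d)
    return a @ z, a

# Hypothetical family of three (two unrelated parents and one child):
# kinship-style correlations scaled by an assumed variance of 2.0.
sigma = 2.0 * np.array([[1.0, 0.0, 0.5],
                        [0.0, 1.0, 0.5],
                        [0.5, 0.5, 1.0]])
rng = np.random.default_rng(0)
z = rng.standard_normal(3)          # independent standard normal draws
u, a = random_effects_from_svd(sigma, z)
print(np.allclose(a @ a.T, sigma))  # the linear map reproduces the covariance
```

The practical appeal of this kind of parameterization is that a sampler only has to handle independent univariate normals rather than one high-dimensional multivariate normal.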
An Integrated Bayesian Model for DIF Analysis
ERIC Educational Resources Information Center
Soares, Tufi M.; Goncalves, Flavio B.; Gamerman, Dani
2009-01-01
In this article, an integrated Bayesian model for differential item functioning (DIF) analysis is proposed. The model is integrated in the sense of modeling the responses along with the DIF analysis. This approach allows DIF detection and explanation in a simultaneous setup. Previous empirical studies and/or subjective beliefs about the item…
Zeng, Donglin; Lin, D. Y.
2011-01-01
Summary We propose a broad class of semiparametric transformation models with random effects for the joint analysis of recurrent events and a terminal event. The transformation models include proportional hazards/intensity and proportional odds models. We estimate the model parameters by the nonparametric maximum likelihood approach. The estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Simple and stable numerical algorithms are provided to calculate the parameter estimators and to estimate their variances. Extensive simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two HIV/AIDS studies are presented. PMID:18945267
Heterogeneous Factor Analysis Models: A Bayesian Approach.
ERIC Educational Resources Information Center
Ansari, Asim; Jedidi, Kamel; Dube, Laurette
2002-01-01
Developed Markov Chain Monte Carlo procedures to perform Bayesian inference, model checking, and model comparison in heterogeneous factor analysis. Tested the approach with synthetic data and data from a consumption emotion study involving 54 consumers. Results show that traditional psychometric methods cannot fully capture the heterogeneity in…
Survey of Bayesian Models for Modelling of Stochastic Temporal Processes
Ng, B
2006-10-12
This survey gives an overview of popular generative models used in the modeling of stochastic temporal systems. In particular, this survey is organized into two parts. The first part discusses the discrete-time representations of dynamic Bayesian networks and dynamic relational probabilistic models, while the second part discusses the continuous-time representation of continuous-time Bayesian networks.
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and a Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses: the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced more conservative estimates of the effects of traffic variables than the fixed effect meta-analysis. Then, the meta-analysis results were used as informative priors for developing crash risk models with limited data. The choice among the three meta-analyses significantly affected model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% compared with the model developed directly from the limited data. Finally, Bayesian predictive densities analysis was used to identify the outliers in the limited data. This step can further improve the prediction accuracy by 5.0%. PMID:26468977
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency, whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as macro-level socioeconomic, trip generation and network density variables that are generally used for long-term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is considered in the model. A road network is carefully selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and the negative binomial model in terms of goodness-of-fit and predictive performance, which supports considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the use of the hierarchical joint model as an alternative in road-network-level safety modeling. PMID:26945109
Karr, Justin E; Areshenkoff, Corson N; Duggan, Emily C; Garcia-Barrera, Mauricio A
2014-12-01
Throughout their careers, many soldiers experience repeated blast exposures from improvised explosive devices, which often involve head injury. Consequently, blast-related mild Traumatic Brain Injury (mTBI) has become prevalent in modern conflicts, often occurring co-morbidly with psychiatric illness (e.g., post-traumatic stress disorder [PTSD]). In turn, a growing body of research has begun to explore the cognitive and psychiatric sequelae of blast-related mTBI. The current meta-analysis aimed to evaluate the chronic effects of blast-related mTBI on cognitive performance. A systematic review identified 9 studies reporting 12 samples meeting eligibility criteria. A Bayesian random-effects meta-analysis was conducted with cognitive construct and PTSD symptoms explored as moderators. The overall posterior mean effect size and Highest Density Interval (HDI) came to d = -0.12 [-0.21, -0.04], with executive function (-0.16 [-0.31, 0.00]), verbal delayed memory (-0.19 [-0.44, 0.06]) and processing speed (-0.11 [-0.26, 0.01]) presenting as the cognitive domains most sensitive to blast-related mTBI. When dividing executive function into diverse sub-constructs (i.e., working memory, inhibition, set-shifting), set-shifting presented the largest effect size (-0.33 [-0.55, -0.05]). PTSD symptoms did not predict cognitive effect sizes, β_PTSD = -0.02 [-0.23, 0.20]. The results indicate a subtle but chronic cognitive impairment following mTBI, especially in set-shifting, a relevant aspect of executive attention. These findings are consistent with past meta-analyses on multiple mTBI and correspond with past neuroimaging research on the cognitive correlates of white matter damage common in mTBI. However, all studies had cross-sectional designs, which resulted in universally low quality ratings and limited the conclusions inferable from this meta-analysis. PMID:25253505
Hierarchical Bayesian model updating for structural identification
NASA Astrophysics Data System (ADS)
Behmanesh, Iman; Moaveni, Babak; Lombaert, Geert; Papadimitriou, Costas
2015-12-01
A new probabilistic finite element (FE) model updating technique based on Hierarchical Bayesian modeling is proposed for identification of civil structural systems under changing ambient/environmental conditions. The performance of the proposed technique is investigated for (1) uncertainty quantification of model updating parameters, and (2) probabilistic damage identification of the structural systems. Accurate estimation of the uncertainty in modeling parameters such as mass or stiffness is a challenging task. Several Bayesian model updating frameworks have been proposed in the literature that can successfully provide the "parameter estimation uncertainty" of model parameters with the assumption that there is no underlying inherent variability in the updating parameters. However, this assumption may not be valid for civil structures where structural mass and stiffness have inherent variability due to different sources of uncertainty such as changing ambient temperature, temperature gradient, wind speed, and traffic loads. Hierarchical Bayesian model updating is capable of predicting the overall uncertainty/variability of updating parameters by assuming time-variability of the underlying linear system. A general solution based on Gibbs Sampler is proposed to estimate the joint probability distributions of the updating parameters. The performance of the proposed Hierarchical approach is evaluated numerically for uncertainty quantification and damage identification of a 3-story shear building model. Effects of modeling errors and incomplete modal data are considered in the numerical study.
Normativity, interpretation, and Bayesian models
Oaksford, Mike
2014-01-01
It has been suggested that evaluative normativity should be expunged from the psychology of reasoning. A broadly Davidsonian response to these arguments is presented. It is suggested that two distinctions, between different types of rationality, are more permeable than this argument requires and that the fundamental objection is to selecting theories that make the most rational sense of the data. It is argued that this is an inevitable consequence of radical interpretation, where understanding others requires assuming they share our own norms of reasoning. This requires evaluative normativity, and it is shown that when asked to evaluate others’ arguments participants conform to rational Bayesian norms. It is suggested that logic and probability are not in competition and that the variety of norms is more limited than the arguments against evaluative normativity suppose. Moreover, the universality of belief ascription suggests that many of our norms are universal and hence evaluative. It is concluded that the union of evaluative normativity and descriptive psychology implicit in Davidson and apparent in the psychology of reasoning is a good thing. PMID:24860519
Posterior Predictive Bayesian Phylogenetic Model Selection
Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn
2014-01-01
We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
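The statement that CPO requires "only a sample from the posterior distribution" can be made concrete with a minimal sketch. It uses the standard harmonic-mean estimator of CPO from pointwise likelihoods; the function names and the array layout (posterior draws in rows, sites or observations in columns) are assumptions for illustration, not the authors' software.

```python
import numpy as np

def log_cpo(loglik):
    """Log CPO per observation from loglik[s, i] = log p(y_i | theta_s)."""
    n_draws = loglik.shape[0]
    neg = -loglik
    m = neg.max(axis=0)
    # log of the posterior mean of 1 / p(y_i | theta), computed stably
    log_mean_inverse = m + np.log(np.exp(neg - m).sum(axis=0)) - np.log(n_draws)
    # CPO_i is the reciprocal of that mean (a harmonic mean of the likelihoods)
    return -log_mean_inverse

def lpml(loglik):
    """Log pseudomarginal likelihood: the sum of log CPO over observations."""
    return log_cpo(loglik).sum()

# Two posterior draws, one observation, with likelihoods 0.5 and 0.25.
example = np.log(np.array([[0.5], [0.25]]))
print(np.exp(log_cpo(example)))  # harmonic mean of 0.5 and 0.25
```

Because each CPO value is computed per site, low values can be inspected individually in exploratory work, while their log-sum gives the overall LPML score.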
Technology diffusion in hospitals: a log odds random effects regression model.
Blank, Jos L T; Valdmanis, Vivian G
2015-01-01
This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the diffusion of technologies. We distinguish a number of determinants, such as service, physician, and environmental, financial and organizational characteristics of the 60 Dutch hospitals in our sample. On the basis of this data set on Dutch general hospitals over the period 1995-2002, we conclude that there is a relation between a number of determinants and the diffusion of innovations underlining conclusions from earlier research. Positive effects were found on the basis of the size of the hospitals, competition and a hospital's commitment to innovation. It appears that if a policy is developed to further diffuse innovations, the external effects of demand and market competition need to be examined, which would de facto lead to an efficient use of technology. For the individual hospital, instituting an innovations office appears to be the most prudent course of action. PMID:24323484
A Bayesian Model of Sensory Adaptation
Sato, Yoshiyuki; Aihara, Kazuyuki
2011-01-01
Recent studies reported two opposite types of adaptation in temporal perception. Here, we propose a Bayesian model of sensory adaptation that exhibits both types of adaptation. We regard adaptation as the adaptive updating of estimations of time-evolving variables, which determine the mean value of the likelihood function and that of the prior distribution in a Bayesian model of temporal perception. On the basis of certain assumptions, we can analytically determine the mean behavior in our model and identify the parameters that determine the type of adaptation that actually occurs. The results of our model suggest that we can control the type of adaptation by controlling the statistical properties of the stimuli presented. PMID:21541346
Prague, Mélanie; Commenges, Daniel; Guedj, Jérémie; Drylewicz, Julia; Thiébaut, Rodolphe
2013-08-01
Models based on ordinary differential equations (ODE) are widespread tools for describing dynamical systems. In biomedical sciences, data from each subject can be sparse, making it difficult to estimate individual parameters precisely by standard non-linear regression, but information can often be gained from between-subject variability. This makes the use of mixed-effects models to estimate population parameters natural. Although the maximum likelihood approach is a valuable option, identifiability issues favour Bayesian approaches, which can incorporate prior knowledge in a flexible way. However, the combination of difficulties coming from the ODE system and from the presence of random effects raises a major numerical challenge. Computations can be simplified by making a normal approximation of the posterior to find the maximum of the posterior distribution (MAP). Here we present the NIMROD program (normal approximation inference in models with random effects based on ordinary differential equations) devoted to MAP estimation in ODE models. We describe the specific implemented features, such as convergence criteria and an approximation of the leave-one-out cross-validation to assess the model quality of fit. In pharmacokinetics models, we first evaluate the properties of this algorithm and compare it with FOCE and MCMC algorithms in simulations. Then, we illustrate the use of NIMROD on Amprenavir pharmacokinetics data from the PUZZLE clinical trial in HIV-infected patients. PMID:23764196
Bayesian network modelling of upper gastrointestinal bleeding
NASA Astrophysics Data System (ADS)
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
A Bayesian model for visual space perception
NASA Technical Reports Server (NTRS)
Curry, R. E.
1972-01-01
A model for visual space perception is proposed that contains desirable features in the theories of Gibson and Brunswik. This model is a Bayesian processor of proximal stimuli which contains three important elements: an internal model of the Markov process describing the knowledge of the distal world, the a priori distribution of the state of the Markov process, and an internal model relating state to proximal stimuli. The universality of the model is discussed and it is compared with signal detection theory models. Experimental results of Kinchla are used as a special case.
Bayesian population modeling of drug dosing adherence.
Fellows, Kelly; Stoneking, Colin J; Ramanathan, Murali
2015-10-01
Adherence is a frequent contributing factor to variations in drug concentrations and efficacy. The purpose of this work was to develop an integrated population model to describe variation in adherence, dose-timing deviations, overdosing and persistence to dosing regimens. The hybrid Markov chain-von Mises method for modeling adherence in individual subjects was extended to the population setting using a Bayesian approach. Four integrated population models for overall adherence, the two-state Markov chain transition parameters, dose-timing deviations, overdosing and persistence were formulated and critically compared. The Markov chain-Monte Carlo algorithm was used for identifying distribution parameters and for simulations. The model was challenged with medication event monitoring system data for 207 hypertension patients. The four Bayesian models demonstrated good mixing and convergence characteristics. The distributions of adherence, dose-timing deviations, overdosing and persistence were markedly non-normal and diverse. The models varied in complexity and the method used to incorporate inter-dependence with the preceding dose in the two-state Markov chain. The model that incorporated a cooperativity term for inter-dependence and a hyperbolic parameterization of the transition matrix probabilities was identified as the preferred model over the alternatives. The simulated probability densities from the model satisfactorily fit the observed probability distributions of adherence, dose-timing deviations, overdosing and persistence parameters in the sample patients. The model also adequately described the median and observed quartiles for these parameters. The Bayesian model for adherence provides a parsimonious, yet integrated, description of adherence in populations. It may find potential applications in clinical trial simulations and pharmacokinetic-pharmacodynamic modeling. PMID:26319548
Bayesian model selection analysis of WMAP3
Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.
2006-06-15
We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n_S and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n_S ≠ 1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n_S ≠ 1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n_S = 1, r = 0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.
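As a short worked note on what odds of "approximately 8 to 1" mean, posterior odds are the ratio of Bayesian evidences multiplied by the prior odds. The labels M_0 (spectral index fixed to 1) and M_1 (spectral index free) and the assumption of equal prior model probabilities are introduced here for illustration only.

```latex
\frac{P(M_1 \mid d)}{P(M_0 \mid d)}
  = \frac{P(d \mid M_1)}{P(d \mid M_0)} \cdot \frac{P(M_1)}{P(M_0)},
\qquad
\frac{P(M_1 \mid d)}{P(M_0 \mid d)} \approx 8
\;\Rightarrow\;
P(M_1 \mid d) \approx \frac{8}{8 + 1} \approx 0.89 .
```

Under these assumptions the simpler model retains a posterior probability of roughly 0.11, which is consistent with the abstract's statement that the evidence is not yet strong or decisive.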
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach. PMID:26442771
Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models.
Musal, Muzaffer; Aktekin, Tevfik
2013-01-30
In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications. PMID:22807006
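For readers unfamiliar with the zero-inflated Poisson distribution underlying these models, the following is a minimal sketch of its probability mass function. The parameter names `lam` (Poisson rate) and `pi` (probability of a structural zero) are illustrative; in the paper the rates depend on covariates and spatial/random effects, which this sketch does not attempt to reproduce.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally (prob. pi) or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Hypothetical values: rate 2.0 with 30% structural zeros.
probs = [zip_pmf(k, 2.0, 0.3) for k in range(60)]
mean = sum(k * p for k, p in enumerate(probs))
print(probs[0], mean)  # mean equals (1 - pi) * lam
```

The extra mass at zero is what lets the model accommodate areas with no recorded deaths more often than a plain Poisson rate would predict.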
Bayesian Nonparametric Models for Multiway Data Analysis.
Xu, Zenglin; Yan, Feng; Qi, Yuan
2015-02-01
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition methods and blockmodels. PMID:26353255
Bayesian Model Selection with Network Based Diffusion Analysis.
Whalen, Andrew; Hoppitt, William J E
2016-01-01
A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe Akaike Information Criteria (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed. PMID:27092089
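The WAIC used for model selection in this abstract can be computed from a matrix of pointwise log-likelihoods evaluated at posterior draws. The sketch below uses the widely used variance-based penalty and the deviance scale (multiplication by -2); the function name and the array layout are assumptions, and this is not the authors' NBDA code.

```python
import numpy as np

def waic(loglik):
    """WAIC from loglik[s, i] = log p(y_i | theta_s); lower values are preferred."""
    n_draws = loglik.shape[0]
    m = loglik.max(axis=0)
    # Log pointwise predictive density: log of the posterior mean likelihood.
    lppd_i = m + np.log(np.exp(loglik - m).sum(axis=0)) - np.log(n_draws)
    # Effective number of parameters: posterior variance of the log-likelihood.
    p_waic_i = loglik.var(axis=0, ddof=1)
    return -2.0 * (lppd_i.sum() - p_waic_i.sum())

# With no posterior uncertainty the penalty vanishes and WAIC is -2 * log-likelihood.
constant = np.full((5, 4), -1.2)
print(waic(constant))
```

Competing models of social transmission would each supply their own log-likelihood matrix on the same data, and the model with the lowest WAIC would be preferred.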
On Bayesian estimation of marginal structural models.
Saarela, Olli; Stephens, David A; Moodie, Erica E M; Klein, Marina B
2015-06-01
The purpose of inverse probability of treatment (IPT) weighting in estimation of marginal treatment effects is to construct a pseudo-population without imbalances in measured covariates, thus removing the effects of confounding and informative censoring when performing inference. In this article, we formalize the notion of such a pseudo-population as a data generating mechanism with particular characteristics, and show that this leads to a natural Bayesian interpretation of IPT weighted estimation. Using this interpretation, we are able to propose the first fully Bayesian procedure for estimating parameters of marginal structural models using an IPT weighting. Our approach suggests that the weights should be derived from the posterior predictive treatment assignment and censoring probabilities, answering the question of whether and how the uncertainty in the estimation of the weights should be incorporated in Bayesian inference of marginal treatment effects. The proposed approach is compared to existing methods in simulated data, and applied to an analysis of the Canadian Co-infection Cohort. PMID:25677103
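The pseudo-population idea can be seen in a small numerical sketch. It uses the classic plug-in IPT weights (the reciprocal of the estimated probability of the treatment actually received), not the fully Bayesian procedure proposed in the paper; the function name and the eight-subject toy data set are hypothetical, and every covariate stratum is assumed to contain both treated and untreated subjects.

```python
import numpy as np

def ipt_weights(a, x):
    """Plug-in IPT weights 1 / P(A = a_i | X = x_i) for binary a and discrete x."""
    a = np.asarray(a)
    x = np.asarray(x)
    w = np.empty(len(a), dtype=float)
    for level in np.unique(x):
        idx = x == level
        p_treat = a[idx].mean()  # estimated P(A = 1 | X = level)
        w[idx] = np.where(a[idx] == 1, 1.0 / p_treat, 1.0 / (1.0 - p_treat))
    return w

# Toy data: treatment is more likely when x = 1, so x is imbalanced across arms.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
a = np.array([1, 0, 0, 0, 1, 1, 1, 0])
w = ipt_weights(a, x)
treated = a == 1
print(np.average(x[treated], weights=w[treated]),
      np.average(x[~treated], weights=w[~treated]))
```

After weighting, the proportion with x = 1 is the same in both treatment arms, which is the balance property the pseudo-population is meant to have.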
Bayesian Kinematic Finite Fault Source Models (Invited)
NASA Astrophysics Data System (ADS)
Minson, S. E.; Simons, M.; Beck, J. L.
2010-12-01
Finite fault earthquake source models are inherently under-determined: there is no unique solution to the inverse problem of determining the rupture history at depth as a function of time and space when our data are only limited observations at the Earth's surface. Traditional inverse techniques rely on model constraints and regularization to generate one model from the possibly broad space of all possible solutions. However, Bayesian methods allow us to determine the ensemble of all possible source models which are consistent with the data and our a priori assumptions about the physics of the earthquake source. Until now, Bayesian techniques have been of limited utility because they are computationally intractable for problems with as many free parameters as kinematic finite fault models. We have developed a methodology called Cascading Adaptive Tempered Metropolis In Parallel (CATMIP) which allows us to sample very high-dimensional problems in a parallel computing framework. The CATMIP algorithm combines elements of simulated annealing and genetic algorithms with the Metropolis algorithm to dynamically optimize the algorithm's efficiency as it runs. We will present synthetic performance tests of finite fault models made with this methodology as well as a kinematic source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. This earthquake was well recorded by multiple ascending and descending interferograms and a network of high-rate GPS stations whose records can be used as near-field seismograms.
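The Metropolis algorithm that CATMIP builds on can be sketched in a few lines. This is only the basic random-walk Metropolis step on a one-dimensional toy target (a standard normal, chosen here for illustration); it contains none of the tempering, resampling, or parallel features that the abstract attributes to CATMIP, and all names are hypothetical.

```python
import numpy as np

def metropolis(log_target, x0, n_steps, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional log density."""
    x = x0
    lp = log_target(x)
    samples = np.empty(n_steps)
    for t in range(n_steps):
        proposal = x + step * rng.standard_normal()
        lp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        samples[t] = x  # a rejected proposal repeats the current state
    return samples

rng = np.random.default_rng(0)
samples = metropolis(lambda v: -0.5 * v**2, 0.0, 20000, 1.0, rng)
kept = samples[2000:]  # discard an initial burn-in portion
print(kept.mean(), kept.var())
```

The result is an ensemble of samples rather than a single best-fit value, which is the sense in which Bayesian sampling characterizes all models consistent with the data and the prior assumptions.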
A Bayesian Shrinkage Approach for AMMI Models.
da Silva, Carlos Pereira; de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves; Balestre, Marcio
2015-01-01
Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, and resulted in models being selected that were similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and cross validation based on leave-one-out. This characteristic allowed more parsimonious models to be chosen and more GEI pattern retained on the first two components. The resulting model chosen by posterior distribution of singular value was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible interval for AMMI biplot plus the choice of AMMI model based on direct posterior
Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum
2011-01-01
Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…
A Nonparametric Bayesian Model for Nested Clustering.
Lee, Juhee; Müller, Peter; Zhu, Yitan; Ji, Yuan
2016-01-01
We propose a nonparametric Bayesian model for clustering where clusters of experimental units are determined by a shared pattern of clustering another set of experimental units. The proposed model is motivated by the analysis of protein activation data, where we cluster proteins such that all proteins in one cluster give rise to the same clustering of patients. That is, we define clusters of proteins by the way that patients group with respect to the corresponding protein activations. This is in contrast to (almost) all currently available models that use shared parameters in the sampling model to define clusters. This includes in particular model based clustering, Dirichlet process mixtures, product partition models, and more. We show results for two typical biostatistical inference problems that give rise to clustering. PMID:26519174
Bayesian nonparametric models for ranked set sampling.
Gemayel, Nader; Stasny, Elizabeth A; Wolfe, Douglas A
2015-04-01
Ranked set sampling (RSS) is a data collection technique that combines measurement with judgment ranking for statistical inference. This paper lays out a formal and natural Bayesian framework for RSS that is analogous to its frequentist justification, and that does not require the assumption of perfect ranking or use of any imperfect ranking models. Prior beliefs about the judgment order statistic distributions and their interdependence are embodied by a nonparametric prior distribution. Posterior inference is carried out by means of Markov chain Monte Carlo techniques, and yields estimators of the judgment order statistic distributions (and of functionals of those distributions). PMID:25326663
Bayesian POT modeling for historical data
NASA Astrophysics Data System (ADS)
Parent, Eric; Bernier, Jacques
2003-04-01
When designing hydraulic structures, civil engineers have to evaluate design floods, i.e. events generally much rarer than the ones that have already been systematically recorded. Extrapolating towards extreme value events while taking advantage of further information, such as historical data, has been an early concern among hydrologists. Most methods described in the hydrological literature are designed from a frequentist interpretation of probabilities, although such probabilities are commonly interpreted as subjective decisional bets by the end user. This paper adopts a Bayesian setting to deal with the classical Poisson-Pareto peak over threshold (POT) model when a sample of historical data is available. Direct probabilistic statements can be made about the unknown parameters, thus improving communication with decision makers. On the Garonne case study, we point out that twelve historical events, however imprecise they might be, greatly reduce uncertainty. The 90% credible interval for the 1000 year flood becomes 40% smaller when taking into account historical data. Any kind of uncertainty (model uncertainty, imprecise range for historical events, missing data) can be incorporated into the decision analysis. Tractable and versatile data augmentation algorithms are implemented with Markov chain Monte Carlo tools. Advantage is taken of a semi-conjugate prior, flexible enough to elicit expert knowledge about extreme behavior of the river flows. The data augmentation algorithm makes it possible to deal with imprecise historical data in the POT model. A direct hydrological meaning is given to the latent variables, which are the key Bayesian tool to model unobserved past floods in the historical series.
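As an illustration of the Poisson-Pareto POT model named in this abstract, the sketch below computes a T-year return level from an exceedance rate and generalized Pareto parameters. It uses the textbook return-level formula; the function name, argument names, and all numeric values are hypothetical and are not taken from the Garonne study.

```python
import math


def pot_return_level(threshold, scale, shape, rate_per_year, return_period_years):
    """Level exceeded on average once every `return_period_years` years.

    Assumes exceedances of `threshold` arrive as a Poisson process with
    `rate_per_year` events per year and that the excesses follow a
    generalized Pareto distribution with the given `scale` and `shape`.
    """
    expected_exceedances = rate_per_year * return_period_years
    if expected_exceedances <= 0 or scale <= 0:
        raise ValueError("rate, return period and scale must be positive")
    if abs(shape) < 1e-12:
        # Limit shape -> 0: exponential excesses.
        return threshold + scale * math.log(expected_exceedances)
    return threshold + (scale / shape) * (expected_exceedances ** shape - 1.0)


# Hypothetical values: threshold 2500 m3/s, two exceedances per year on average.
level_1000 = pot_return_level(2500.0, 1000.0, 0.1, 2.0, 1000.0)
print(round(level_1000, 1))
```

In a Bayesian analysis the same function could be applied to every posterior draw of the rate, scale, and shape, which gives a posterior sample of the return level from which a credible interval can be read off.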
Model feedback in Bayesian propensity score estimation.
Zigler, Corwin M; Watts, Krista; Yeh, Robert W; Wang, Yun; Coull, Brent A; Dominici, Francesca
2013-03-01
Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: (1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and (2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is "feedback" between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting versus carotid endarterectomy among 123,286 Medicare beneficiaries hospitalized for stroke in 2006 and 2007. PMID:23379793
Experience With Bayesian Image Based Surface Modeling
NASA Technical Reports Server (NTRS)
Stutz, John C.
2005-01-01
Bayesian surface modeling from images requires modeling both the surface and the image generation process, in order to optimize the models by comparing actual and generated images. Thus it differs greatly, both conceptually and in computational difficulty, from conventional stereo surface recovery techniques. But it offers the possibility of using any number of images, taken under quite different conditions, and by different instruments that provide independent and often complementary information, to generate a single surface model that fuses all available information. I describe an implemented system, with a brief introduction to the underlying mathematical models and the compromises made for computational efficiency. I describe successes and failures achieved on actual imagery, where we went wrong and what we did right, and how our approach could be improved. Lastly I discuss how the same approach can be extended to distinct types of instruments, to achieve true sensor fusion.
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn in an unsupervised manner the behavior of individuals and of the crowd as a single entity, and explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals is described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states, yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance to existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Krøigård, Thomas; Gaist, David; Otto, Marit; Højlund, Dorthe; Selmar, Peter E; Sindrup, Søren H
2014-08-01
The reproducibility of variables commonly included in studies of peripheral nerve conduction in healthy individuals has not previously been analyzed using a random effects regression model. We examined the temporal changes and variability of standard nerve conduction measures in the leg. Peroneal nerve distal motor latency, motor conduction velocity, and compound motor action potential amplitude; sural nerve sensory action potential amplitude and sensory conduction velocity; and tibial nerve minimal F-wave latency were examined in 51 healthy subjects, aged 40 to 67 years. They were reexamined after 2 and 26 weeks. There was no change in the variables except for a minor decrease in sural nerve sensory action potential amplitude and a minor increase in tibial nerve minimal F-wave latency. Reproducibility was best for peroneal nerve distal motor latency and motor conduction velocity, sural nerve sensory conduction velocity, and tibial nerve minimal F-wave latency. Between-subject variability was greater than within-subject variability. Sample sizes ranging from 21 to 128 would be required to show changes twice the magnitude of the spontaneous changes observed in this study. Nerve conduction studies have a high reproducibility, and variables are mainly unaltered during 6 months. This study provides a solid basis for the planning of future clinical trials assessing changes in nerve conduction. PMID:25083853
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology. PMID:23687472
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have proposed various weighting methods (Korn, & Graubard, 2003; Pfeffermann, Skinner, Holmes,…
Merging Digital Surface Models Implementing Bayesian Approaches
NASA Astrophysics Data System (ADS)
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian approach. The data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). A Bayesian approach is deemed preferable when the data obtained from the sensors are limited and obtaining many measurements is difficult or very costly; the lack of data can then be addressed by introducing a priori estimations. To infer the prior data, it is assumed that the roofs of the buildings are smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field and are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow, which contains different kinds of buildings, such as flat-roofed and hipped-roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was able to improve the quality of the DSMs and to improve some characteristics, such as the roof surfaces, which consequently led to better representations. In addition, the developed model has been compared with the well-established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied to DSMs that were derived from satellite imagery, it can be applied to DSMs from any other source.
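One simple way to picture Bayesian merging of height estimates is a per-cell conjugate normal update, sketched below. This is an illustrative assumption, not the model of the paper: the abstract does not state the likelihood, the error variances are treated as known, and the entropy-based smoothness prior is replaced by a plain Gaussian prior. The function name and the numbers are hypothetical.

```python
def fuse_gaussian(prior_mean, prior_var, observations):
    """Posterior mean and variance of one DSM cell height.

    `observations` is an iterable of (height, variance) pairs, one per
    source DSM. Gaussian errors with known variances are assumed, so the
    posterior is a precision-weighted average of prior and observations.
    """
    precision = 1.0 / prior_var
    weighted_sum = prior_mean / prior_var
    for height, variance in observations:
        precision += 1.0 / variance
        weighted_sum += height / variance
    posterior_var = 1.0 / precision
    return weighted_sum * posterior_var, posterior_var


# Hypothetical cell: vague prior at 10 m, two sensor-derived heights.
mean, var = fuse_gaussian(10.0, 4.0, [(12.0, 1.0), (11.0, 1.0)])
print(round(mean, 3), round(var, 3))
```

The posterior variance is always smaller than the smallest input variance, which is the sense in which merging several limited sources can improve on each one alone.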
Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures.
Orbanz, Peter; Roy, Daniel M
2015-02-01
The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays. PMID:26353253
Model parameter updating using Bayesian networks
Treml, C. A.; Ross, Timothy J.
2004-01-01
This paper outlines a model parameter updating technique for a new method of model validation using a modified model reference adaptive control (MRAC) framework with Bayesian Networks (BNs). The model parameter updating within this method is generic in the sense that the model/simulation to be validated is treated as a black box. It must have updateable parameters to which its outputs are sensitive, and those outputs must have metrics that can be compared to that of the model reference, i.e., experimental data. Furthermore, no assumptions are made about the statistics of the model parameter uncertainty, only upper and lower bounds need to be specified. This method is designed for situations where a model is not intended to predict a complete point-by-point time domain description of the item/system behavior; rather, there are specific points, features, or events of interest that need to be predicted. These specific points are compared to the model reference derived from actual experimental data. The logic for updating the model parameters to match the model reference is formed via a BN. The nodes of this BN consist of updateable model input parameters and the specific output values or features of interest. Each time the model is executed, the input/output pairs are used to adapt the conditional probabilities of the BN. Each iteration further refines the inferred model parameters to produce the desired model output. After parameter updating is complete and model inputs are inferred, reliabilities for the model output are supplied. Finally, this method is applied to a simulation of a resonance control cooling system for a prototype coupled cavity linac. The results are compared to experimental data.
A Bayesian model for cluster detection.
Wakefield, Jonathan; Kim, Albert
2013-09-01
The detection of areas in which the risk of a particular disease is significantly elevated, leading to an excess of cases, is an important enterprise in spatial epidemiology. Various frequentist approaches have been suggested for the detection of "clusters" within a hypothesis testing framework. Unfortunately, these suffer from a number of drawbacks including the difficulty in specifying a p-value threshold at which to call significance, the inherent multiplicity problem, and the possibility of multiple clusters. In this paper, we suggest a Bayesian approach to detecting "areas of clustering" in which the study region is partitioned into, possibly multiple, "zones" within which the risk is either at a null, or non-null, level. Computation is carried out using Markov chain Monte Carlo, tuned to the model that we develop. The method is applied to leukemia data in upstate New York. PMID:23476026
Bayesian model selection for LISA pathfinder
NASA Astrophysics Data System (ADS)
Karnesis, Nikolaos; Nofrarias, Miquel; Sopuerta, Carlos F.; Gibert, Ferran; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Ferraioli, Luigi; Hewitson, Martin; Hueller, Mauro; Korsakova, Natalia; McNamara, Paul W.; Plagnol, Eric; Vitale, Stefano
2014-03-01
The main goal of the LISA Pathfinder (LPF) mission is to fully characterize the acceleration noise models and to test key technologies for future space-based gravitational-wave observatories similar to the eLISA concept. The data analysis team has developed complex three-dimensional models of the LISA Technology Package (LTP) experiment onboard the LPF. These models are used for simulations, but, more importantly, they will be used for parameter estimation purposes during flight operations. One of the tasks of the data analysis team is to identify the physical effects that contribute significantly to the properties of the instrument noise. A way of approaching this problem is to recover the essential parameters of a LTP model fitting the data. Thus, we want to define the simplest model that efficiently explains the observations. To do so, adopting a Bayesian framework, one has to estimate the so-called Bayes factor between two competing models. In our analysis, we use three main different methods to estimate it: the reversible jump Markov chain Monte Carlo method, the Schwarz criterion, and the Laplace approximation. They are applied to simulated LPF experiments in which the most probable LTP model that explains the observations is recovered. The same type of analysis presented in this paper is expected to be followed during flight operations. Moreover, the correlation of the output of the aforementioned methods with the design of the experiment is explored.
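Of the three estimators listed in this abstract, the Schwarz criterion is the simplest to sketch: the log Bayes factor between two models is approximated by half the difference of their BIC values. The code below is a generic illustration with made-up log-likelihoods and parameter counts; it is not the LTP analysis and the function names are hypothetical.

```python
import math


def bic(max_log_likelihood, n_params, n_obs):
    """Schwarz (Bayesian) information criterion; smaller is better."""
    return -2.0 * max_log_likelihood + n_params * math.log(n_obs)


def log_bayes_factor_schwarz(ll_a, k_a, ll_b, k_b, n_obs):
    """Approximate log Bayes factor of model A over model B.

    Uses log B_AB ~= -(BIC_A - BIC_B) / 2, which is a large-sample
    approximation and ignores the prior.
    """
    return -0.5 * (bic(ll_a, k_a, n_obs) - bic(ll_b, k_b, n_obs))


# Hypothetical fit: model B fits slightly better but uses three more parameters.
log_bf = log_bayes_factor_schwarz(ll_a=-100.0, k_a=2, ll_b=-98.0, k_b=5, n_obs=100)
print(log_bf > 0)  # True means the simpler model A is favored
```

Because the penalty grows with the logarithm of the number of observations, the approximation favors the simpler model unless the extra parameters buy a clear gain in likelihood.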
Modeling residual hydrologic errors with Bayesian inference
NASA Astrophysics Data System (ADS)
Smith, Tyler; Marshall, Lucy; Sharma, Ashish
2015-09-01
Hydrologic modelers are confronted with the challenge of producing estimates of the uncertainty associated with model predictions across an array of catchments and hydrologic flow regimes. Formal Bayesian approaches are commonly employed for parameter calibration and uncertainty analysis, but are often criticized for making strong assumptions about the nature of model residuals via the likelihood function that may not be well satisfied (or even checked). This technical note outlines a residual error model (likelihood function) specification framework that aims to provide guidance for the application of more appropriate residual error models through a nested approach that is both flexible and extendible. The framework synthesizes many previously employed residual error models and has been applied to four synthetic datasets (of differing error structure) and a real dataset from the Black River catchment in Queensland, Australia. Each residual error model was investigated and assessed under a top-down approach focused on its ability to properly characterize the errors. The results of these test applications indicate that a multifaceted assessment strategy is necessary to determine the adequacy of an individual likelihood function.
Bayesian Student Modeling and the Problem of Parameter Specification.
ERIC Educational Resources Information Center
Millan, Eva; Agosta, John Mark; Perez de la Cruz, Jose Luis
2001-01-01
Discusses intelligent tutoring systems and the application of Bayesian networks to student modeling. Considers reasons for not using Bayesian networks, including the computational complexity of the algorithms and the difficulty of knowledge acquisition, and proposes an approach to simplify knowledge acquisition that applies causal independence to…
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
Implementing Relevance Feedback in the Bayesian Network Retrieval Model.
ERIC Educational Resources Information Center
de Campos, Luis M.; Fernandez-Luna, Juan M.; Huete, Juan F.
2003-01-01
Discussion of relevance feedback in information retrieval focuses on a proposal for the Bayesian Network Retrieval Model. Bases the proposal on the propagation of partial evidences in the Bayesian network, representing new information obtained from the user's relevance judgments to compute the posterior relevance probabilities of the documents…
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
Bayesian analysis of the backreaction models
Kurek, Aleksandra; Bolejko, Krzysztof; Szydlowski, Marek
2010-03-15
We present a Bayesian analysis of four different types of backreaction models, which are based on the Buchert equations. In this approach, one considers a solution to the Einstein equations for a general matter distribution and then an average of various observable quantities is taken. Such an approach became of considerable interest when it was shown that it could lead to agreement with observations without resorting to dark energy. In this paper we compare the ΛCDM model and the backreaction models with type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, and find that the former is favored. However, the tested models were based on some particular assumptions about the relation between the average spatial curvature and the backreaction, as well as the relation between the curvature and curvature index. In this paper we modified the latter assumption, leaving the former unchanged. We find that, by varying the relation between the curvature and curvature index, we can obtain a better fit. Therefore, some further work is still needed; in particular, the relation between the backreaction and the curvature should be revisited in order to fully determine the feasibility of the backreaction models to mimic dark energy.
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
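The building block behind this abstract can be illustrated with a standard identity: if V follows a Gamma distribution with shape 3/2 and rate 1/2, and X given V is uniform on (mu - sigma*sqrt(V), mu + sigma*sqrt(V)), then X is normal with mean mu and standard deviation sigma. The sketch below checks this by simulation; it shows the normal base case only and is not necessarily the construction used by the authors for non-normal data. Function names are hypothetical.

```python
import random


def sample_normal_via_uniform_mixture(mu, sigma, rng):
    """Draw one N(mu, sigma^2) variate as a scale mixture of uniforms."""
    # random.gammavariate takes (shape, scale); rate 1/2 means scale 2.
    v = rng.gammavariate(1.5, 2.0)
    half_width = sigma * v ** 0.5
    return rng.uniform(mu - half_width, mu + half_width)


def sample_moments(n, mu=0.0, sigma=1.0, seed=0):
    """Sample mean and unbiased sample variance of n mixture draws."""
    rng = random.Random(seed)
    draws = [sample_normal_via_uniform_mixture(mu, sigma, rng) for _ in range(n)]
    mean = sum(draws) / n
    variance = sum((x - mean) ** 2 for x in draws) / (n - 1)
    return mean, variance


mean, variance = sample_moments(20000)
print(round(mean, 2), round(variance, 2))
```

Replacing the Gamma mixing distribution with another one changes the marginal distribution of X, which is what makes the representation useful for heavy-tailed or skewed data.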
Stochastic model updating utilizing Bayesian approach and Gaussian process model
NASA Astrophysics Data System (ADS)
Wan, Hua-Ping; Ren, Wei-Xin
2016-03-01
Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from response variability. SMU for parameter uncertainty quantification is a problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Inverse problems solved with optimization usually raise issues of gradient computation, ill-conditioning, and non-uniqueness. Moreover, the uncertainty present in the response makes the inverse problem more complicated. In this study, a Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of the Bayesian approach for the IUQ problem is that it solves the problem in a straightforward manner, which enables it to avoid the previous issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), the Bayesian approach is still computationally expensive, since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee convergence. Herein we reduce computational cost in two ways. First, the fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. Second, an advanced MCMC method using the delayed rejection adaptive Metropolis (DRAM) algorithm, which combines a local adaptive strategy with a global adaptive strategy, is employed for Bayesian inference. In addition, we propose the use of the powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from the calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.
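To show why MCMC needs many model runs, here is a minimal random-walk Metropolis sampler. It is plain Metropolis, not the DRAM algorithm used in the paper (no delayed rejection, no adaptation), and the one-parameter Gaussian log-posterior is a hypothetical stand-in for an expensive FEM or Gaussian process surrogate evaluation.

```python
import math
import random


def metropolis(log_post, start, step, n_iter, seed=0):
    """Random-walk Metropolis chain for a scalar parameter."""
    rng = random.Random(seed)
    x = start
    lp = log_post(x)  # one model evaluation
    chain = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_post(proposal)  # one model evaluation per iteration
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        chain.append(x)
    return chain


def toy_log_post(theta):
    """Hypothetical log-posterior: normal with mean 2.0 and sd 0.5."""
    return -0.5 * ((theta - 2.0) / 0.5) ** 2


chain = metropolis(toy_log_post, start=0.0, step=1.0, n_iter=20000)
kept = chain[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 1))
```

Every iteration calls `log_post` once, so a chain of this length costs about twenty thousand model evaluations. That cost is what motivates replacing the full model with a fast surrogate.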
Otis, D.L.; White, Gary C.
2004-01-01
Increased population survival rate after an episode of seasonal exploitation is considered a type of compensatory population response. Lack of an increase is interpreted as evidence that exploitation results in added annual mortality in the population. Despite its importance to management of exploited species, there are limited statistical techniques for comparing relative support for these two alternative models. For exploited bird species, the most common technique is to use a fixed effect, deterministic ultrastructure model incorporated into band recovery models to estimate the relationship between harvest and survival rate. We present a new likelihood-based technique within a framework that assumes that survival and harvest are random effects that covary through time. We conducted a Monte Carlo simulation study under this framework to evaluate the performance of these two techniques. The ultrastructure models performed poorly in all simulated scenarios, due mainly to pathological distributional properties. The random effects estimators and their associated estimators of precision had relatively small negative bias under most scenarios, and profile likelihood intervals achieved nominal coverage. We suggest that the random effects estimation approach has many advantages compared to the ultrastructure models, and that evaluation of robustness and generalization to more complex population structures are topics for additional research. © 2004 Museu de Ciències Naturals.
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Bayesian analysis of a disability model for lung cancer survival.
Armero, C; Cabras, S; Castellanos, M E; Perra, S; Quirós, A; Oruezábal, M J; Sánchez-Rubio, J
2016-02-01
Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimum informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions. PMID:22767866
Bayesian Test of Significance for Conditional Independence: The Multinomial Model
NASA Astrophysics Data System (ADS)
de Morais Andrade, Pablo; Stern, Julio; de Bragança Pereira, Carlos
2014-03-01
Conditional independence tests (CI tests) have received special attention lately in Machine Learning and Computational Intelligence related literature as an important indicator of the relationship among the variables used by their models. In the field of Probabilistic Graphical Models (PGM), which includes Bayesian Networks (BN) models, CI tests are especially important for the task of learning the PGM structure from data. In this paper, we propose the Full Bayesian Significance Test (FBST) for tests of conditional independence for discrete datasets. FBST is a powerful Bayesian test for precise hypotheses, as an alternative to frequentist significance tests (characterized by the calculation of the p-value).
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
NASA Astrophysics Data System (ADS)
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particular understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A. S., and F. T.-C. Tsai (2014), Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm. Journal of Hydrology, 517, 105-119, doi:10.1016/j.jhydrol.2014.05.027.
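The quantities named in the abstract above (posterior model probabilities, model prediction, within-model variance, between-model variance, total model variance) are related through the law of total variance. The sketch below shows that relationship for a single, flat level of model averaging only; it does not build the hierarchical BMA tree, and the function name `bma_summary` and all input numbers are hypothetical.

```python
def bma_summary(model_probs, model_means, model_variances):
    """Combine per-model predictions under Bayesian model averaging.

    model_probs:     posterior model probabilities (renormalized here)
    model_means:     each model's predicted mean
    model_variances: each model's own predictive variance
    """
    total_prob = sum(model_probs)
    weights = [p / total_prob for p in model_probs]

    # Averaged prediction: probability-weighted mean of the model means.
    mean = sum(w * m for w, m in zip(weights, model_means))
    # Within-model variance: weighted average of each model's variance.
    within = sum(w * v for w, v in zip(weights, model_variances))
    # Between-model variance: weighted spread of model means around the average.
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, model_means))

    return {"mean": mean, "within": within,
            "between": between, "total": within + between}


# Three made-up candidate models.
summary = bma_summary(model_probs=[0.5, 0.3, 0.2],
                      model_means=[10.0, 12.0, 15.0],
                      model_variances=[1.0, 2.0, 1.5])
print(summary)
```

A large between-model term relative to the within-model term indicates that the choice among candidate models, rather than uncertainty inside any one model, dominates the total variance.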
Integrative variable selection via Bayesian model uncertainty.
Quintana, M A; Conti, D V
2013-12-10
We are interested in developing integrative approaches for variable selection problems that incorporate external knowledge on a set of predictors of interest. In particular, we have developed an integrative Bayesian model uncertainty (iBMU) method, which formally incorporates multiple sources of data via a second-stage probit model on the probability that any predictor is associated with the outcome of interest. Using simulations, we demonstrate that iBMU leads to an increase in power to detect true marginal associations over more commonly used variable selection techniques, such as least absolute shrinkage and selection operator and elastic net. In addition, iBMU leads to a more efficient model search algorithm over the basic BMU method even when the predictor-level covariates are only modestly informative. The increase in power and efficiency of our method becomes more substantial as the predictor-level covariates become more informative. Finally, we demonstrate the power and flexibility of iBMU for integrating both gene structure and functional biomarker information into a candidate gene study investigating over 50 genes in the brain reward system and their role with smoking cessation from the Pharmacogenetics of Nicotine Addiction and Treatment Consortium. PMID:23824835
Entropic Priors and Bayesian Model Selection
NASA Astrophysics Data System (ADS)
Brewer, Brendon J.; Francis, Matthew J.
2009-12-01
We demonstrate that the principle of maximum relative entropy (ME), used judiciously, can ease the specification of priors in model selection problems. The resulting effect is that models that make sharp predictions are disfavoured, weakening the usual Bayesian "Occam's Razor." This is illustrated with a simple example involving what Jaynes called a "sure thing" hypothesis. Jaynes' resolution of the situation involved introducing a large number of alternative "sure thing" hypotheses that were possible before we observed the data. However, in more complex situations, it may not be possible to explicitly enumerate large numbers of alternatives. The entropic priors formalism produces the desired result without modifying the hypothesis space or requiring explicit enumeration of alternatives; all that is required is a good model for the prior predictive distribution for the data. This idea is illustrated with a simple rigged-lottery example, and we outline how this idea may help to resolve a recent debate amongst cosmologists: is dark energy a cosmological constant, or has it evolved with time in some way? And how shall we decide, when the data are in?
Modeling Grade IV Gas Emboli using a Limited Failure Population Model with Random Effects
NASA Technical Reports Server (NTRS)
Thompson, Laura A.; Conkin, Johnny; Chhikara, Raj S.; Powell, Michael R.
2002-01-01
Venous gas emboli (VGE) (gas bubbles in venous blood) are associated with an increased risk of decompression sickness (DCS) in hypobaric environments. A high grade of VGE can be a precursor to serious DCS. In this paper, we model time to Grade IV VGE considering a subset of individuals assumed to be immune from experiencing VGE. Our data contain monitoring test results from subjects undergoing up to 13 denitrogenation test procedures prior to exposure to a hypobaric environment. The onset time of Grade IV VGE is recorded as contained within certain time intervals. We fit a parametric (lognormal) mixture survival model to the interval- and right-censored data to account for the possibility of a subset of "cured" individuals who are immune to the event. Our model contains random subject effects to account for correlations between repeated measurements on a single individual. Model assessments and cross-validation indicate that this limited failure population mixture model is an improvement over a model that does not account for the potential of a fraction of cured individuals. We also evaluated some alternative mixture models. Predictions from the best fitted mixture model indicate that the actual process is reasonably approximated by a limited failure population model.
Two-Stage Bayesian Model Averaging in Endogenous Variable Models.
Lenkoski, Alex; Eicher, Theo S; Raftery, Adrian E
2014-01-01
Economic modeling in the presence of endogeneity is subject to model uncertainty at both the instrument and covariate level. We propose a Two-Stage Bayesian Model Averaging (2SBMA) methodology that extends the Two-Stage Least Squares (2SLS) estimator. By constructing a Two-Stage Unit Information Prior in the endogenous variable model, we are able to efficiently combine established methods for addressing model uncertainty in regression models with the classic technique of 2SLS. To assess the validity of instruments in the 2SBMA context, we develop Bayesian tests of the identification restriction that are based on model averaged posterior predictive p-values. A simulation study showed that 2SBMA has the ability to recover structure in both the instrument and covariate set, and substantially improves the sharpness of resulting coefficient estimates in comparison to 2SLS using the full specification in an automatic fashion. Due to the increased parsimony of the 2SBMA estimate, the Bayesian Sargan test had a power of 50 percent in detecting a violation of the exogeneity assumption, while the method based on 2SLS using the full specification had negligible power. We apply our approach to the problem of development accounting, and find support not only for institutions, but also for geography and integration as development determinants, once both model uncertainty and endogeneity have been jointly addressed. PMID:24223471
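For readers unfamiliar with the Two-Stage Least Squares estimator that the abstract above extends, the following sketch shows plain 2SLS with one instrument and one endogenous regressor on simulated data. It does not implement 2SBMA, model averaging, or the Bayesian Sargan test; the data-generating process, coefficient values, and function names are assumptions made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_slope_intercept(x, y):
    # Ordinary least squares for a single regressor with an intercept.
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def two_stage_least_squares(z, x, y):
    # Stage 1: project the endogenous regressor x on the instrument z.
    gamma, const = ols_slope_intercept(z, x)
    x_hat = [const + gamma * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    beta, _ = ols_slope_intercept(x_hat, y)
    return beta


# Simulated data: u is an unobserved confounder, so x is endogenous.
rng = random.Random(1)
z, x, y = [], [], []
for _ in range(5000):
    zi, ui = rng.gauss(0, 1), rng.gauss(0, 1)
    xi = zi + ui + rng.gauss(0, 1)
    yi = 2.0 * xi + 3.0 * ui + rng.gauss(0, 1)  # true effect of x is 2.0
    z.append(zi)
    x.append(xi)
    y.append(yi)

ols_beta, _ = ols_slope_intercept(x, y)       # biased upward by u
tsls_beta = two_stage_least_squares(z, x, y)  # close to the true effect
print(round(ols_beta, 2), round(tsls_beta, 2))
```

In this setup the naive regression absorbs the confounder and overstates the effect, while the instrumented estimate recovers a value near the true coefficient.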
Calibrating Bayesian Network Representations of Social-Behavioral Models
Whitney, Paul D.; Walsh, Stephen J.
2010-04-08
While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research – and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empirical comparison with data taken from the Minorities at Risk Organizational Behaviors database.
BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)
We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...
Multivariate Bayesian Models of Extreme Rainfall
NASA Astrophysics Data System (ADS)
Rahill-Marier, B.; Devineni, N.; Lall, U.; Farnham, D.
2013-12-01
Accounting for spatial heterogeneity in extreme rainfall has important ramifications in hydrological design and climate models alike. Traditional methods, including areal reduction factors and kriging, are sensitive to catchment shape assumptions and return periods, and do not explicitly model spatial dependence between data points. More recent spatially dense rainfall simulators depend on newer data sources such as radar and may struggle to reproduce extremes because of physical assumptions in the model and short historical records. Rain gauges offer the longest historical record, key when considering rainfall extremes and changes over time, and particularly relevant in today's environment of designing for climate change. In this paper we propose a probabilistic approach to accounting for spatial dependence using the lengthy but spatially disparate hourly rainfall network in the greater New York City area. We build a hierarchical Bayesian model allowing extremes at one station to co-vary with concurrent rainfall fields occurring at other stations. Subsequently we pool across the extreme rainfall fields of all stations, and demonstrate that the expected catchment-wide events are significantly lower when considering spatial fields instead of maxima-only fields. We additionally demonstrate the importance of using concurrent spatial fields, rather than annual maxima, in producing covariance matrices that describe true storm dynamics. This approach is also unique in that it considers short duration storms - from one hour to twenty-four hours - rather than the daily values typically derived from rainfall gauges. The same methodology can be extended to include the radar fields available in the past decade. The hierarchical multilevel approach lends itself easily to integration of long-record parameters and short-record parameters at a station or regional level. In addition climate covariates can be introduced to support the relationship of spatial covariance with
A Bayesian observer model constrained by efficient coding can explain 'anti-Bayesian' percepts.
Wei, Xue-Xin; Stocker, Alan A
2015-10-01
Bayesian observer models provide a principled account of the fact that our perception of the world rarely matches physical reality. The standard explanation is that our percepts are biased toward our prior beliefs. However, reported psychophysical data suggest that this view may be simplistic. We propose a new model formulation based on efficient coding that is fully specified for any given natural stimulus distribution. The model makes two new and seemingly anti-Bayesian predictions. First, it predicts that perception is often biased away from an observer's prior beliefs. Second, it predicts that stimulus uncertainty differentially affects perceptual bias depending on whether the uncertainty is induced by internal or external noise. We found that both model predictions match reported perceptual biases in perceived visual orientation and spatial frequency, and were able to explain data that have not been explained before. The model is general and should prove applicable to other perceptual variables and tasks. PMID:26343249
Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu
2016-01-01
False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts, a Fixed Effect Model (FEM) and a Random Effect Model (REM), and use them iteratively. The FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid over-fitting in the FEM, the associated markers are estimated in the REM by using them to define kinship. The P values of testing markers and the associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear in both the number of individuals and the number of markers. A dataset with half a million individuals and half a million markers can now be analyzed within three days. PMID:26828793
Evaluating Individualized Reading Programs: A Bayesian Model.
ERIC Educational Resources Information Center
Maxwell, Martha
Simple Bayesian approaches can be applied to answer specific questions in evaluating an individualized reading program. A small reading and study skills program located in the counseling center of a major research university collected and compiled data on student characteristics such as class, number of sessions attended, grade point average, and…
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Hu, Liangdong; Wang, Limin
2013-01-01
The Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct a Bayesian network from microarray data directly. Although many Bayesian network learning algorithms have been developed, their accuracy is low when they are applied to microarray data, because the databases used for learning contain too few microarray samples. In this paper, we propose a consensus Bayesian network, which is constructed by combining Bayesian networks from the relevant literature with Bayesian networks learned from microarray data. It would have a higher accuracy than Bayesian networks learned from a single database. In the experiment, we validated the Bayesian network combination algorithm on several classic machine learning databases and used the consensus Bayesian network to model the Escherichia coli ROS pathway. PMID:23457624
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. PMID:27179874
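The abstract above describes a posterior predictive distribution that accounts for both parameter uncertainty and residual variability. A minimal sketch of that idea follows, assuming a hypothetical one-parameter linear response and made-up posterior draws; it is not a ruminant nutrition model and no values are taken from the study.

```python
import random


def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def posterior_predictive_draws(slope_draws, x, residual_sd, seed=0):
    # For each posterior draw of the parameter, simulate one new
    # observation: model output plus residual noise.
    rng = random.Random(seed)
    return [b * x + rng.gauss(0.0, residual_sd) for b in slope_draws]


# Hypothetical posterior draws of a slope parameter (mean 1.5, sd 0.1),
# standing in for the output of an MCMC calibration.
rng = random.Random(42)
slope_draws = [rng.gauss(1.5, 0.1) for _ in range(20000)]

x_new = 2.0
parameter_only = [b * x_new for b in slope_draws]
predictive = posterior_predictive_draws(slope_draws, x_new, residual_sd=0.5)

print(round(variance(parameter_only), 3), round(variance(predictive), 3))
```

The predictive spread is wider than the spread from parameter uncertainty alone, because it adds the residual variance on top; reporting the full set of draws rather than one number is what conveys that extra information.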
Social Science and the Bayesian Probability Explanation Model
NASA Astrophysics Data System (ADS)
Yin, Jie; Zhao, Lei
2014-03-01
C. G. Hempel, one of the logical empiricists, built his probability explanation model on the empiricist view of probability. This model encountered many difficulties in scientific explanation, against which Hempel found it hard to mount a reasonable defense. Based on Bayesian probability theory, the Bayesian probability model provides an approach to explanation based on subjective probability, using the subjectivist view of probability. On the one hand, this probability model establishes the epistemological status of the subject in social science; on the other hand, it provides a feasible explanation model for social scientific explanation, which has important methodological significance.
Bayesian calibration of a flood inundation model using spatial data
NASA Astrophysics Data System (ADS)
Hall, Jim W.; Manning, Lucy J.; Hankin, Robin K. S.
2011-05-01
Bayesian theory of model calibration provides a coherent framework for distinguishing and encoding multiple sources of uncertainty in probabilistic predictions of flooding. This paper demonstrates the use of a Bayesian approach to computer model calibration, where the calibration data are in the form of spatial observations of flood extent. The Bayesian procedure involves generating posterior distributions of the flood model calibration parameters and observation error, as well as a Gaussian model inadequacy function, which represents the discrepancy between the best model predictions and reality. The approach is first illustrated with a simple didactic example and is then applied to a flood model of a reach of the river Thames in the UK. A predictive spatial distribution of flooding is generated for a flood of given severity.
Bayesian Estimation of the Logistic Positive Exponent IRT Model
ERIC Educational Resources Information Center
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Estimating tree height-diameter models with the Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating height-diameter models take the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has a distinctive advantage over the classical method in that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were each used to estimate six height-diameter models. Both the classical method and the Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used to compare the Bayesian method with informative priors against the Bayesian method with uninformative priors and the classical method. The results showed that the improvement in prediction accuracy with the Bayesian method led to narrower confidence bands of predicted values than for the classical method, and the credible bands of parameters with informative priors were also narrower than those with uninformative priors and the classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733
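The last sentence of the abstract above, using a posterior from one data set as the prior for the next, can be illustrated with the simplest conjugate case. The sketch below updates a normal prior on a mean with known noise standard deviation; it is not the Weibull height-diameter model, and the function name and all numbers (including the lists standing in for data1 and data2) are hypothetical.

```python
def update_normal_mean(prior_mean, prior_sd, data, noise_sd):
    """Posterior for a normal mean with known noise sd (conjugate update)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(data) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + sum(data) / noise_sd ** 2) / post_precision
    return post_mean, post_precision ** -0.5


noise_sd = 0.5
data1 = [14.8, 15.3, 15.1, 14.9]  # made-up first data set
data2 = [15.6, 15.2, 15.4]        # made-up second data set

# Vague prior, updated with data1.
mean1, sd1 = update_normal_mean(10.0, 10.0, data1, noise_sd)

# The posterior from data1 serves as an informative prior for data2.
informed_mean, informed_sd = update_normal_mean(mean1, sd1, data2, noise_sd)

# For comparison: the vague prior applied to data2 alone.
vague_mean, vague_sd = update_normal_mean(10.0, 10.0, data2, noise_sd)

# Updating in sequence matches a single update on the pooled data.
batch_mean, batch_sd = update_normal_mean(10.0, 10.0, data1 + data2, noise_sd)

print(round(informed_sd, 4), round(vague_sd, 4))
```

The informed posterior is narrower than the one obtained from the vague prior on the same data, which mirrors the narrower credible bands reported for informative priors.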
On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)
ERIC Educational Resources Information Center
Wills, Andy J.; Pothos, Emmanuel M.
2012-01-01
Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…
A Bayesian semiparametric model for bivariate sparse longitudinal data.
Das, Kiranmoy; Li, Runze; Sengupta, Subhajit; Wu, Rongling
2013-09-30
Mixed-effects models have recently become popular for analyzing sparse longitudinal data that arise naturally in biological, agricultural and biomedical studies. Traditional approaches assume independent residuals over time and explain the longitudinal dependence by random effects. However, when bivariate or multivariate traits are measured longitudinally, this fundamental assumption is likely to be violated because of intertrait dependence over time. We provide a more general framework where the dependence of the observations from the same subject over time is not assumed to be explained completely by the random effects of the model. We propose a novel, mixed model-based approach and estimate the error-covariance structure nonparametrically under a generalized linear model framework. We use penalized splines to model the general effect of time, and we consider a Dirichlet process mixture of normal prior for the random-effects distribution. We analyze blood pressure data from the Framingham Heart Study where body mass index, gender and time are treated as covariates. We compare our method with traditional methods including parametric modeling of the random effects and independent residual errors over time. We conduct extensive simulation studies to investigate the practical usefulness of the proposed method. The current approach is very helpful in analyzing bivariate irregular longitudinal traits. PMID:23553747
Bayesian non-parametrics and the probabilistic approach to modelling
Ghahramani, Zoubin
2013-01-01
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Bayesian Network Models for Local Dependence among Observable Outcome Variables
ERIC Educational Resources Information Center
Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli
2009-01-01
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…
On the Bayesian Nonparametric Generalization of IRT-Type Models
ERIC Educational Resources Information Center
San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel
2011-01-01
We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…
A General Bayesian Model for Testlets: Theory and Applications.
ERIC Educational Resources Information Center
Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard
2002-01-01
Proposes a modified version of commonly employed item response models in a fully Bayesian framework and obtains inferences under the model using Markov chain Monte Carlo techniques. Demonstrates use of the model in a series of simulations and with operational data from the North Carolina Test of Computer Skills and the Test of Spoken English…
Involving Stakeholders in Building Integrated Fisheries Models Using Bayesian Methods
NASA Astrophysics Data System (ADS)
Haapasaari, Päivi; Mäntyniemi, Samu; Kuikka, Sakari
2013-06-01
A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring fishery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders on factors that influence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used influence diagrams to study qualitatively how the stakeholders frame the management problem of the herring fishery and elucidate what kind of causalities the different views involve. The paper combines these two tasks to assess the suitability of the methodological choices to participatory modeling in terms of both a modeling tool and participation mode. The paper also assesses the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology provides a flexible tool that can be adapted to different kinds of needs and challenges of participatory modeling. The ability of the approach to deal with small data sets makes it cost-effective in participatory contexts. However, the BMA methodology used in modeling the biological uncertainties is so complex that it needs further development before it can be introduced to wider use in participatory contexts. PMID:23604267
A Bayesian Approach for Analyzing Longitudinal Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum
2011-01-01
This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Bayesian Finite Mixtures for Nonlinear Modeling of Educational Data.
ERIC Educational Resources Information Center
Tirri, Henry; And Others
A Bayesian approach for finding latent classes in data is discussed. The approach uses finite mixture models to describe the underlying structure in the data and demonstrates that the possibility of using full joint probability models raises interesting new prospects for exploratory data analysis. The concepts and methods discussed are illustrated…
Bayesian Semiparametric Structural Equation Models with Latent Variables
ERIC Educational Resources Information Center
Yang, Mingan; Dunson, David B.
2010-01-01
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In…
Bayesian Estimation of the DINA Model with Gibbs Sampling
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2015-01-01
A Bayesian model formulation of the deterministic inputs, noisy "and" gate (DINA) model is presented. Gibbs sampling is employed to simulate from the joint posterior distribution of item guessing and slipping parameters, subject attribute parameters, and latent class probabilities. The procedure extends concepts in Béguin and Glas,…
Bayesian log-periodic model for financial crashes
NASA Astrophysics Data System (ADS)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-10-01
This paper introduces a Bayesian approach into the econophysics literature on financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions cannot be derived analytically, we develop a Markov Chain Monte Carlo algorithm to draw from the posterior distributions. We consider three Bayesian models that involve normal and Student's t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical part of the study, we analyze a well-known example of a financial bubble - the S&P 500 1987 crash - to show the usefulness of the three methods under consideration, and the crashes of Merval-94, Bovespa-97, IPCMX-94, and Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian models provide 95% credible intervals for the estimated crash time.
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.
Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images.
Shi, Zhiyuan; Hospedales, Timothy M; Xiang, Tao
2015-10-01
We address the problem of localisation of objects as bounding boxes in images and videos with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes. In this paper, a novel framework based on Bayesian joint topic modelling is proposed, which differs significantly from the existing ones in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes multiple object co-existence so that "explaining away" inference can resolve ambiguity and lead to better learning and localisation. (2) Image backgrounds are shared across classes to better learn varying surroundings and "push out" objects of interest. (3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning. Moreover, the Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Extensive experiments on the PASCAL VOC, ImageNet and YouTube-Object videos datasets demonstrate the effectiveness of our Bayesian joint model for weakly supervised object localisation. PMID:26340253
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows that when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the MCMC procedure of SAS are provided. PMID:26019004
A mixture model with random-effects components for classifying sibling pairs.
Martella, F; Vermunt, J K; Beekman, M; Westendorp, R G J; Slagboom, P E; Houwing-Duistermaat, J J
2011-11-30
In healthy aging research, typically multiple health outcomes are measured, representing health status. The aim of this paper was to develop a model-based clustering approach to identify homogeneous sibling pairs according to their health status. Model-based clustering approaches will be considered on the basis of linear mixed effect model for the mixture components. Class memberships of siblings within pairs are allowed to be correlated, and within a class the correlation between siblings is modeled using random sibling pair effects. We propose an expectation-maximization algorithm for maximum likelihood estimation. Model performance is evaluated via simulations in terms of estimating the correct parameters, degree of agreement, and the ability to detect the correct number of clusters. The performance of our model is compared with the performance of standard model-based clustering approaches. The methods are used to classify sibling pairs from the Leiden Longevity Study according to their health status. Our results suggest that homogeneous healthy sibling pairs are associated with a longer life span. Software is available for fitting the new models. PMID:21905068
Examples of Mixed-Effects Modeling with Crossed Random Effects and with Binomial Data
ERIC Educational Resources Information Center
Quene, Hugo; van den Bergh, Huub
2008-01-01
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not nested. Traditional ANOVAs are compared against…
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km² hexagons), can increase the relevance of habitat models to multispecies
Measuring Learning Progressions Using Bayesian Modeling in Complex Assessments
ERIC Educational Resources Information Center
Rutstein, Daisy Wise
2012-01-01
This research examines issues regarding model estimation and robustness in the use of Bayesian Inference Networks (BINs) for measuring Learning Progressions (LPs). It provides background information on LPs and how they might be used in practice. Two simulation studies are performed, along with real data examples. The first study examines the case…
Shortlist B: A Bayesian Model of Continuous Speech Recognition
ERIC Educational Resources Information Center
Norris, Dennis; McQueen, James M.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward…
Probabilistic climate change predictions applying Bayesian model averaging.
Min, Seung-Ki; Simonis, Daniel; Hense, Andreas
2007-08-15
This study explores the sensitivity of probabilistic predictions of the twenty-first century surface air temperature (SAT) changes to different multi-model averaging methods using available simulations from the Intergovernmental Panel on Climate Change fourth assessment report. A way of observationally constrained prediction is provided by training multi-model simulations for the second half of the twentieth century with respect to long-term components. Bayesian model averaging (BMA) produces weighted probability density functions (PDFs), and we compare two methods of estimating weighting factors: the Bayes factor and the expectation-maximization algorithm. It is shown that Bayesian-weighted PDFs for the global mean SAT changes are characterized by multi-modal structures from the middle of the twenty-first century onward, which are not clearly seen in the arithmetic ensemble mean (AEM). This occurs because BMA tends to select a few high-skilled models and down-weight the others. Additionally, Bayesian results exhibit larger means and broader PDFs in the global mean predictions than the unweighted AEM. Multi-modality is more pronounced in the continental analysis using 30-year mean (2070-2099) SATs, while there is only a little effect of Bayesian weighting on the 5-95% range. These results indicate that this approach to observationally constrained probabilistic predictions can be highly sensitive to the method of training, particularly for the latter half of the twenty-first century, and that a more comprehensive approach combining different regions and/or variables is required. PMID:17569647
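The weighting idea in the preceding abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes equal prior model probabilities, so that posterior model weights are proportional to the marginal likelihoods. The function names (`bma_weights`, `bma_mean`) and all numbers are hypothetical.

```python
import math

def bma_weights(log_evidences):
    """Posterior model weights from log marginal likelihoods.

    Assumes equal prior probability for every model (an assumption of
    this sketch). The maximum is subtracted before exponentiating to
    avoid numerical underflow.
    """
    m = max(log_evidences)
    unnorm = [math.exp(le - m) for le in log_evidences]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_mean(weights, model_means):
    """Weighted mean of the per-model predictive means."""
    return sum(w * mu for w, mu in zip(weights, model_means))

# Hypothetical log evidences and per-model mean temperature changes (K).
log_evidences = [-120.0, -121.0, -130.0]
model_means = [2.5, 3.1, 4.0]

weights = bma_weights(log_evidences)
averaged = bma_mean(weights, model_means)
print(weights, averaged)
```

Because the weights depend exponentially on the log evidences, a few well-performing models tend to dominate, which is consistent with the abstract's remark that BMA down-weights the remaining models.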
Resolution-matrix-constrained model updates for bayesian seismic tomography
NASA Astrophysics Data System (ADS)
Fontanini, Francesco; Bleibinhaus, Florian
2015-04-01
One of the most important issues in interpreting seismic tomography models is the need to quantify their uncertainty. The Bayesian approach to inverse problems offers a rigorous way to estimate this uncertainty quantitatively, at the price of a higher computation time. Optimizing Bayesian algorithms is therefore a key problem. We are developing a multivariate model-updating scheme that makes use of the constraints provided by the Model Resolution Matrix, aiming at a more efficient sampling of the model space. The Resolution Matrix relates the true model to the estimate; its off-diagonal values provide a set of trade-off relations between model parameters, which our algorithm uses to obtain optimized model updates.
Standardized Mean Differences in Two-Level Cross-Classified Random Effects Models
ERIC Educational Resources Information Center
Lai, Mark H. C.; Kwok, Oi-Man
2014-01-01
Multilevel modeling techniques are becoming more popular in handling data with multilevel structure in educational and behavioral research. Recently, researchers have paid more attention to cross-classified data structure that naturally arises in educational settings. However, unlike traditional single-level research, methodological studies about…
Deletion diagnostics for the generalised linear mixed model with independent random effects.
Ganguli, B; Roy, S Sen; Naskar, M; Malloy, E J; Eisen, E A
2016-04-30
The Generalised linear mixed model (GLMM) is widely used for modelling environmental data. However, such data are prone to influential observations, which can distort the estimated exposure-response curve particularly in regions of high exposure. Deletion diagnostics for iterative estimation schemes commonly derive the deleted estimates based on a single iteration of the full system holding certain pivotal quantities such as the information matrix to be constant. In this paper, we present an approximate formula for the deleted estimates and Cook's distance for the GLMM, which does not assume that the estimates of variance parameters are unaffected by deletion. The procedure allows the user to calculate standardised DFBETAs for mean as well as variance parameters. In certain cases such as when using the GLMM as a device for smoothing, such residuals for the variance parameters are interesting in their own right. In general, the procedure leads to deleted estimates of mean parameters, which are corrected for the effect of deletion on variance components as estimation of the two sets of parameters is interdependent. The probabilistic behaviour of these residuals is investigated and a simulation based procedure suggested for their standardisation. The method is used to identify influential individuals in an occupational cohort exposed to silica. The results show that failure to conduct post model fitting diagnostics for variance components can lead to erroneous conclusions about the fitted curve and unstable confidence intervals. Copyright © 2015 John Wiley & Sons, Ltd. PMID:26626135
Roy, Vivekananda; Evangelou, Evangelos; Zhu, Zhengyuan
2016-03-01
Spatial generalized linear mixed models (SGLMMs) are popular models for spatial data with a non-Gaussian response. Binomial SGLMMs with logit or probit link functions are often used to model spatially dependent binomial random variables. It is known that for independent binomial data, the robit regression model provides a more robust (against extreme observations) alternative to the more popular logistic and probit models. In this article, we introduce a Bayesian spatial robit model for spatially dependent binomial data. Since constructing a meaningful prior on the link function parameter as well as the spatial correlation parameters in SGLMMs is difficult, we propose an empirical Bayes (EB) approach for the estimation of these parameters as well as for the prediction of the random effects. The EB methodology is implemented by efficient importance sampling methods based on Markov chain Monte Carlo (MCMC) algorithms. Our simulation study shows that the robit model is robust against model misspecification, and our EB method results in estimates with less bias than full Bayesian (FB) analysis. The methodology is applied to a Celastrus orbiculatus data set and a Rhizoctonia root data set. For the former, which is known to contain outlying observations, the robit model is shown to do better for predicting the spatial distribution of an invasive species. For the latter, our approach performs as well as the classical models for predicting the disease severity for a root disease, as the probit link is shown to be appropriate. Though this article is written for binomial SGLMMs for brevity, the EB methodology is more general and can be applied to other types of SGLMMs. In the accompanying R package geoBayes, implementations for other SGLMMs such as Poisson and Gamma SGLMMs are provided. PMID:26331903
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Lu, Xiaosun; Huang, Yangxin
2014-07-20
It is common practice to analyze complex longitudinal data using nonlinear mixed-effects (NLME) models with a normality assumption. NLME models with normal distributions provide the most popular framework for modeling continuous longitudinal outcomes, assuming individuals are from a homogeneous population and relying on random effects to accommodate inter-individual variation. However, the following two issues may stand out: (i) the normality assumption for model errors may cause a lack of robustness and subsequently lead to invalid inference and unreasonable estimates, particularly if the data exhibit skewness, and (ii) a homogeneous population assumption may unrealistically obscure important features of between-subject and within-subject variations, which may result in unreliable modeling results. There have been relatively few studies concerning longitudinal data with both heterogeneity and skewness features. In the last two decades, skew distributions have proven beneficial in dealing with asymmetric data in various applications. In this article, our objective is to address the simultaneous impact of both features arising from longitudinal data by developing a flexible finite mixture of NLME models with skew distributions under a Bayesian framework that allows estimates of both model parameters and class membership probabilities for longitudinal data. Simulation studies are conducted to assess the performance of the proposed models and methods, and a real example from an AIDS clinical trial illustrates the methodology by modeling the viral dynamics to compare potential models with different distribution specifications; the analysis results are reported. PMID:24623529
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models
Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba
2009-01-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
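The observation component described in the preceding abstract, in which capture probability depends on the distance between a home range center and a trap, can be sketched as below. The abstract does not state the functional form; the half-normal shape used here, the parameter names `p0` and `sigma`, and all numbers are assumptions of this sketch, not the authors' specification.

```python
import math

def capture_prob(center, trap, p0=0.3, sigma=1.5):
    """Per-occasion capture probability for one individual at one trap.

    Assumed half-normal form: p0 * exp(-d^2 / (2 * sigma^2)), where d is
    the Euclidean distance between the home range center and the trap.
    """
    d2 = (center[0] - trap[0]) ** 2 + (center[1] - trap[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))

def prob_detected(center, traps, operational):
    """Probability of at least one capture on a single occasion.

    `operational` flags which traps were running, reflecting the
    abstract's point that only a subset of cameras is active on any
    given occasion. Captures at different traps are treated as
    independent (an assumption of this sketch).
    """
    p_miss = 1.0
    for trap, on in zip(traps, operational):
        if on:
            p_miss *= 1.0 - capture_prob(center, trap)
    return 1.0 - p_miss

# Hypothetical layout: three traps, the third one switched off.
traps = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0)]
operational = [True, True, False]
print(prob_detected((0.5, 0.5), traps, operational))
```

In a full analysis the home range centers would be unobserved random effects, which is what allows the abstract to describe the model as a generalized linear mixed model.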
A Bayesian population PBPK model for multiroute chloroform exposure
Yang, Yuching; Xu, Xu; Georgopoulos, Panos G.
2011-01-01
A Bayesian hierarchical model was developed to estimate the parameters in a physiologically based pharmacokinetic (PBPK) model for chloroform using prior information and biomarker data from different exposure pathways. In particular, the model provides a quantitative description of the changes in physiological parameters associated with hot-water bath and showering scenarios. Through Bayesian inference, uncertainties in the PBPK parameters were reduced from the prior distributions. Prediction of biomarker data with the calibrated PBPK model was improved by the calibration. The posterior results indicate that blood flow rates varied under two different exposure scenarios, with a two-fold increase of the skin's blood flow rate predicted in the hot-bath scenario. This result highlights the importance of considering scenario-specific parameters in PBPK modeling. To demonstrate the application of a probability approach in toxicological assessment, results from the posterior distributions from this calibrated model were used to predict target tissue dose based on the rate of chloroform metabolized in liver. This study demonstrates the use of the Bayesian approach to optimize PBPK model parameters for typical household exposure scenarios. PMID:19471319
Hwang, Beom Seuk; Pennell, Michael L
2014-03-30
Many dose-response studies collect data on correlated outcomes. For example, in developmental toxicity studies, uterine weight and presence of malformed pups are measured on the same dam. Joint modeling can result in more efficient inferences than independent models for each outcome. Most methods for joint modeling assume standard parametric response distributions. However, in toxicity studies, it is possible that response distributions vary in location and shape with dose, which may not be easily captured by standard models. To address this issue, we propose a semiparametric Bayesian joint model for a binary and continuous response. In our model, a kernel stick-breaking process prior is assigned to the distribution of a random effect shared across outcomes, which allows flexible changes in distribution shape with dose shared across outcomes. The model also includes outcome-specific fixed effects to allow different location effects. In simulation studies, we found that the proposed model provides accurate estimates of toxicological risk when the data do not satisfy assumptions of standard parametric models. We apply our method to data from a developmental toxicity study of ethylene glycol diethyl ether. PMID:24123309
Application of hierarchical Bayesian unmixing models in river sediment source apportionment
NASA Astrophysics Data System (ADS)
Blake, Will; Smith, Hugh; Navas, Ana; Bodé, Samuel; Goddard, Rupert; Zou Kuzyk, Zou; Lennard, Amy; Lobb, David; Owens, Phil; Palazon, Leticia; Petticrew, Ellen; Gaspar, Leticia; Stock, Brian; Boeckx, Pacsal; Semmens, Brice
2016-04-01
Fingerprinting and unmixing concepts are used widely across environmental disciplines for forensic evaluation of pollutant sources. In aquatic and marine systems, this includes tracking the source of organic and inorganic pollutants in water and linking problem sediment to soil erosion and land use sources. It is, however, the particular complexity of ecological systems that has driven creation of the most sophisticated mixing models, primarily to (i) evaluate diet composition in complex ecological food webs, (ii) inform population structure and (iii) explore animal movement. In the context of the new hierarchical Bayesian unmixing model, MixSIAR, developed to characterise intra-population niche variation in ecological systems, we evaluate the linkage between ecological 'prey' and 'consumer' concepts and river basin sediment 'source' and sediment 'mixtures' to exemplify the value of ecological modelling tools to river basin science. Recent studies have outlined advantages presented by Bayesian unmixing approaches in handling complex source and mixture datasets while dealing appropriately with uncertainty in parameter probability distributions. MixSIAR is unique in that it allows individual fixed and random effects associated with mixture hierarchy, i.e. factors that might exert an influence on model outcome for mixture groups, to be explored within the source-receptor framework. This offers new and powerful ways of interpreting river basin apportionment data. In this contribution, key components of the model are evaluated in the context of common experimental designs for sediment fingerprinting studies namely simple, nested and distributed catchment sampling programmes. Illustrative examples using geochemical and compound specific stable isotope datasets are presented and used to discuss best practice with specific attention to (1) the tracer selection process, (2) incorporation of fixed effects relating to sample timeframe and sediment type in the modelling
Bayesian point event modeling in spatial and environmental epidemiology.
Lawson, Andrew B
2012-10-01
This paper reviews the current state of point event modeling in spatial epidemiology from a Bayesian perspective. Point event (or case event) data arise when geo-coded addresses of disease events are available. Often, this level of spatial resolution would not be accessible due to medical confidentiality constraints. However, for the examination of small spatial scales, it is important to be capable of examining point process data directly. Models for such data are usually formulated based on point process theory. In addition, special conditioning arguments can lead to simpler Bernoulli likelihoods and logistic spatial models. Goodness-of-fit diagnostics and Bayesian residuals are also considered. Applications within putative health hazard risk assessment, cluster detection, and linkage to environmental risk fields (misalignment) are considered. PMID:23035034
HIBAYES: Global 21-cm Bayesian Monte-Carlo Model Fitting
NASA Astrophysics Data System (ADS)
Zwart, Jonathan T. L.; Price, Daniel; Bernardi, Gianni
2016-06-01
HIBAYES implements fully-Bayesian extraction of the sky-averaged (global) 21-cm signal from the Cosmic Dawn and Epoch of Reionization in the presence of foreground emission. User-defined likelihood and prior functions are called by the sampler PyMultiNest (ascl:1606.005) in order to jointly explore the full (signal plus foreground) posterior probability distribution and evaluate the Bayesian evidence for a given model. Implemented models, for simulation and fitting, include gaussians (HI signal) and polynomials (foregrounds). Some simple plotting and analysis tools are supplied. The code can be extended to other models (physical or empirical), to incorporate data from other experiments, or to use alternative Monte-Carlo sampling engines as required.
Application of the Bayesian dynamic survival model in medicine.
He, Jianghua; McGee, Daniel L; Niu, Xufeng
2010-02-10
The Bayesian dynamic survival model (BDSM), a time-varying coefficient survival model from the Bayesian perspective, was proposed in the early 1990s but has not been widely used or discussed. In this paper, we describe the model structure of the BDSM and introduce two estimation approaches for BDSMs: the Markov Chain Monte Carlo (MCMC) approach and the linear Bayesian (LB) method. The MCMC approach estimates model parameters through sampling and is computationally intensive. With the newly developed geoadditive survival models and software BayesX, the BDSM is available for general applications. The LB approach is easier in terms of computations but it requires the prespecification of some unknown smoothing parameters. In a simulation study, we use the LB approach to show the effects of smoothing parameters on the performance of the BDSM and propose an ad hoc method for identifying appropriate values for those parameters. We also demonstrate the performance of the MCMC approach compared with the LB approach and a penalized partial likelihood method available in software R packages. A gastric cancer trial is utilized to illustrate the application of the BDSM. PMID:20014356
The impact of spatial scales and spatial smoothing on the outcome of bayesian spatial model.
Kang, Su Yun; McGree, James; Mengersen, Kerrie
2013-01-01
Discretization of a geographical region is quite common in spatial analysis. There have been few studies into the impact of different geographical scales on the outcome of spatial models for different spatial patterns. This study aims to investigate the impact of spatial scales and spatial smoothing on the outcomes of modelling spatial point-based data. Given a spatial point-based dataset (such as occurrence of a disease), we study the geographical variation of residual disease risk using regular grid cells. The individual disease risk is modelled using a logistic model with the inclusion of spatially unstructured and/or spatially structured random effects. Three spatial smoothness priors for the spatially structured component are employed in modelling, namely an intrinsic Gaussian Markov random field, a second-order random walk on a lattice, and a Gaussian field with Matérn correlation function. We investigate how changes in grid cell size affect model outcomes under different spatial structures and different smoothness priors for the spatial component. A realistic example (the Humberside data) is analyzed and a simulation study is described. Bayesian computation is carried out using an integrated nested Laplace approximation. The results suggest that the performance and predictive capacity of the spatial models improve as the grid cell size decreases for certain spatial structures. It also appears that different spatial smoothness priors should be applied for different patterns of point data. PMID:24146799
A Bayesian growth mixture model to examine maternal hypertension and birth outcomes.
Neelon, Brian; Swamy, Geeta K; Burgette, Lane F; Miranda, Marie Lynn
2011-09-30
Maternal hypertension is a major contributor to adverse pregnancy outcomes, including preterm birth (PTB) and low birth weight (LBW). Although several studies have explored the relationship between maternal hypertension and fetal health, few have examined how the longitudinal trajectory of blood pressure, considered over the course of pregnancy, affects birth outcomes. In this paper, we propose a Bayesian growth mixture model to jointly examine the associations between longitudinal blood pressure measurements, PTB, and LBW. The model partitions women into distinct classes characterized by a mean arterial pressure (MAP) curve and joint probabilities of PTB and LBW. Each class contains a unique mixed effects model for MAP with class-specific regression coefficients and random effect covariances. To account for the strong correlation between PTB and LBW, we introduce a bivariate probit model within each class to capture residual within-class dependence between PTB and LBW. The model permits the association between PTB and LBW to vary by class, so that for some classes, PTB and LBW may be positively correlated, whereas for others, they may be uncorrelated or negatively correlated. We also allow maternal covariates to influence the class probabilities via a multinomial logit model. For posterior computation, we propose an efficient MCMC algorithm that combines full-conditional Gibbs and Metropolis steps. We apply our model to a sample of 1027 women enrolled in the Healthy Pregnancy, Healthy Baby Study, a prospective cohort study of host, social, and environmental contributors to disparities in pregnancy outcomes. PMID:21751226
Bayesian Inference of High-Dimensional Dynamical Ocean Models
NASA Astrophysics Data System (ADS)
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimensional ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Bayesian approach for network modeling of brain structural features
NASA Astrophysics Data System (ADS)
Joshi, Anand A.; Joshi, Shantanu H.; Leahy, Richard M.; Shattuck, David W.; Dinov, Ivo; Toga, Arthur W.
2010-03-01
Brain connectivity patterns are useful in understanding brain function and organization. Anatomical brain connectivity is largely determined using the physical synaptic connections between neurons. In contrast, statistical brain connectivity in a given brain population refers to the interaction and interdependencies of statistics of multitudes of brain features including cortical area, volume, thickness, etc. Traditionally, this dependence has been studied by statistical correlations of cortical features. In this paper, we propose the use of Bayesian network modeling for inferring statistical brain connectivity patterns that relate to causal (directed) as well as non-causal (undirected) relationships between cortical surface areas. We argue that for multivariate cortical data, the Bayesian model provides for a more accurate representation by removing the effect of confounding correlations that get introduced due to canonical dependence between the data. Results are presented for a population of 466 brains, where a SEM (structural equation modeling) approach is used to generate a Bayesian network model, as well as a dependency graph for the joint distribution of cortical areas.
A localization model to localize multiple sources using Bayesian inference
NASA Astrophysics Data System (ADS)
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and using a maximal likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in azimuthal direction of the source(s).
Liu, Jung-Tzu; Tsou, Hsiao-Hui; Gordon Lan, K K; Chen, Chi-Tian; Lai, Yi-Hsuan; Chang, Wan-Jung; Tzeng, Chyng-Shyan; Hsiao, Chin-Fu
2016-06-30
In recent years, developing pharmaceutical products via multiregional clinical trials (MRCTs) has become standard. Traditionally, an MRCT would assume that a treatment effect is uniform across regions. However, heterogeneity among regions may have an impact on the evaluation of a medicine's effect. In this study, we consider a random effects model using discrete distribution (DREM) to account for heterogeneous treatment effects across regions for the design and evaluation of MRCTs. We derive a power function for a treatment that is beneficial under DREM and illustrate determination of the overall sample size in an MRCT. We use the concept of consistency based on Method 2 of the Japanese Ministry of Health, Labour, and Welfare's guidance to evaluate the probability for treatment benefit and consistency under DREM. We further derive an optimal sample size allocation over regions to maximize the power for consistency. Moreover, we provide three algorithms for deriving sample size at the desired level of power for benefit and consistency. In practice, regional treatment effects are unknown. Thus, we provide some guidelines on the design of MRCTs with consistency when the regional treatment effects are assumed to fall into a specified interval. Numerical examples are given to illustrate applications of the proposed approach. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26833851
Slice sampling technique in Bayesian extreme of gold price modelling
NASA Astrophysics Data System (ADS)
Rostami, Mohammad; Adam, Mohd Bakri; Ibrahim, Noor Akma; Yahya, Mohamed Hisham
2013-09-01
In this paper, a simulation study of Bayesian extreme values using Markov Chain Monte Carlo via the slice sampling algorithm is implemented. We compared the accuracy of slice sampling with other methods for a Gumbel model. This study revealed that the slice sampling algorithm offers more accurate and closer estimates, with lower RMSE, than other methods. Finally, we successfully employed this procedure to estimate the parameters of Malaysian extreme gold prices from 2000 to 2011.
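The slice sampling step named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it draws directly from a Gumbel density (rather than from a posterior over Gumbel parameters), uses the standard stepping-out and shrinkage procedure for a univariate target, and the names gumbel_logpdf, slice_sample and the bracket width w are hypothetical choices made for the example.

```python
import math
import random


def gumbel_logpdf(x, mu=0.0, beta=1.0):
    """Log-density of a Gumbel(mu, beta) distribution."""
    z = (x - mu) / beta
    return -(z + math.exp(-z)) - math.log(beta)


def slice_sample(logpdf, x0, n, w=1.0, seed=0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    rng = random.Random(seed)
    x = x0
    draws = []
    for _ in range(n):
        # Draw the slice level on the log scale: log(u * f(x)), u ~ U(0, 1].
        log_y = logpdf(x) + math.log(1.0 - rng.random())
        # Stepping out: place a bracket of width w around x, then widen it
        # until both ends fall outside the slice.
        left = x - w * rng.random()
        right = left + w
        while logpdf(left) > log_y:
            left -= w
        while logpdf(right) > log_y:
            right += w
        # Shrinkage: sample uniformly in the bracket, shrinking it towards x
        # after each rejected proposal.
        while True:
            x_new = rng.uniform(left, right)
            if logpdf(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        draws.append(x)
    return draws


samples = slice_sample(gumbel_logpdf, x0=0.0, n=20000)
sample_mean = sum(samples) / len(samples)
print("sample mean:", sample_mean)
```

For a standard Gumbel target the sample mean should settle near the Euler-Mascheroni constant (about 0.577), which gives a quick sanity check on the sampler.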
How to Address Measurement Noise in Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Schöniger, A.; Wöhling, T.; Nowak, W.
2014-12-01
When confronted with the challenge of selecting one out of several competing conceptual models for a specific modeling task, Bayesian model averaging is a rigorous choice. It ranks the plausibility of models based on Bayes' theorem, which yields an optimal trade-off between performance and complexity. With the resulting posterior model probabilities, their individual predictions are combined into a robust weighted average and the overall predictive uncertainty (including conceptual uncertainty) can be quantified. This rigorous framework does not, however, yet explicitly consider statistical significance of measurement noise in the calibration data set. This is a major drawback, because model weights might be unstable due to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new extension to the Bayesian model averaging framework that explicitly accounts for measurement noise as a source of uncertainty for the weights. This enables modelers to assess the reliability of model ranking for a specific application and a given calibration data set. Also, the impact of measurement noise on the overall prediction uncertainty can be determined. Technically, our extension is built within a Monte Carlo framework. We repeatedly perturb the observed data with random realizations of measurement error. Then, we determine the robustness of the resulting model weights against measurement noise. We quantify the variability of posterior model weights as weighting variance. We add this new variance term to the overall prediction uncertainty analysis within the Bayesian model averaging framework to make uncertainty quantification more realistic and "complete". We illustrate the importance of our suggested extension with an application to soil-plant model selection, based on studies by Wöhling et al. (2013, 2014). Results confirm that noise in leaf area index or evaporation rate observations produces a significant amount of weighting
Bayesian regression model for seasonal forecast of precipitation over Korea
NASA Astrophysics Data System (ADS)
Jo, Seongil; Lim, Yaeji; Lee, Jaeyong; Kang, Hyun-Suk; Oh, Hee-Seok
2012-08-01
In this paper, we apply three different Bayesian methods to the seasonal forecasting of the precipitation in a region around Korea (32.5°N-42.5°N, 122.5°E-132.5°E). We focus on the precipitation of summer season (June-July-August; JJA) for the period of 1979-2007 using the precipitation produced by the Global Data Assimilation and Prediction System (GDAPS) as predictors. Through cross-validation, we demonstrate improvement for seasonal forecast of precipitation in terms of root mean squared error (RMSE) and linear error in probability space score (LEPS). The proposed methods yield RMSE of 1.09 and LEPS of 0.31 between the predicted and observed precipitations, while the prediction using GDAPS output only produces RMSE of 1.20 and LEPS of 0.33 for CPC Merged Analyzed Precipitation (CMAP) data. For station-measured precipitation data, the RMSE and LEPS of the proposed Bayesian methods are 0.53 and 0.29, while GDAPS output is 0.66 and 0.33, respectively. The methods seem to capture the spatial pattern of the observed precipitation. The Bayesian paradigm incorporates the model uncertainty as an integral part of modeling in a natural way. We provide a probabilistic forecast integrating model uncertainty.
AIC, BIC, Bayesian evidence against the interacting dark energy model
NASA Astrophysics Data System (ADS)
Szydłowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michał
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative, the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), , baryon acoustic oscillation, the Alcock-Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model.
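The two information criteria compared in this abstract have simple closed forms: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of free parameters, n the number of data points and L the maximized likelihood. The short Python sketch below shows how the two criteria can disagree in strength for nested models. The log-likelihoods, parameter counts and sample size are hypothetical numbers chosen for illustration and are not taken from the paper.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: the extended model adds one parameter and improves
# the maximized log-likelihood only slightly.
n_points = 600
simple = {"log_lik": -280.0, "k": 2}
extended = {"log_lik": -279.5, "k": 3}

delta_aic = aic(extended["log_lik"], extended["k"]) - aic(simple["log_lik"], simple["k"])
delta_bic = (bic(extended["log_lik"], extended["k"], n_points)
             - bic(simple["log_lik"], simple["k"], n_points))

# Positive differences favour the simpler model; the BIC penalty grows with ln n.
print("delta AIC:", delta_aic)
print("delta BIC:", delta_bic)
```

Because the BIC penalty per parameter is ln n rather than 2, the same small gain in likelihood counts more heavily against the extended model under BIC once n exceeds about 7 data points.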
Dissecting Magnetar Variability with Bayesian Hierarchical Models
NASA Astrophysics Data System (ADS)
Huppenkothen, Daniela; Brewer, Brendon J.; Hogg, David W.; Murray, Iain; Frean, Marcus; Elenbaas, Chris; Watts, Anna L.; Levin, Yuri; van der Horst, Alexander J.; Kouveliotou, Chryssa
2015-09-01
Neutron stars are a prime laboratory for testing physical processes under conditions of strong gravity, high density, and extreme magnetic fields. Among the zoo of neutron star phenomena, magnetars stand out for their bursting behavior, ranging from extremely bright, rare giant flares to numerous, less energetic recurrent bursts. The exact trigger and emission mechanisms for these bursts are not known; favored models involve either a crust fracture and subsequent energy release into the magnetosphere, or explosive reconnection of magnetic field lines. In the absence of a predictive model, understanding the physical processes responsible for magnetar burst variability is difficult. Here, we develop an empirical model that decomposes magnetar bursts into a superposition of small spike-like features with a simple functional form, where the number of model components is itself part of the inference problem. The cascades of spikes that we model might be formed by avalanches of reconnection, or crust rupture aftershocks. Using Markov Chain Monte Carlo sampling augmented with reversible jumps between models with different numbers of parameters, we characterize the posterior distributions of the model parameters and the number of components per burst. We relate these model parameters to physical quantities in the system, and show for the first time that the variability within a burst does not conform to predictions from ideas of self-organized criticality. We also examine how well the properties of the spikes fit the predictions of simplified cascade models for the different trigger mechanisms.
Bayesian Transformation Models for Multivariate Survival Data
DE CASTRO, MÁRIO; CHEN, MING-HUI; IBRAHIM, JOSEPH G.; KLEIN, JOHN P.
2014-01-01
In this paper we propose a general class of gamma frailty transformation models for multivariate survival data. The transformation class includes the commonly used proportional hazards and proportional odds models. The proposed class also includes a family of cure rate models. Under an improper prior for the parameters, we establish propriety of the posterior distribution. A novel Gibbs sampling algorithm is developed for sampling from the observed data posterior distribution. A simulation study is conducted to examine the properties of the proposed methodology. An application to a data set from a cord blood transplantation study is also reported. PMID:24904194
Bayesian inference and model comparison for metallic fatigue data
NASA Astrophysics Data System (ADS)
Babuška, Ivo; Sawlan, Zaid; Scavino, Marco; Szabó, Barna; Tempone, Raúl
2016-06-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
3-D model-based Bayesian classification
Soenneland, L.; Tenneboe, P.; Gehrmann, T.; Yrke, O.
1994-12-31
The challenging task of the interpreter is to integrate different pieces of information and combine them into an earth model. The sophistication level of this earth model might vary from the simplest geometrical description to the most complex set of reservoir parameters related to the geometrical description. Obviously the sophistication level also depends on the completeness of the available information. The authors describe the interpreter's task as a mapping between the observation space and the model space. The information available to the interpreter exists in observation space and the task is to infer a model in model space. It is well known that this inversion problem is non-unique. Therefore any attempt to find a solution depends on constraints being added in some manner. The solution will obviously depend on which constraints are introduced and it would be desirable to allow the interpreter to modify the constraints in a problem-dependent manner. They will present a probabilistic framework that gives the interpreter the tools to integrate the different types of information and produce constrained solutions. The constraints can be adapted to the problem at hand.
Bayesian Local Contamination Models for Multivariate Outliers
Page, Garritt L.; Dunson, David B.
2013-01-01
In studies where data are generated from multiple locations or sources it is common for there to exist observations that are quite unlike the majority. Motivated by the application of establishing a reference value in an inter-laboratory setting when outlying labs are present, we propose a local contamination model that is able to accommodate unusual multivariate realizations in a flexible way. The proposed method models the process level of a hierarchical model using a mixture with a parametric component and a possibly nonparametric contamination. Much of the flexibility in the methodology is achieved by allowing varying random subsets of the elements in the lab-specific mean vectors to be allocated to the contamination component. Computational methods are developed and the methodology is compared to three other possible approaches using a simulation study. We apply the proposed method to a NIST/NOAA sponsored inter-laboratory study which motivated the methodological development. PMID:24363465
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, C.; Plant, N.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70-90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale.
Bayesian sensitivity analysis of bifurcating nonlinear models
NASA Astrophysics Data System (ADS)
Becker, W.; Worden, K.; Rowson, J.
2013-01-01
Sensitivity analysis allows one to investigate how changes in input parameters to a system affect the output. When computational expense is a concern, metamodels such as Gaussian processes can offer considerable computational savings over Monte Carlo methods, albeit at the expense of introducing a data modelling problem. In particular, Gaussian processes assume a smooth, non-bifurcating response surface. This work highlights a recent extension to Gaussian processes which uses a decision tree to partition the input space into homogeneous regions, and then fits separate Gaussian processes to each region. In this way, bifurcations can be modelled at region boundaries and different regions can have different covariance properties. To test this method, both the treed and standard methods were applied to the bifurcating response of a Duffing oscillator and a bifurcating FE model of a heart valve. It was found that the treed Gaussian process provides a practical way of performing uncertainty and sensitivity analysis on large, potentially-bifurcating models, which cannot be dealt with by using a single GP, although an open problem remains how to manage bifurcation boundaries that are not parallel to coordinate axes.
DPpackage: Bayesian Non- and Semi-parametric Modelling in R.
Jara, Alejandro; Hanson, Timothy E; Quintana, Fernando A; Müller, Peter; Rosner, Gary L
2011-04-01
Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage. Currently DPpackage includes models for marginal and conditional density estimation, ROC curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior. To maximize computational efficiency, the actual sampling for each model is carried out using compiled FORTRAN. PMID:21796263
Bayesian calibration of hyperelastic constitutive models of soft tissue.
Madireddy, Sandeep; Sista, Bhargava; Vemaganti, Kumar
2016-06-01
There is inherent variability in the experimental response used to characterize the hyperelastic mechanical response of soft tissues. This has to be accounted for while estimating the parameters in the constitutive models to obtain reliable estimates of the quantities of interest. The traditional least squares method of parameter estimation does not give due importance to this variability. We use a Bayesian calibration framework based on nested Monte Carlo sampling to account for the variability in the experimental data and its effect on the estimated parameters through a systematic probability-based treatment. We consider three different constitutive models to represent the hyperelastic nature of soft tissue: Mooney-Rivlin model, exponential model, and Ogden model. Three stress-strain data sets corresponding to the deformation of agarose gel, bovine liver tissue, and porcine brain tissue are considered. Bayesian fits and parameter estimates are compared with the corresponding least squares values. Finally, we propagate the uncertainty in the parameters to a quantity of interest (QoI), namely the force-indentation response, to study the effect of model form on the values of the QoI. Our results show that the quality of the fit alone is insufficient to determine the adequacy of the model, and due importance has to be given to the maximum likelihood value, the landscape of the likelihood distribution, and model complexity. PMID:26751706
Lack of confidence in approximate Bayesian computation model choice.
Robert, Christian P; Cornuet, Jean-Marie; Marin, Jean-Michel; Pillai, Natesh S
2011-09-13
Approximate Bayesian computation (ABC) has become an essential tool for the analysis of complex stochastic models. Grelaud et al. [(2009) Bayesian Anal 3:427-442] advocated the use of ABC for model choice in the specific case of Gibbs random fields, relying on an intermodel sufficiency property to show that the approximation was legitimate. We implemented ABC model choice in a wide range of phylogenetic models in the Do It Yourself-ABC (DIY-ABC) software [Cornuet et al. (2008) Bioinformatics 24:2713-2719]. We now present arguments as to why the theoretical arguments for ABC model choice are missing, because the algorithm involves an unknown loss of information induced by the use of insufficient summary statistics. The approximation error of the posterior probabilities of the models under comparison may thus be unrelated to the computational effort spent in running an ABC algorithm. We then conclude that additional empirical verifications of the performances of the ABC procedure as those available in DIY-ABC are necessary to conduct model choice. PMID:21876135
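The ABC model-choice procedure discussed in this abstract can be sketched with a plain rejection sampler: draw a model index and a parameter from their priors, simulate data, and keep the draw when its summary statistics fall close to the observed ones. The Python sketch below is a toy illustration, not the DIY-ABC implementation. The two candidate models (a normal and a Laplace distribution with equal variance), the prior on theta, the tolerance eps and all function names are assumptions made for the example.

```python
import random
import statistics


def simulate(model, theta, n, rng):
    """Model 0: Normal(theta, 1). Model 1: Laplace(theta) with variance 1."""
    if model == 0:
        return [rng.gauss(theta, 1.0) for _ in range(n)]
    rate = 2 ** 0.5  # difference of two Exp(rate) draws is Laplace, scale 1/rate
    return [theta + rng.expovariate(rate) - rng.expovariate(rate)
            for _ in range(n)]


def summary(data):
    """Summary statistics used for the comparison (not sufficient here)."""
    return (statistics.mean(data), statistics.stdev(data))


def abc_model_choice(observed, n_draws=5000, eps=0.2, seed=0):
    """Rejection ABC: accepted frequencies approximate model probabilities."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = [0, 0]
    for _ in range(n_draws):
        model = rng.randrange(2)      # uniform prior over the two models
        theta = rng.gauss(0.0, 2.0)   # assumed prior on the location
        s_sim = summary(simulate(model, theta, len(observed), rng))
        if max(abs(a - b) for a, b in zip(s_sim, s_obs)) < eps:
            accepted[model] += 1
    total = sum(accepted)
    return [count / total for count in accepted] if total else None


data_rng = random.Random(1)
observed = simulate(0, 0.5, 50, data_rng)  # synthetic "observed" data
probs = abc_model_choice(observed)
print("approximate model probabilities:", probs)
```

Because the mean and standard deviation carry little information for telling these two models apart, the accepted frequencies need not track the true posterior model probabilities however many draws are used, which is the kind of information loss the abstract warns about.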
Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.
Hack, C Eric
2006-04-17
Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach. PMID:16466842
Bayesian partial linear model for skewed longitudinal data.
Tang, Yuanyuan; Sinha, Debajyoti; Pati, Debdeep; Lipsitz, Stuart; Lipshultz, Steven
2015-07-01
Unlike the majority of current statistical models and methods, which focus on the mean response for highly skewed longitudinal data, we present a novel model for such data accommodating a partially linear median regression function, a skewed error distribution and within subject association structures. We provide theoretical justifications for our methods including asymptotic properties of the posterior and associated semiparametric Bayesian estimators. We also provide simulation studies to investigate the finite sample properties of our methods. Several advantages of our method compared with existing methods are demonstrated via analysis of a cardiotoxicity study of children of HIV-infected mothers. PMID:25792623
Goodness-of-fit diagnostics for Bayesian hierarchical models.
Yuan, Ying; Johnson, Valen E
2012-03-01
This article proposes methodology for assessing goodness of fit in Bayesian hierarchical models. The methodology is based on comparing values of pivotal discrepancy measures (PDMs), computed using parameter values drawn from the posterior distribution, to known reference distributions. Because the resulting diagnostics can be calculated from standard output of Markov chain Monte Carlo algorithms, their computational costs are minimal. Several simulation studies are provided, each of which suggests that diagnostics based on PDMs have higher statistical power than comparable posterior-predictive diagnostic checks in detecting model departures. The proposed methodology is illustrated in a clinical application; an application to discrete data is described in supplementary material. PMID:22050079
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statisticians have emphasized fitting finite mixture models by Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, and the Bayesian method is the statistical method used here to fit the mixture model. The Bayesian method is widely used because it has asymptotic properties that provide remarkable results. In addition, the Bayesian method is consistent, which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is studied by using the Bayesian Information Criterion. Identifying the number of components is important because an incorrect number may lead to an invalid result. The Bayesian method is then used to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. The results show a negative relationship between rubber price and stock market price for all selected countries.
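The component-selection step mentioned in this abstract can be sketched as follows. The snippet assumes univariate Gaussian components (the abstract does not state the component family), and the log-likelihood values are hypothetical placeholders, not results from the paper.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def n_params_gaussian_mixture(k):
    """Free parameters of a univariate k-component Gaussian mixture:
    k means, k variances and k - 1 independent mixing weights."""
    return 3 * k - 1

def select_components(log_likelihoods, n_obs):
    """log_likelihoods maps k to the maximized log-likelihood of the k-component fit."""
    scores = {k: bic(ll, n_params_gaussian_mixture(k), n_obs)
              for k, ll in log_likelihoods.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Hypothetical maximized log-likelihoods, for illustration only.
fits = {1: -520.0, 2: -480.0, 3: -478.0}
best_k, scores = select_components(fits, n_obs=200)
```

With these placeholder values the third component improves the fit only slightly, so the parameter-count penalty makes the two-component model the one with the lowest BIC.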
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty.
Baele, Guy; Lemey, Philippe; Suchard, Marc A
2016-03-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of "working distributions" to facilitate--or shorten--the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a "working" distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different "working" distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
Bayesian joint modeling of longitudinal and spatial survival AIDS data.
Martins, Rui; Silva, Giovani L; Andreozzi, Valeska
2016-08-30
Joint analysis of longitudinal and survival data has received increasing attention in recent years, especially for analyzing cancer and AIDS data. As both repeated measurements (longitudinal) and time-to-event (survival) outcomes are observed in an individual, joint modeling is more appropriate because it takes into account the dependence between the two types of responses, which are often analyzed separately. We propose a Bayesian hierarchical model for jointly modeling longitudinal and survival data considering functional time and spatial frailty effects, respectively. That is, the proposed model deals with non-linear longitudinal effects and spatial survival effects accounting for the unobserved heterogeneity among individuals living in the same region. This joint approach is applied to a cohort study of patients with HIV/AIDS in Brazil during the years 2002-2006. Our Bayesian joint model presents considerable improvements in the estimation of survival times of the Brazilian HIV/AIDS patients when compared with those obtained through a separate survival model and shows that the spatial risk of death is the same across the different Brazilian states. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26990773
An intuitive Bayesian spatial model for disease mapping that accounts for scaling.
Riebler, Andrea; Sørbye, Sigrunn H; Simpson, Daniel; Rue, Håvard
2016-08-01
In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed but all come with inherent issues. In the classical BYM (Besag, York and Mollié) model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue of how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and makes these transferable between spatial applications with different graph structures. The hyperparameters themselves are used to define flexible extensions of simple base models. Consequently, penalised complexity priors for these parameters can be derived based on the information-theoretic distance from the flexible model to the base model, giving priors with clear interpretation. We provide implementation details for the new model formulation which preserve sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, both showing good learning abilities and good shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least equally well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning. PMID:27566770
Structural and parameter uncertainty in Bayesian cost-effectiveness models
Jackson, Christopher H; Sharples, Linda D; Thompson, Simon G
2010-01-01
Health economic decision models are subject to various forms of uncertainty, including uncertainty about the parameters of the model and about the model structure. These uncertainties can be handled within a Bayesian framework, which also allows evidence from previous studies to be combined with the data. As an example, we consider a Markov model for assessing the cost-effectiveness of implantable cardioverter defibrillators. Using Markov chain Monte Carlo posterior simulation, uncertainty about the parameters of the model is formally incorporated in the estimates of expected cost and effectiveness. We extend these methods to include uncertainty about the choice between plausible model structures. This is accounted for by averaging the posterior distributions from the competing models using weights that are derived from the pseudo-marginal-likelihood and the deviance information criterion, which are measures of expected predictive utility. We also show how these cost-effectiveness calculations can be performed efficiently in the widely used software WinBUGS. PMID:20383261
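To make the model-averaging step concrete, here is a minimal Python sketch of DIC-based weights. It assumes the Akaike-style convention in which each weight is proportional to exp(-0.5 * DIC); the abstract does not spell out the formula, and the DIC values and cost estimates below are hypothetical. It is not the WinBUGS code referred to in the abstract.

```python
import math

def dic_weights(dics):
    """Model weights proportional to exp(-0.5 * (DIC_k - min DIC))."""
    best = min(dics)                      # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (d - best)) for d in dics]
    total = sum(raw)
    return [r / total for r in raw]

def model_average(estimates, weights):
    """Weighted average of per-model posterior means."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical DIC values and expected costs for three candidate structures.
dics = [1002.3, 1000.0, 1005.1]
expected_costs = [41000.0, 39500.0, 43000.0]
weights = dic_weights(dics)
averaged_cost = model_average(expected_costs, weights)
```

Averaging only the posterior means is a simplification. Averaging the full posterior distributions, as the abstract describes, would mix posterior draws from each model in proportion to the same weights.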
Quantum-Like Bayesian Networks for Modeling Decision Making
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists of replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive, in contrast to the current state-of-the-art models, which cannot be generalized to more complex decision scenarios and which only provide explanations for the observed paradoxes. In the end, the model that we propose is a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data sets from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
Predictive RANS simulations via Bayesian Model-Scenario Averaging
Edeling, W.N.; Cinnella, P.; Dwight, R.P.
2014-10-15
The turbulence closure model is the dominant source of error in most Reynolds-Averaged Navier–Stokes simulations, yet no reliable estimators for this error component currently exist. Here we develop a stochastic, a posteriori error estimate, calibrated to specific classes of flow. It is based on variability in model closure coefficients across multiple flow scenarios, for multiple closure models. The variability is estimated using Bayesian calibration against experimental data for each scenario, and Bayesian Model-Scenario Averaging (BMSA) is used to collate the resulting posteriors, to obtain a stochastic estimate of a Quantity of Interest (QoI) in an unmeasured (prediction) scenario. The scenario probabilities in BMSA are chosen using a sensor which automatically weights those scenarios in the calibration set which are similar to the prediction scenario. The methodology is applied to the class of turbulent boundary-layers subject to various pressure gradients. For all considered prediction scenarios the standard-deviation of the stochastic estimate is consistent with the measurement ground truth. Furthermore, the mean of the estimate is more consistently accurate than the individual model predictions.
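The collation step can be illustrated with the moments of a weighted mixture of posteriors. The sketch below treats every model-scenario pair as one mixture component with a given weight, mean and variance; that flat layout, the function name and the numbers are assumptions for illustration, not the authors' BMSA code.

```python
def mixture_mean_var(weights, means, variances):
    """Mean and variance of a weighted mixture of component distributions."""
    total = sum(weights)
    w = [x / total for x in weights]      # normalize the weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # Law of total variance: E[X^2] - (E[X])^2, with E[X^2] = sum w_i (var_i + mean_i^2)
    second_moment = sum(wi * (vi + mi ** 2)
                        for wi, mi, vi in zip(w, means, variances))
    return mean, second_moment - mean ** 2

# Hypothetical per-component predictions of a quantity of interest.
mean, var = mixture_mean_var(
    weights=[0.5, 0.5],
    means=[0.0, 2.0],
    variances=[1.0, 1.0],
)
```

The mixture variance has two parts: the average within-component variance and the spread of the component means. The second part is what lets disagreement between closure models and scenarios show up in the error estimate.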
Assessing uncertainty in a stand growth model by Bayesian synthesis
Green, E.J.; MacFarlane, D.W.; Valentine, H.T.; Strawderman, W.E.
1999-11-01
The Bayesian synthesis method (BSYN) was used to bound the uncertainty in projections calculated with PIPESTEM, a mechanistic model of forest growth. The application furnished posterior distributions of (a) the values of the model's parameters, and (b) the values of three of the model's output variables--basal area per unit land area, average tree height, and tree density--at different points in time. Confidence or credible intervals for the output variables were obtained directly from the posterior distributions. The application also provides estimates of correlation among the parameters and output variables. BSYN, which originally was applied to a population dynamics model for bowhead whales, is generally applicable to deterministic models. Extension to two or more linked models is discussed. A simple worked example is included in an appendix.
Assessing global vegetation activity using spatio-temporal Bayesian modelling
NASA Astrophysics Data System (ADS)
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneously in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be attributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered to be relatively homogeneous in terms of large-scale climate and biomes, while still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompass the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support
A Bayesian approach to biokinetic models of internally-deposited radionuclides
NASA Astrophysics Data System (ADS)
Amer, Mamun F.
Bayesian methods were developed and applied to estimate parameters of biokinetic models of internally deposited radionuclides for the first time. Marginal posterior densities for the parameters, given the available data, were obtained and graphed. These densities contain all the information available about the parameters and fully describe their uncertainties. Two different numerical integration methods were employed to approximate the multi-dimensional integrals needed to obtain these densities and to verify our results. One numerical method was based on Gaussian quadrature. The other method was a lattice rule that was developed by Conroy. The lattice rule method is applied here for the first time in conjunction with Bayesian analysis. Computer codes were developed in Mathematica's own programming language to perform the integrals. Several biokinetic models were studied. The first model was a single power function, $a\,t^{-b}$, that was used to describe $^{226}$Ra whole body retention data for long periods of time in many patients. The posterior odds criterion for model identification was applied to select, from among some competing models, the best model to represent $^{226}$Ra retention in man. The highest model posterior was attained by the single power function. Posterior densities for the model parameters were obtained for each patient. Also, predictive densities for retention, given the available retention values and some selected times, were obtained. These predictive densities characterize the uncertainties in the unobservable retention values taking into consideration the uncertainties of other parameters in the model. The second model was a single exponential function, $\alpha e^{-\beta t}$, that was used to represent one patient's whole body retention as well as total excretion of $^{137}$Cs. Missing observations (censored data) in the two responses were replaced by unknown parameters and were handled in the same way other model parameters are treated. By applying the Bayesian
Bayesian Models for fMRI Data Analysis
Zhang, Linlin; Guindani, Michele; Vannucci, Marina
2015-01-01
Functional magnetic resonance imaging (fMRI), a noninvasive neuroimaging method that provides an indirect measure of neuronal activity by detecting blood flow changes, has experienced an explosive growth in the past years. Statistical methods play a crucial role in understanding and analyzing fMRI data. Bayesian approaches, in particular, have shown great promise in applications. A remarkable feature of fully Bayesian approaches is that they allow a flexible modeling of spatial and temporal correlations in the data. This paper provides a review of the most relevant models developed in recent years. We divide methods according to the objective of the analysis. We start from spatio-temporal models for fMRI data that detect task-related activation patterns. We then address the very important problem of estimating brain connectivity. We also touch upon methods that focus on making predictions of an individual's brain activity or a clinical or behavioral response. We conclude with a discussion of recent integrative models that aim at combining fMRI data with other imaging modalities, such as EEG/MEG and DTI data, measured on the same subjects. We also briefly discuss the emerging field of imaging genetics. PMID:25750690
Bayesian Gaussian Copula Factor Models for Mixed Data
Murray, Jared S.; Dunson, David B.; Carin, Lawrence; Lucas, Joseph E.
2013-01-01
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables or through generalized latent trait models accommodating measurements in the exponential family. However, when generalizing to non-Gaussian measured variables the latent variables typically influence both the dependence structure and the form of the marginal distributions, complicating interpretation and introducing artifacts. To address this problem we propose a novel class of Bayesian Gaussian copula factor models which decouple the latent factors from the marginal distributions. A semiparametric specification for the marginals based on the extended rank likelihood yields straightforward implementation and substantial computational gains. We provide new theoretical and empirical justifications for using this likelihood in Bayesian inference. We propose new default priors for the factor loadings and develop efficient parameter-expanded Gibbs sampling for posterior computation. The methods are evaluated through simulations and applied to a dataset in political science. The models in this paper are implemented in the R package bfa. PMID:23990691
Approximate Bayesian computation for forward modeling in cosmology
NASA Astrophysics Data System (ADS)
Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar
2015-08-01
Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, however, the likelihood function may be unavailable or intractable due to non-Gaussian errors, non-linear measurement processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations, and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
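The distance metric named in this abstract can be sketched for the simplest case of two summary statistics. The function below is a stand-alone illustration with an assumed name and a hard-coded 2x2 matrix inverse; it is not taken from the authors' public code.

```python
import math

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2-vectors x and y.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0 or a <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det, so the
    # quadratic form (x - y)^T cov^{-1} (x - y) expands to the line below.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)
```

In an ABC loop, a simulated data set would be accepted when `mahalanobis_2d(s_sim, s_obs, cov)` falls below the current tolerance, with `cov` estimated from the summaries of earlier simulations. Unlike a plain Euclidean distance, this rescales each summary by its variability, so summaries on very different scales contribute comparably.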
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study: This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
Efficient multilevel brain tumor segmentation with integrated bayesian model classification.
Corso, J J; Sharon, E; Dube, S; El-Saden, S; Sinha, U; Yuille, A
2008-05-01
We present a new method for automatic segmentation of heterogeneous image data that takes a step toward bridging the gap between bottom-up affinity-based segmentation methods and top-down generative model based approaches. The main contribution of the paper is a Bayesian formulation for incorporating soft model assignments into the calculation of affinities, which are conventionally model free. We integrate the resulting model-aware affinities into the multilevel segmentation by weighted aggregation algorithm, and apply the technique to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes. The computationally efficient method runs orders of magnitude faster than current state-of-the-art techniques giving comparable or improved results. Our quantitative results indicate the benefit of incorporating model-aware affinities into the segmentation process for the difficult case of glioblastoma multiforme brain tumor. PMID:18450536
Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
NASA Astrophysics Data System (ADS)
Meng, Jia; Zhang, Jianqiu(Michelle); Qi, Yuan(Alan); Chen, Yidong; Huang, Yufei
2010-12-01
The problem of uncovering transcriptional regulation by transcription factors (TFs) based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM) is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing databases regarding TF-regulated target genes through a sparse prior, and, with the Gibbs sampling algorithm developed here, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on simulated systems, and the results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive (ER+) status and Estrogen Receptor negative (ER-) status, respectively.
Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.
Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale
2016-08-01
Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty. PMID:27566774
Emulation: A fast stochastic Bayesian method to eliminate model space
NASA Astrophysics Data System (ADS)
Roberts, Alan; Hobbs, Richard; Goldstein, Michael
2010-05-01
Joint inversion of large 3D datasets has been the goal of geophysicists ever since the datasets first started to be produced. There are two broad approaches to this kind of problem: traditional deterministic inversion schemes and more recently developed Bayesian search methods, such as MCMC (Markov Chain Monte Carlo). However, both kinds of scheme have proved prohibitively expensive, in computing power and in time, because the model space to be searched is normally very large and the forward model simulators used to search it take considerable time to run. At the heart of strategies aimed at accomplishing this kind of inversion is the question of how to reliably and practicably reduce the size of the model space in which the inversion is to be carried out. Here we present a practical Bayesian method, known as emulation, which can address this issue. Emulation is a Bayesian technique used with considerable success in a number of technical fields, such as astronomy, where the evolution of the universe has been modelled using this technique, and the petroleum industry, where it is used for history matching of hydrocarbon reservoirs. The method of emulation involves building a fast-to-compute uncertainty-calibrated approximation to a forward model simulator. We do this by modelling the output data from a number of forward simulator runs by a computationally cheap function, and then fitting the coefficients defining this function to the model parameters. By calibrating the error of the emulator output with respect to the full simulator output, we can use this to screen out large areas of model space which contain only implausible models. For example, starting with what may be considered a geologically reasonable prior model space of 10000 models, using the emulator we can quickly show that only models which lie within 10% of that model space actually produce output data which is plausibly similar in character to an observed dataset. We can thus much
Bayesian Learning of a Language Model from Continuous Speech
NASA Astrophysics Data System (ADS)
Neubig, Graham; Mimura, Masato; Mori, Shinsuke; Kawahara, Tatsuya
We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexity (such as the size of the learned vocabulary) and expressive power, and provide a principled learning algorithm through the use of Gibbs sampling. Implementation is performed using weighted finite state transducers (WFSTs), which allow for the simple handling of lattice input. Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates. The proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
Clements, A C A; Pfeiffer, D U; Hayes, D
2005-10-12
A spatio-temporal analysis was undertaken with the aim of identifying the dynamics of herd mean individual cow SCCs (MICSCC) in seasonally calving New Zealand dairy herds. Two datasets were extracted from the Livestock Improvement Corporation's extensive national dairy recording database: (1) milk-recording data aggregated at the herd-level and (2) sales questionnaire data containing information on the size, location and infrastructure of each farm. A Bayesian spatio-temporal modelling approach was applied to the analysis. The data were aggregated by 10 km² grid cells and linear regression models were developed with spatially structured and unstructured random effects, a linear temporal trend random effect and spatial-temporal interactions for log-transformed median MISCC (ln(median MISCC)). Significant associations were found between ln(median MISCC) and milk yield, milk fat, milk protein, farm area and number of cups in the dairy. This led us to suggest that SCCs should be adjusted for volume and constituents prior to determining a threshold MISCC for identification of subclinical mastitis (SCM) problem herds. Part, or all, of the temporal trend in MISCC in the spatio-temporal model was accounted for by inclusion of yield and milk constituents as independent variables. This supports the hypothesis of a dilution effect with potential consequences for misdiagnosis of SCM, particularly in late lactation. Unmeasured covariates were similarly likely to be spatially structured and unstructured. PMID:16107283
Collective opinion formation model under Bayesian updating and confirmation bias.
Nishi, Ryosuke; Masuda, Naoki
2013-06-01
We propose a collective opinion formation model with a so-called confirmation bias. The confirmation bias is a psychological effect with which, in the context of opinion formation, an individual in favor of an opinion is prone to misperceive new incoming information as supporting the current belief of the individual. Our model modifies a Bayesian decision-making model for single individuals [M. Rabin and J. L. Schrag, Q. J. Econ. 114, 37 (1999)] for the case of a well-mixed population of interacting individuals in the absence of the external input. We numerically simulate the model to show that all the agents eventually agree on one of the two opinions only when the confirmation bias is weak. Otherwise, the stochastic population dynamics ends up creating a disagreement configuration (also called polarization), particularly for large system sizes. A strong confirmation bias allows various final disagreement configurations with different fractions of the individuals in favor of the opposite opinions. PMID:23848643
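The confirmation bias described above can be illustrated for a single agent. The following is a minimal sketch in the spirit of the Rabin and Schrag setup, not the population model of this paper: the function name `simulate_agent`, the signal accuracy `theta`, the misperception probability `q`, and the equal-prior assumption are all illustrative choices that do not come from the abstract.

```python
import random


def simulate_agent(n_signals, theta, q, true_state="A", seed=0):
    """Single agent with confirmation bias (illustrative assumptions only).

    theta: probability that a raw signal matches the true state.
    q:     probability that a signal conflicting with the agent's current
           leaning is misperceived as supporting it.
    Returns the perceived signals and the agent's posterior P(state = A),
    computed as if every perceived signal were genuine.
    """
    rng = random.Random(seed)
    other = "B" if true_state == "A" else "A"
    perceived = []
    lead = 0  # (# perceived A) - (# perceived B)
    for _ in range(n_signals):
        signal = true_state if rng.random() < theta else other
        favored = "A" if lead > 0 else "B" if lead < 0 else None
        # A conflicting signal is flipped with probability q.
        if favored is not None and signal != favored and rng.random() < q:
            signal = favored
        perceived.append(signal)
        lead += 1 if signal == "A" else -1
    # Bayes' rule with equal priors and i.i.d. signals of accuracy theta.
    odds = (theta / (1 - theta)) ** lead
    return perceived, odds / (1 + odds)
```

With `q = 1` the agent locks in on whichever signal arrives first, which is the extreme case of the strong-bias regime; with `q = 0` the update reduces to ordinary Bayesian counting of signals.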
A kinematic model for Bayesian tracking of cyclic human motion
NASA Astrophysics Data System (ADS)
Greif, Thomas; Lienhart, Rainer
2010-01-01
We introduce a two-dimensional kinematic model for cyclic motions of humans, which is suitable for the use as temporal prior in any Bayesian tracking framework. This human motion model is solely based on simple kinematic properties: the joint accelerations. Distributions of joint accelerations subject to the cycle progress are learned from training data. We present results obtained by applying the introduced model to the cyclic motion of backstroke swimming in a Kalman filter framework that represents the posterior distribution by a Gaussian. We experimentally evaluate the sensitivity of the motion model with respect to the frequency and noise level of assumed appearance-based pose measurements by simulating various fidelities of the pose measurements using ground truth data.
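To make the role of an acceleration-based temporal prior in a Kalman filter concrete, here is a minimal sketch for a single joint angle. It assumes a two-component state (angle, angular velocity), a learned mean acceleration `accel` with variance `accel_var` supplied for the current cycle phase, and a noisy measurement of the angle only. These names and the single-joint simplification are assumptions for illustration, not the authors' implementation.

```python
def predict(state, cov, accel, dt, accel_var):
    """Propagate (angle, velocity) one step using an assumed acceleration."""
    theta, omega = state
    (p00, p01), (p10, p11) = cov
    new_state = (theta + omega * dt + 0.5 * accel * dt**2, omega + accel * dt)
    # Process noise induced by uncertainty in the acceleration.
    q00 = accel_var * dt**4 / 4
    q01 = accel_var * dt**3 / 2
    q11 = accel_var * dt**2
    # F P F^T + Q with F = [[1, dt], [0, 1]], written out by hand.
    new_cov = (
        (p00 + dt * (p10 + p01) + dt**2 * p11 + q00, p01 + dt * p11 + q01),
        (p10 + dt * p11 + q01, p11 + q11),
    )
    return new_state, new_cov


def update(state, cov, z, meas_var):
    """Condition the Gaussian belief on a noisy angle measurement z."""
    theta, omega = state
    (p00, p01), (p10, p11) = cov
    s = p00 + meas_var           # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain for H = [1, 0]
    resid = z - theta
    new_state = (theta + k0 * resid, omega + k1 * resid)
    new_cov = (
        ((1 - k0) * p00, (1 - k0) * p01),
        (p10 - k1 * p00, p11 - k1 * p01),
    )
    return new_state, new_cov
```

Calling `predict` and then `update` once per frame keeps the posterior Gaussian, which is the representation the abstract refers to; raising `meas_var` mimics lower-fidelity pose measurements.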
Bayesian Dose-Response Modeling in Sparse Data
NASA Astrophysics Data System (ADS)
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, we discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
NASA Astrophysics Data System (ADS)
Mendes, B. S.; Draper, D.
2008-12-01
The issue of model uncertainty and model choice is central in any groundwater modeling effort [Neuman and Wierenga, 2003]; among the several approaches to the problem we favour using Bayesian statistics because it is a method that integrates in a natural way uncertainties (arising from any source) and experimental data. In this work, we experiment with several Bayesian approaches to model choice, focusing primarily on demonstrating the usefulness of the Reversible Jump Markov Chain Monte Carlo (RJMCMC) simulation method [Green, 1995]; this is an extension of the now-common MCMC methods. Standard MCMC techniques approximate posterior distributions for quantities of interest, often by creating a random walk in parameter space; RJMCMC allows the random walk to take place between parameter spaces with different dimensionalities. This fact allows us to explore state spaces that are associated with different deterministic models for experimental data. Our work is exploratory in nature; we restrict our study to comparing two simple transport models applied to a data set gathered to estimate the breakthrough curve for a tracer compound in groundwater. One model has a mean surface based on a simple advection dispersion differential equation; the second model's mean surface is also governed by a differential equation but in two dimensions. We focus on artificial data sets (in which truth is known) to see if model identification is done correctly, but we also address the issues of over- and under-parameterization, and we compare RJMCMC's performance with other traditional methods for model selection and propagation of model uncertainty, including Bayesian model averaging, BIC and DIC. References: Neuman and Wierenga (2003). A Comprehensive Strategy of Hydrogeologic Modeling and Uncertainty Analysis for Nuclear Facilities and Sites. NUREG/CR-6805, Division of Systems Analysis and Regulatory Effectiveness Office of Nuclear Regulatory Research, U. S. Nuclear Regulatory Commission
Parameter Estimation and Parameterization Uncertainty Using Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Tsai, F. T.; Li, X.
2007-12-01
This study proposes Bayesian model averaging (BMA) to address parameter estimation uncertainty arising from non-uniqueness in parameterization methods. BMA provides a means of incorporating multiple parameterization methods for prediction through the law of total probability, with which an ensemble average of hydraulic conductivity distribution is obtained. Estimation uncertainty is described by the BMA variances, which contain variances within and between parameterization methods. BMA shows that considering more parameterization methods tends to increase estimation uncertainty and that estimation uncertainty is always underestimated when a single parameterization method is used. Two major problems in applying BMA to hydraulic conductivity estimation using a groundwater inverse method will be discussed in the study. The first problem is the use of posterior probabilities in BMA, which tends to single out one best method and discard other good methods. This problem arises from Occam's window, which only accepts models in a very narrow range. We propose a variance window to replace Occam's window to cope with this problem. The second problem is the use of the Kashyap information criterion (KIC), which makes BMA tend to prefer highly uncertain parameterization methods because it considers the Fisher information matrix. We found that the Bayesian information criterion (BIC) is a good approximation to KIC and is able to avoid controversial results. We applied BMA to hydraulic conductivity estimation in the 1,500-foot sand aquifer in East Baton Rouge Parish, Louisiana.
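The decomposition of the BMA variance into within-method and between-method parts follows from the law of total variance and can be sketched in a few lines. The function `bma_combine` and its inputs are hypothetical; in particular, the weights are taken here from supplied log evidences under equal prior model probabilities, whereas the study derives them from KIC or BIC together with a variance window.

```python
import math


def bma_combine(means, variances, log_evidences):
    """Combine per-method estimates of one quantity by model averaging.

    means, variances: estimate and estimation variance from each method.
    log_evidences:    log of each method's (approximate) marginal likelihood.
    Returns (BMA mean, within-method variance, between-method variance, weights).
    """
    top = max(log_evidences)  # subtract the maximum for numerical stability
    raw = [math.exp(le - top) for le in log_evidences]
    total = sum(raw)
    weights = [r / total for r in raw]
    mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within, between, weights
```

The total BMA variance is `within + between`. With a single method the between-method term is zero by construction, which is one way to read the statement that a single parameterization method understates estimation uncertainty.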
Bayesian predictive modeling for genomic based personalized treatment selection.
Ma, Junsheng; Stingo, Francesco C; Hobbs, Brian P
2016-06-01
Efforts to personalize medicine in oncology have been limited by reductive characterizations of the intrinsically complex underlying biological phenomena. Future advances in personalized medicine will rely on molecular signatures that derive from synthesis of multifarious interdependent molecular quantities requiring robust quantitative methods. However, highly parameterized statistical models when applied in these settings often require a prohibitively large database and are sensitive to proper characterizations of the treatment-by-covariate interactions, which in practice are difficult to specify and may be limited by generalized linear models. In this article, we present a Bayesian predictive framework that enables the integration of a high-dimensional set of genomic features with clinical responses and treatment histories of historical patients, providing a probabilistic basis for using the clinical and molecular information to personalize therapy for future patients. Our work represents one of the first attempts to define personalized treatment assignment rules based on large-scale genomic data. We use actual gene expression data acquired from The Cancer Genome Atlas in the settings of leukemia and glioma to explore the statistical properties of our proposed Bayesian approach for personalizing treatment selection. The method is shown to yield considerable improvements in predictive accuracy when compared to penalized regression approaches. PMID:26575856
Advanced REACH Tool: a Bayesian model for occupational exposure assessment.
McNally, Kevin; Warren, Nicholas; Fransman, Wouter; Entink, Rinke Klein; Schinkel, Jody; van Tongeren, Martie; Cherrie, John W; Kromhout, Hans; Schneider, Thomas; Tielemans, Erik
2014-06-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate sources of information within a Bayesian statistical framework. The information is obtained from expert knowledge expressed in a calibrated mechanistic model of exposure assessment, data on inter- and intra-individual variability in exposures from the literature, and context-specific exposure measurements. The ART provides central estimates and credible intervals for different percentiles of the exposure distribution, for full-shift and long-term average exposures. The ART can produce exposure estimates in the absence of measurements, but the precision of the estimates improves as more data become available. The methodology presented in this paper is able to utilize partially analogous data, a novel approach designed to make efficient use of a sparsely populated measurement database although some additional research is still required before practical implementation. The methodology is demonstrated using two worked examples: an exposure to copper pyrithione in the spraying of antifouling paints and an exposure to ethyl acetate in shoe repair. PMID:24665110
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Modeling the Climatology of Tornado Occurrence with Bayesian Inference
NASA Astrophysics Data System (ADS)
Cheng, Vincent Y. S.
Our mechanistic understanding of tornadic environments has significantly improved by the recent technological enhancements in the detection of tornadoes as well as the advances of numerical weather predictive modeling. Nonetheless, despite the decades of active research, prediction of tornado occurrence remains one of the most difficult problems in meteorological and climate science. In our efforts to develop predictive tools for tornado occurrence, there are a number of issues to overcome, such as the treatment of inconsistent tornado records, the consideration of suitable combination of atmospheric predictors, and the selection of appropriate resolution to accommodate the variability in time and space. In this dissertation, I address each of these topics by undertaking three empirical (statistical) modeling studies, where I examine the signature of different atmospheric factors influencing the tornado occurrence, the sampling biases in tornado observations, and the optimal spatiotemporal resolution for studying tornado occurrence. In the first study, I develop a novel Bayesian statistical framework to assess the probability of tornado occurrence in Canada, in which the sampling bias of tornado observations and the linkage between lightning climatology and tornadogenesis are considered. The results produced reasonable probability estimates of tornado occurrence for the under-sampled areas in the model domain. The same study also delineated the geographical variability in the lightning-tornado relationship across Canada. In the second study, I present a novel modeling framework to examine the relative importance of several key atmospheric variables (e.g., convective available potential energy, 0-3 km storm-relative helicity, 0-6 km bulk wind difference, 0-tropopause vertical wind shear) on tornado activity in North America. I found that the variable quantifying the updraft strength is more important during the warm season, whereas the effects of wind
Scheuerell, Mark D; Buhle, Eric R; Semmens, Brice X; Ford, Michael J; Cooney, Tom; Carmichael, Richard W
2015-01-01
Myriad human activities increasingly threaten the existence of many species. A variety of conservation interventions such as habitat restoration, protected areas, and captive breeding have been used to prevent extinctions. Evaluating the effectiveness of these interventions requires appropriate statistical methods, given the quantity and quality of available data. Historically, analysis of variance has been used with some form of predetermined before-after control-impact design to estimate the effects of large-scale experiments or conservation interventions. However, ad hoc retrospective study designs or the presence of random effects at multiple scales may preclude the use of these tools. We evaluated the effects of a large-scale supplementation program on the density of adult Chinook salmon Oncorhynchus tshawytscha from the Snake River basin in the northwestern United States currently listed under the U.S. Endangered Species Act. We analyzed 43 years of data from 22 populations, accounting for random effects across time and space using a form of Bayesian hierarchical time-series model common in analyses of financial markets. We found that varying degrees of supplementation over a period of 25 years increased the density of natural-origin adults, on average, by 0–8% relative to nonsupplementation years. Thirty-nine of the 43 year effects were at least two times larger in magnitude than the mean supplementation effect, suggesting common environmental variables play a more important role in driving interannual variability in adult density. Additional residual variation in density varied considerably across the region, but there was no systematic difference between supplemented and reference populations. Our results demonstrate the power of hierarchical Bayesian models to detect the diffuse effects of management interventions and to quantitatively describe the variability of intervention success. Nevertheless, our study could not address whether ecological
Performance and Prediction: Bayesian Modelling of Fallible Choice in Chess
NASA Astrophysics Data System (ADS)
Haworth, Guy; Regan, Ken; di Fatta, Giuseppe
Evaluating agents in decision-making applications requires assessing their skill and predicting their behaviour. Both are well developed in Poker-like situations, but less so in more complex game and model domains. This paper addresses both tasks by using Bayesian inference in a benchmark space of reference agents. The concepts are explained and demonstrated using the game of chess but the model applies generically to any domain with quantifiable options and fallible choice. Demonstration applications address questions frequently asked by the chess community regarding the stability of the rating scale, the comparison of players of different eras and/or leagues, and controversial incidents possibly involving fraud. The last include alleged under-performance, fabrication of tournament results, and clandestine use of computer advice during competition. Beyond the model world of games, the aim is to improve fallible human performance in complex, high-value tasks.
Development of a Bayesian Belief Network Runway Incursion Model
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2014-01-01
In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planning, due to their unique analytical capabilities. The DBN model's accuracy and flexibility of use are demonstrated by testing the model under different operational scenarios.
Model for Aggregated Water Heater Load Using Dynamic Bayesian Networks
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai; Kalsi, Karanjit
2012-07-19
The transition to the new generation power grid, or “smart grid”, requires novel ways of using and analyzing data collected from the grid infrastructure. Fundamental functionalities like demand response (DR), that the smart grid needs, rely heavily on the ability of the energy providers and distributors to forecast the load behavior of appliances under different DR strategies. This paper presents a new model of aggregated water heater load, based on dynamic Bayesian networks (DBNs). The model has been validated against simulated data from an open source distribution simulation software (GridLAB-D). The results presented in this paper demonstrate that the DBN model accurately tracks the load profile curves of aggregated water heaters under different testing scenarios.
Advances in Bayesian Model Based Clustering Using Particle Learning
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning, as it is being called, is especially well suited for the analysis of streaming data due to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
NASA Technical Reports Server (NTRS)
Williford, W. O.; Hsieh, P.; Carter, M. C.
1974-01-01
A Bayesian analysis of the two discrete probability models, the negative binomial and the modified negative binomial distributions, which have been used to describe thunderstorm activity at Cape Kennedy, Florida, is presented. The Bayesian approach with beta prior distributions is compared to the classical approach which uses a moment method of estimation or a maximum-likelihood method. The accuracy and simplicity of the Bayesian method is demonstrated.
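The appeal of a beta prior in this setting is conjugacy: under the common parameterization where each count x has probability C(x + r - 1, x) p^r (1 - p)^x with known r, a Beta(a, b) prior on p yields a Beta(a + n r, b + sum of x) posterior after n observations. The sketch below assumes that parameterization and a known r; the abstract does not state which form the report uses, and the function names are illustrative.

```python
def beta_posterior_negbin(a, b, r, counts):
    """Posterior Beta parameters for p under a negative binomial likelihood.

    Assumes P(X = x) is proportional to p**r * (1 - p)**x with r known,
    and a Beta(a, b) prior on p. Each observation contributes r to the
    first parameter and its count x to the second.
    """
    n = len(counts)
    return a + n * r, b + sum(counts)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, usable as a point estimate of p."""
    return a / (a + b)
```

For example, a uniform Beta(1, 1) prior with r = 2 and observed counts 3, 0 and 5 gives a Beta(7, 9) posterior, so no numerical optimization is needed, in contrast to a maximum-likelihood fit.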
Yu, Rongjie; Abdel-Aty, Mohamed
2014-01-01
Severe crashes cause serious social and economic loss, and because of this, reducing crash injury severity has become one of the key objectives in the management of high speed facilities (freeways and expressways). Traditional crash injury severity analysis utilized data mainly from crash reports concerning the crash occurrence information, drivers' characteristics and roadway geometric related variables. In this study, real-time traffic and weather data were introduced to analyze the crash injury severity. The space mean speeds captured by the Automatic Vehicle Identification (AVI) system on the two roadways were used as explanatory variables in this study, and data from a mountainous freeway (I-70 in Colorado) and an urban expressway (State Road 408 in Orlando) were used to check the consistency of the analysis results. Binary probit (BP) models were estimated to classify the non-severe (property damage only) crashes and severe (injury and fatality) crashes. Firstly, Bayesian BP models' results were compared to the results from Maximum Likelihood Estimation BP models and it was concluded that Bayesian inference was superior with more significant variables. Then different levels of hierarchical Bayesian BP models were developed with random effects accounting for the unobserved heterogeneity at segment level and crash individual level, respectively. Modeling results from both studied locations demonstrate that large variations of speed prior to the crash occurrence would increase the likelihood of severe crash occurrence. Moreover, when unobserved heterogeneity was considered in the Bayesian BP models, the model goodness-of-fit improved substantially. Finally, possible future applications of the model results and the hierarchical Bayesian probit models were discussed. PMID:24172082
Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management
A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...
Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837
ERIC Educational Resources Information Center
Levy, Roy
2014-01-01
Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Integrated Bayesian network framework for modeling complex ecological issues.
Johnson, Sandra; Mengersen, Kerrie
2012-07-01
The management of environmental problems is multifaceted, requiring varied and sometimes conflicting objectives and perspectives to be considered. Bayesian network (BN) modeling facilitates the integration of information from diverse sources and is well suited to tackling the management challenges of complex environmental problems. However, combining several perspectives in one model can lead to large, unwieldy BNs that are difficult to maintain and understand. Conversely, an oversimplified model may lead to an unrealistic representation of the environmental problem. Environmental managers require the current research and available knowledge about an environmental problem of interest to be consolidated in a meaningful way, thereby enabling the assessment of potential impacts and different courses of action. Previous investigations of the environmental problem of interest may have already resulted in the construction of several disparate ecological models. On the other hand, the opportunity may exist to initiate this modeling. In the first instance, the challenge is to integrate existing models and to merge the information and perspectives from these models. In the second instance, the challenge is to include different aspects of the environmental problem incorporating both the scientific and management requirements. Although the paths leading to the combined model may differ for these 2 situations, the common objective is to design an integrated model that captures the available information and research, yet is simple to maintain, expand, and refine. BN modeling is typically an iterative process, and we describe a heuristic method, the iterative Bayesian network development cycle (IBNDC), for the development of integrated BN models that are suitable for both situations outlined above. The IBNDC approach facilitates object-oriented BN (OOBN) modeling, arguably viewed as the next logical step in adaptive management modeling, and that embraces iterative development
Determinants of Low Birth Weight in Malawi: Bayesian Geo-Additive Modelling.
Ngwira, Alfred; Stanley, Christopher C
2015-01-01
Studies on factors of low birth weight in Malawi have neglected the flexible approach of using smooth functions for some covariates in models. Such a flexible approach reveals the detailed relationship of covariates with the response. The study aimed at investigating risk factors of low birth weight in Malawi by assuming a flexible approach for continuous covariates and a geographical random effect. A Bayesian geo-additive model for birth weight in kilograms and size of the child at birth (less than average or average and higher), with district as a spatial effect, was adopted using the 2010 Malawi demographic and health survey data. A Gaussian model for birth weight in kilograms and a binary logistic model for the binary outcome (size of child at birth) were fitted. Continuous covariates were modelled by penalized splines (P-splines) and spatial effects were smoothed by a two-dimensional P-spline. The study found that child birth order, mother's weight and mother's height are significant predictors of birth weight. Secondary education for the mother, birth order categories 2-3 and 4-5, a wealth index of richer family and mother's height were significant predictors of child size at birth. The area associated with low birth weight was Chitipa, and areas with increased risk of less than average size at birth were Chitipa and Mchinji. The study found support for the flexible modelling of some covariates that clearly have nonlinear influences. Nevertheless, there is no strong support for inclusion of geographical spatial analysis. The spatial patterns, though, point to the influence of omitted variables with some spatial structure or possibly epidemiological processes that account for this spatial structure, and the maps generated could be used for targeting development efforts at a glance. PMID:26114866
Inversion of hierarchical Bayesian models using Gaussian processes.
Lomakina, Ekaterina I; Paliwal, Saee; Diaconescu, Andreea O; Brodersen, Kay H; Aponte, Eduardo A; Buhmann, Joachim M; Stephan, Klaas E
2015-09-01
Over the past decade, computational approaches to neuroimaging have increasingly made use of hierarchical Bayesian models (HBMs), either for inferring on physiological mechanisms underlying fMRI data (e.g., dynamic causal modelling, DCM) or for deriving computational trajectories (from behavioural data) which serve as regressors in general linear models. However, an unresolved problem is that standard methods for inverting the hierarchical Bayesian model are either very slow, e.g. Markov Chain Monte Carlo Methods (MCMC), or are vulnerable to local minima in non-convex optimisation problems, such as variational Bayes (VB). This article considers Gaussian process optimisation (GPO) as an alternative approach for global optimisation of sufficiently smooth and efficiently evaluable objective functions. GPO avoids being trapped in local extrema and can be computationally much more efficient than MCMC. Here, we examine the benefits of GPO for inverting HBMs commonly used in neuroimaging, including DCM for fMRI and the Hierarchical Gaussian Filter (HGF). Importantly, to achieve computational efficiency despite high-dimensional optimisation problems, we introduce a novel combination of GPO and local gradient-based search methods. The utility of this GPO implementation for DCM and HGF is evaluated against MCMC and VB, using both synthetic data from simulations and empirical data. Our results demonstrate that GPO provides parameter estimates with equivalent or better accuracy than the other techniques, but at a fraction of the computational cost required for MCMC. We anticipate that GPO will prove useful for robust and efficient inversion of high-dimensional and nonlinear models of neuroimaging data. PMID:26048619
Bayesian geostatistical modeling of Malaria Indicator Survey data in Angola.
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006-2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under 5, we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person were 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
A Bayesian Measurement Error Model for Misaligned Radiographic Data
Lennox, Kristin P.; Glascoe, Lee G.
2013-09-06
An understanding of the inherent variability in micro-computed tomography (micro-CT) data is essential to tasks such as statistical process control and the validation of radiographic simulation tools. The data present unique challenges to variability analysis due to the relatively low resolution of radiographs, and also due to minor variations from run to run which can result in misalignment or magnification changes between repeated measurements of a sample. Positioning changes artificially inflate the variability of the data in ways that mask true physical phenomena. We present a novel Bayesian nonparametric regression model that incorporates both additive and multiplicative measurement error in addition to heteroscedasticity to address this problem. We also use this model to assess the effects of sample thickness and sample position on measurement variability for an aluminum specimen. Supplementary materials for this article are available online.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran.
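As a rough illustration of the Hybrid (Hamiltonian) Monte Carlo update mentioned above, the following is a minimal CPU-only sketch in Python. It targets a one-dimensional standard normal density rather than the RSV model, and it is not the CUDA Fortran or OpenACC code from the paper; the function names, step size, and number of leapfrog steps are all assumptions chosen for the example.

```python
import math
import random


def neg_log_density(x):
    """Potential energy U(x) for a standard normal target (up to a constant)."""
    return 0.5 * x * x


def grad_neg_log_density(x):
    """Gradient of U(x)."""
    return x


def hmc_step(x, step_size, n_leapfrog, rng):
    """One HMC update: sample a momentum, integrate with leapfrog, accept or reject."""
    p = rng.gauss(0.0, 1.0)
    x_new, p_new = x, p

    # Leapfrog integration: half momentum step, alternating full steps, half momentum step.
    p_new -= 0.5 * step_size * grad_neg_log_density(x_new)
    for i in range(n_leapfrog):
        x_new += step_size * p_new
        if i < n_leapfrog - 1:
            p_new -= step_size * grad_neg_log_density(x_new)
    p_new -= 0.5 * step_size * grad_neg_log_density(x_new)

    # Metropolis test on the change in total energy H = U + K.
    h_old = neg_log_density(x) + 0.5 * p * p
    h_new = neg_log_density(x_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return x_new
    return x


rng = random.Random(1)
x = 0.0
samples = []
for _ in range(5000):
    x = hmc_step(x, step_size=0.2, n_leapfrog=10, rng=rng)
    samples.append(x)

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

In a high-dimensional model the gradient and position updates inside the leapfrog loop act on every component of the state, which is presumably the part that lends itself to the GPU parallelization the abstract describes.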
Modelling categorical covariates in Bayesian disease mapping by partition structures.
Giudici, P; Knorr-Held, L; Rasser, G
We consider the problem of mapping the risk from a disease using a series of regional counts of observed and expected cases, and information on potential risk factors. To analyse this problem from a Bayesian viewpoint, we propose a methodology which extends a spatial partition model by including categorical covariate information. Such an extension allows detection of clusters in the residual variation, reflecting further, possibly unobserved, covariates. The methodology is implemented by means of reversible jump Markov chain Monte Carlo sampling. An application is presented in order to illustrate and compare our proposed extensions with a purely spatial partition model. Here we analyse a well-known data set on lip cancer incidence in Scotland. PMID:10960873
Cooper, Richard J; Krueger, Tobias; Hiscock, Kevin M; Rawlins, Barry G
2014-01-01
Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ∼76%), comparison of apportionment estimates reveals varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations. Key Points: (1) An OFAT sensitivity analysis of sediment fingerprinting mixing models is conducted. (2) Bayesian models display high sensitivity to error assumptions and structural choices. (3) Source apportionment results differ between Bayesian and frequentist approaches. PMID
Bayesian model-averaged benchmark dose analysis via reparameterized quantal-response models.
Fang, Q; Piegorsch, W W; Simmons, S J; Li, X; Chen, C; Wang, Y
2015-12-01
An important objective in biomedical and environmental risk assessment is estimation of minimum exposure levels that induce a pre-specified adverse response in a target population. The exposure points in such settings are typically referred to as benchmark doses (BMDs). Parametric Bayesian estimation for finding BMDs has grown in popularity, and a large variety of candidate dose-response models is available for applying these methods. Each model can possess potentially different parametric interpretation(s), however. We present reparameterized dose-response models that allow for explicit use of prior information on the target parameter of interest, the BMD. We also enhance our Bayesian estimation technique for BMD analysis by applying Bayesian model averaging to produce point estimates and (lower) credible bounds, overcoming associated questions of model adequacy when multimodel uncertainty is present. An example from carcinogenicity testing illustrates the calculations. PMID:26102570
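The model-averaging step can be summarized with the standard Bayesian model averaging identities below. This is the generic form, not necessarily the exact formulation used in the paper; the symbols (candidate dose-response models M_1, ..., M_K and data D) are notation assumed here.

```latex
\begin{align}
  p(\mathrm{BMD} \mid D) &= \sum_{k=1}^{K} p(\mathrm{BMD} \mid M_k, D)\, p(M_k \mid D), \\
  p(M_k \mid D) &= \frac{p(D \mid M_k)\, p(M_k)}{\sum_{l=1}^{K} p(D \mid M_l)\, p(M_l)}.
\end{align}
```

Under this form, a lower credible bound on the BMD would be a lower quantile of the averaged posterior in the first line, so that each model contributes in proportion to its posterior probability.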
Diagnosing Hybrid Systems: a Bayesian Model Selection Approach
NASA Technical Reports Server (NTRS)
McIlraith, Sheila A.
2005-01-01
In this paper we examine the problem of monitoring and diagnosing noisy complex dynamical systems that are modeled as hybrid systems: models of continuous behavior, interleaved by discrete transitions. In particular, we examine continuous systems with embedded supervisory controllers that experience abrupt, partial or full failure of component devices. Building on our previous work in this area (MBCG99; MBCG00), our specific focus in this paper is on the mathematical formulation of the hybrid monitoring and diagnosis task as a Bayesian model tracking algorithm. The nonlinear dynamics of many hybrid systems present challenges to probabilistic tracking. Further, probabilistic tracking of a system for the purposes of diagnosis is problematic because the models of the system corresponding to failure modes are numerous and generally very unlikely. To focus tracking on these unlikely models and to reduce the number of potential models under consideration, we exploit logic-based techniques for qualitative model-based diagnosis to conjecture a limited initial set of consistent candidate models. In this paper we discuss alternative tracking techniques that are relevant to different classes of hybrid systems, focusing specifically on a method for tracking multiple models of nonlinear behavior simultaneously using factored sampling and conditional density propagation. To illustrate and motivate the approach described in this paper we examine the problem of monitoring and diagnosing NASA's Sprint AERCam, a small spherical robotic camera unit with 12 thrusters that enable both linear and rotational motion.
Bayesian network models for error detection in radiotherapy plans
NASA Astrophysics Data System (ADS)
Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
Bayesian network models for error detection in radiotherapy plans.
Kalet, Alan M; Gennari, John H; Ford, Eric C; Phillips, Mark H
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network's conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures. PMID:25768885
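The flagging idea in the abstract (a plan parameter with low probability given the clinical context is marked for review) can be sketched with a single conditional probability table. This is a deliberately tiny, hypothetical example: the variable names, categories, probabilities, and threshold are invented for illustration and are not taken from the study, which learned its tables with the Hugin Expert software over a much larger network.

```python
# Hypothetical conditional probability table P(fractionation | treatment site).
# All values are made up for illustration only.
CPT = {
    "lung": {"low": 0.10, "standard": 0.85, "high": 0.05},
    "brain": {"low": 0.30, "standard": 0.68, "high": 0.02},
}


def flag_plan(site, fractionation, threshold=0.05):
    """Return (flagged, probability) for one plan parameter given the site.

    A parameter is flagged when its conditional probability is strictly
    below the threshold; unseen categories get probability 0.0.
    """
    prob = CPT[site].get(fractionation, 0.0)
    return prob < threshold, prob


print(flag_plan("lung", "standard"))
print(flag_plan("brain", "high"))
```

A real network would propagate evidence through many linked variables rather than look up one table, but the decision rule at the end has the same shape: compare a conditional probability against a review threshold.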
A Bayesian Attractor Model for Perceptual Decision Making
Bitzer, Sebastian; Bruineberg, Jelle; Kiebel, Stefan J.
2015-01-01
Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks. PMID:26267143
Bayesian modeling of ChIP-chip data using latent variables
2009-01-01
Background The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. Results In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment) effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. Conclusion The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results indicate that the Bayesian
A Bayesian modelling framework for tornado occurrences in North America
NASA Astrophysics Data System (ADS)
Cheng, Vincent Y. S.; Arhonditsis, George B.; Sills, David M. L.; Gough, William A.; Auld, Heather
2015-03-01
Tornadoes represent one of nature’s most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.
A Bayesian modelling framework for tornado occurrences in North America.
Cheng, Vincent Y S; Arhonditsis, George B; Sills, David M L; Gough, William A; Auld, Heather
2015-01-01
Tornadoes represent one of nature's most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year. PMID:25807465
Bayesian Model Selection in 'Big Data' Spectral Analysis
NASA Astrophysics Data System (ADS)
Fischer, Travis C.; Crenshaw, D. Michael; Baron, Fabien; Kloppenborg, Brian K.; Pope, Crystal L.
2015-01-01
As IFU observations and large spectral surveys continue to become more prevalent, the handling of thousands of spectra has become commonplace. Astronomers look at objects with increasingly complex emission-line structures, so establishing a method that will easily allow for multiple-component analysis of these features in an automated fashion would be of great use to the community. We present a new application of Bayesian model selection, a technique already used in exoplanet detection and interferometric image reconstruction, to 'big data' spectral analysis. With this technique, the fitting of multiple emission-line components in an automated fashion while simultaneously determining the correct number of components in each spectrum streamlines the line measurements for a large number of spectra into a single process.
A unified Bayesian hierarchical model for MRI tissue classification.
Feng, Dai; Liang, Dong; Tierney, Luke
2014-04-15
Various works have used magnetic resonance imaging (MRI) tissue classification extensively to study a number of neurological and psychiatric disorders. Various noise characteristics and other artifacts make this classification a challenging task. Instead of splitting the procedure into different steps, we extend a previous work to develop a unified Bayesian hierarchical model, which addresses both the partial volume effect and intensity non-uniformity, the two major acquisition artifacts, simultaneously. We adopted a normal mixture model with the means and variances depending on the tissue types of voxels to model the observed intensity values. We modeled the relationship among the components of the index vector of tissue types by a hidden Markov model, which captures the spatial similarity of voxels. Furthermore, we addressed the partial volume effect by construction of a higher resolution image in which each voxel is divided into subvoxels. Finally, we achieved the bias field correction by using a Gaussian Markov random field model with a band precision matrix designed in light of image filtering. Sparse matrix methods and parallel computations based on conditional independence are exploited to improve the speed of the Markov chain Monte Carlo simulation. The unified model provides more accurate tissue classification results for both simulated and real data sets. PMID:24738112
Bridging groundwater models and decision support with a Bayesian network
Fienen, Michael N.; Masterson, John P.; Plant, Nathaniel G.; Gutierrez, Benjamin T.; Thieler, E. Robert
2013-01-01
Resource managers need to make decisions to plan for future environmental conditions, particularly sea level rise, in the face of substantial uncertainty. Many interacting processes factor into the decisions they face. Advances in process models and the quantification of uncertainty have made models a valuable tool for this purpose. Long simulation runtimes and, often, numerical instability make linking process models impractical in many cases. A method for emulating the important connections between model input and forecasts, while propagating uncertainty, has the potential to provide a bridge between complicated numerical process models and the efficiency and stability needed for decision making. We explore this using a Bayesian network (BN) to emulate a groundwater flow model. We expand on previous approaches to validating a BN by calculating forecasting skill using cross validation of a groundwater model of Assateague Island in Virginia and Maryland, USA. This BN emulation was shown to capture the important groundwater-flow characteristics and uncertainty of the groundwater system because of its connection to island morphology and sea level. Forecast power metrics associated with the validation of multiple alternative BN designs guided the selection of an optimal level of BN complexity. Assateague Island is an ideal test case for exploring a forecasting tool based on current conditions because the unique hydrogeomorphological variability of the island includes a range of settings indicative of past, current, and future conditions. The resulting BN is a valuable tool for exploring the response of groundwater conditions to sea level rise in decision support.
Bayesian analysis of input uncertainty in hydrological modeling: 2. Application
NASA Astrophysics Data System (ADS)
Kavetski, Dmitri; Kuczera, George; Franks, Stewart W.
2006-03-01
The Bayesian total error analysis (BATEA) methodology directly addresses both input and output errors in hydrological modeling, requiring the modeler to make explicit, rather than implicit, assumptions about the likely extent of data uncertainty. This study considers a BATEA assessment of two North American catchments: (1) French Broad River and (2) Potomac basins. It assesses the performance of the conceptual Variable Infiltration Capacity (VIC) model with and without accounting for input (precipitation) uncertainty. The results show the considerable effects of precipitation errors on the predicted hydrographs (especially the prediction limits) and on the calibrated parameters. In addition, the performance of BATEA in the presence of severe model errors is analyzed. While BATEA allows a very direct treatment of input uncertainty and yields some limited insight into model errors, it requires the specification of valid error models, which are currently poorly understood and require further work. Moreover, it leads to computationally challenging highly dimensional problems. For some types of models, including the VIC implemented using robust numerical methods, the computational cost of BATEA can be reduced using Newton-type methods.
Toward diagnostic model calibration and evaluation: Approximate Bayesian computation
NASA Astrophysics Data System (ADS)
Vrugt, Jasper A.; Sadegh, Mojtaba
2013-07-01
The ever increasing pace of computational power, along with continued advances in measurement technologies and improvements in process understanding, has stimulated the development of increasingly complex hydrologic models that simulate soil moisture flow, groundwater recharge, surface runoff, root water uptake, and river discharge at different spatial and temporal scales. Reconciling these high-order system models with perpetually larger volumes of field data is becoming more and more difficult, particularly because classical likelihood-based fitting methods lack the power to detect and pinpoint deficiencies in the model structure. Gupta et al. (2008) have recently proposed steps (amongst others) toward the development of a more robust and powerful method of model evaluation. Their diagnostic approach uses signature behaviors and patterns observed in the input-output data to illuminate to what degree a representation of the real world has been adequately achieved and how the model should be improved for the purpose of learning and scientific discovery. In this paper, we introduce approximate Bayesian computation (ABC) as a vehicle for diagnostic model evaluation. This statistical methodology relaxes the need for an explicit likelihood function in favor of one or multiple different summary statistics rooted in hydrologic theory that together have a clearer and more compelling diagnostic power than some average measure of the size of the error residuals. Two illustrative case studies are used to demonstrate that ABC is relatively easy to implement, and readily employs signature-based indices to analyze and pinpoint which part of the model is malfunctioning and in need of further improvement.
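As a rough illustration of the likelihood-free idea described in this abstract, the following is a minimal ABC rejection sketch. It is not the authors' hydrologic setup: the toy Gaussian simulator, the uniform prior, the sample mean as the single summary statistic, the tolerance `eps`, and the function name `abc_rejection` are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_low, prior_high, eps, n_draws, seed=0):
    """Toy ABC rejection sampler (illustrative assumption, not the paper's code).

    Draw theta from a uniform prior, simulate data from a Gaussian with mean
    theta and unit standard deviation, and keep theta when the simulated
    summary statistic (the sample mean) lies within eps of the observed one.
    """
    rng = random.Random(seed)
    s_obs = statistics.fmean(observed)
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
        # No likelihood is evaluated: only summary statistics are compared.
        if abs(statistics.fmean(simulated) - s_obs) <= eps:
            accepted.append(theta)
    return accepted


# Synthetic "observations" generated with a known mean of 2.0.
data_rng = random.Random(42)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
posterior = abc_rejection(observed, -5.0, 5.0, eps=0.1, n_draws=10000)
```

The accepted draws in `posterior` approximate the posterior of the mean given the chosen summary statistic; a smaller `eps` tightens the approximation at the cost of fewer accepted samples.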
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
ERIC Educational Resources Information Center
West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.
2010-01-01
A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…
NASA Astrophysics Data System (ADS)
Cooper, Richard J.; Krueger, Tobias; Hiscock, Kevin M.; Rawlins, Barry G.
2014-11-01
Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ~76%), comparison of apportionment estimates reveals varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations.
PREDICTIVE BAYESIAN PATHOGEN DOSE-RESPONSE MODEL FORMS
The use of predictive Bayesian methods in dose-response assessment will be investigated. The predictive Bayesian approach offers an alternative to current approaches in that it does not require the selection of a specific confidence limit, yet provides an answer that is more cons...
Bayesian modeling of animal- and herd-level prevalences.
Branscum, A J; Gardner, I A; Johnson, W O
2004-12-15
We reviewed Bayesian approaches for animal-level and herd-level prevalence estimation based on cross-sectional sampling designs and demonstrated fitting of these models using the WinBUGS software. We considered estimation of infection prevalence based on use of a single diagnostic test applied to a single herd with binomial and hypergeometric sampling. We then considered multiple herds under binomial sampling with the primary goal of estimating the prevalence distribution and the proportion of infected herds. A new model is presented that can be used to estimate the herd-level prevalence in a region, including the posterior probability that all herds are non-infected. Using this model, inferences for the distribution of prevalences, mean prevalence in the region, and predicted prevalence of herds in the region (including the predicted probability of zero prevalence) are also available. In the models presented, both animal- and herd-level prevalences are modeled as mixture distributions to allow for zero infection prevalences. (If mixture models for the prevalences were not used, prevalence estimates might be artificially inflated, especially in herds and regions with low or zero prevalence.) Finally, we considered estimation of animal-level prevalence based on pooled samples. PMID:15579338
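The role of the mixture prior with a point mass at zero prevalence can be illustrated with a small closed-form calculation. This is a sketch under strong simplifying assumptions that are not taken from the paper: a single herd, binomial sampling, a perfect diagnostic test, and a Beta(a, b) prior on prevalence for infected herds. The function names and default values are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def prob_herd_uninfected(n_tested, n_positive, p_zero, a=1.0, b=1.0):
    """Posterior probability that herd prevalence is exactly zero.

    Prior (illustrative assumption): prevalence is 0 with probability p_zero,
    otherwise it follows Beta(a, b). The test is assumed perfect, so any
    positive animal rules out zero prevalence.
    """
    if n_positive > 0:
        return 0.0
    # Probability of n_tested negatives from an infected herd:
    # E[(1 - pi)^n] under Beta(a, b) equals B(a, b + n) / B(a, b).
    marginal_infected = math.exp(log_beta(a, b + n_tested) - log_beta(a, b))
    return p_zero / (p_zero + (1.0 - p_zero) * marginal_infected)


# Nine animals tested, none positive, prior probability 0.5 of a clean herd.
posterior_free = prob_herd_uninfected(n_tested=9, n_positive=0, p_zero=0.5)
```

With a uniform Beta(1, 1) prior the marginal term is 1/(n + 1), so nine negative tests raise the probability of a non-infected herd from 0.5 to 10/11 (about 0.909). An imperfect test, as handled in the paper's models, would need sensitivity and specificity terms that this sketch omits.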
A Bayesian joint model of menstrual cycle length and fecundity.
Lum, Kirsten J; Sundaram, Rajeshwari; Buck Louis, Germaine M; Louis, Thomas A
2016-03-01
Menstrual cycle length (MCL) has been shown to play an important role in couple fecundity, which is the biologic capacity for reproduction irrespective of pregnancy intentions. However, a comprehensive assessment of its role requires a fecundity model that accounts for male and female attributes and the couple's intercourse pattern relative to the ovulation day. To this end, we employ a Bayesian joint model for MCL and pregnancy. MCLs follow a scale multiplied (accelerated) mixture model with Gaussian and Gumbel components; the pregnancy model includes MCL as a covariate and computes the cycle-specific probability of pregnancy in a menstrual cycle conditional on the pattern of intercourse and no previous fertilization. Day-specific fertilization probability is modeled using natural, cubic splines. We analyze data from the Longitudinal Investigation of Fertility and the Environment Study (the LIFE Study), a couple based prospective pregnancy study, and find a statistically significant quadratic relation between fecundity and menstrual cycle length, after adjustment for intercourse pattern and other attributes, including male semen quality, both partner's age, and active smoking status (determined by baseline cotinine level 100 ng/mL). We compare results to those produced by a more basic model and show the advantages of a more comprehensive approach. PMID:26295923
A Bayesian Joint Model of Menstrual Cycle Length and Fecundity
Lum, Kirsten J.; Sundaram, Rajeshwari; Louis, Germaine M. Buck; Louis, Thomas A.
2015-01-01
Summary Menstrual cycle length (MCL) has been shown to play an important role in couple fecundity, which is the biologic capacity for reproduction irrespective of pregnancy intentions. However, a comprehensive assessment of its role requires a fecundity model that accounts for male and female attributes and the couple’s intercourse pattern relative to the ovulation day. To this end, we employ a Bayesian joint model for MCL and pregnancy. MCLs follow a scale multiplied (accelerated) mixture model with Gaussian and Gumbel components; the pregnancy model includes MCL as a covariate and computes the cycle-specific probability of pregnancy in a menstrual cycle conditional on the pattern of intercourse and no previous fertilization. Day-specific fertilization probability is modeled using natural, cubic splines. We analyze data from the Longitudinal Investigation of Fertility and the Environment Study (the LIFE Study), a couple based prospective pregnancy study, and find a statistically significant quadratic relation between fecundity and menstrual cycle length, after adjustment for intercourse pattern and other attributes, including male semen quality, both partner’s age, and active smoking status (determined by baseline cotinine level 100ng/mL). We compare results to those produced by a more basic model and show the advantages of a more comprehensive approach. PMID:26295923
Bayesian calibration of the Community Land Model using surrogates
NASA Astrophysics Data System (ADS)
Ray, J.; Sargsyan, K.; Huang, M.; Hou, Z.
2012-12-01
We present results from a calibration effort of the Community Land Model (CLM) using surrogates. Three parameters, governing subsurface runoff and groundwater dynamics, were targeted and calibrated to observations from the Missouri Ozark Ameriflux tower site (US-Moz) spanning 1996-2004. We adopt a Bayesian approach for calibration where the parameters were estimated as probability distributions to account for the uncertainty due to modelling and observation errors. The model fitting was performed using an adaptive Markov chain Monte Carlo method. Since the sampling-based calibration of CLM could be computationally expensive, we first developed surrogates as alternatives to the CLM. The three-dimensional parameter space was sampled and CLM was used to produce monthly averaged predictions of runoff and latent/sensible heat fluxes. Multiple polynomial "trend" models were proposed, fitted to the CLM simulations via regression, and tested for over-fitting. A quadratic model was ultimately selected and bias-corrected using the universal kriging approach, to produce surrogates with errors less than 10% at any arbitrary point in the parameter-space. This "trend+kriged" model was then used as an inexpensive CLM surrogate, in an MCMC sampler, to solve the calibration problem. Joint densities were developed for the parameters, along with an estimate of the structural error of the surrogates.
Nonlinear regression modeling of nutrient loads in streams: A Bayesian approach
Qian, S.S.; Reckhow, K.H.; Zhai, J.; McMahon, G.
2005-01-01
A Bayesian nonlinear regression modeling method is introduced and compared with the least squares method for modeling nutrient loads in stream networks. The objective of the study is to better model spatial correlation in river basin hydrology and land use for improving the model as a forecasting tool. The Bayesian modeling approach is introduced in three steps, each with a more complicated model and data error structure. The approach is illustrated using a data set from three large river basins in eastern North Carolina. Results indicate that the Bayesian model better accounts for model and data uncertainties than does the conventional least squares approach. Applications of the Bayesian models for ambient water quality standards compliance and TMDL assessment are discussed. Copyright 2005 by the American Geophysical Union.
Bayesian Estimation and Uncertainty Quantification in Differential Equation Models
NASA Astrophysics Data System (ADS)
Bhaumik, Prithwish
In engineering, physics, biomedical sciences, pharmacokinetics and pharmacodynamics (PKPD) and many other fields the regression function is often specified as solution of a system of ordinary differential equations (ODEs) given by $\frac{d f_\theta(t)}{dt} = F(t, f_\theta(t), \theta)$, $t \in [0, 1]$; here $F$ is a known appropriately smooth vector valued function. Our interest lies in estimating $\theta$ from the noisy data. A two-step approach to solve this problem consists of the first step fitting the data nonparametrically, and the second step estimating the parameter by minimizing the distance between the nonparametrically estimated derivative and the derivative suggested by the system of ODEs. In Chapter 2 we consider a Bayesian analog of the two step approach by putting a finite random series prior on the regression function using B-spline basis. We establish a Bernstein-von Mises theorem for the posterior distribution of the parameter of interest induced from that on the regression function with the $n^{-1/2}$ contraction rate. Although this approach is computationally fast, the Bayes estimator is not asymptotically efficient. This can be remedied by directly considering the distance between the function in the nonparametric model and a Runge-Kutta (RK4) approximate solution of the ODE while inducing the posterior distribution on the parameter as done in Chapter 3. We also study the asymptotic properties of a direct Bayesian method obtained from the approximate likelihood obtained by the RK4 method in Chapter 3. Chapters 4 and 5 contain the extensions of the methods discussed so far for higher order ODE's and partial differential equations (PDE's) respectively. We have mentioned the scopes of some future works in Chapter 6.
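The RK4 approximate solution mentioned in this abstract can be sketched for a scalar ODE as follows. This is a generic classical fourth-order Runge-Kutta integrator, not the thesis code; the function name `rk4_solve`, the scalar state, and the exponential-decay test problem are assumptions made for illustration.

```python
import math


def rk4_solve(F, y0, theta, n_steps, t0=0.0, t1=1.0):
    """Classical RK4 for dy/dt = F(t, y, theta) on [t0, t1] (scalar state).

    Returns a list of (t, y) pairs on an evenly spaced grid.
    """
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    path = [(t, y)]
    for i in range(n_steps):
        k1 = F(t, y, theta)
        k2 = F(t + h / 2.0, y + h * k1 / 2.0, theta)
        k3 = F(t + h / 2.0, y + h * k2 / 2.0, theta)
        k4 = F(t + h, y + h * k3, theta)
        y = y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t = t0 + (i + 1) * h
        path.append((t, y))
    return path


# Illustrative test problem: dy/dt = -theta * y with exact solution exp(-theta * t).
def decay(t, y, theta):
    return -theta * y


path = rk4_solve(decay, y0=1.0, theta=2.0, n_steps=100)
final_error = abs(path[-1][1] - math.exp(-2.0))
```

In a parameter-estimation setting, a solver of this kind would be called for each candidate value of theta and the resulting curve compared with the data or with a nonparametric fit.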
Bayesian Gaussian Mixture Models for High-Density Genotyping Arrays
Sabatti, Chiara; Lange, Kenneth
2011-01-01
Affymetrix's SNP (single-nucleotide polymorphism) genotyping chips have increased the scope and decreased the cost of gene-mapping studies. Because each SNP is queried by multiple DNA probes, the chips present interesting challenges in genotype calling. Traditional clustering methods distinguish the three genotypes of an SNP fairly well given a large enough sample of unrelated individuals or a training sample of known genotypes. This article describes our attempt to improve genotype calling by constructing Gaussian mixture models with empirically derived priors. The priors stabilize parameter estimation and borrow information collectively gathered on tens of thousands of SNPs. When data from related family members are available, our models capture the correlations in signals between relatives. With these advantages in mind, we apply the models to Affymetrix probe intensity data on 10,000 SNPs gathered on 63 genotyped individuals spread over eight pedigrees. We integrate the genotype-calling model with pedigree analysis and examine a sequence of symmetry hypotheses involving the correlated probe signals. The symmetry hypotheses raise novel mathematical issues of parameterization. Using the Bayesian information criterion, we select the best combination of symmetry assumptions. Compared to Affymetrix's software, our model leads to a reduction in no-calls with little sacrifice in overall calling accuracy. PMID:21572926
Bayesian Multiscale Modeling of Closed Curves in Point Clouds
Gu, Kelvin; Pati, Debdeep; Dunson, David B.
2014-01-01
Modeling object boundaries based on image or point cloud data is frequently necessary in medical and scientific applications ranging from detecting tumor contours for targeted radiation therapy, to the classification of organisms based on their structural information. In low-contrast images or sparse and noisy point clouds, there is often insufficient data to recover local segments of the boundary in isolation. Thus, it becomes critical to model the entire boundary in the form of a closed curve. To achieve this, we develop a Bayesian hierarchical model that expresses highly diverse 2D objects in the form of closed curves. The model is based on a novel multiscale deformation process. By relating multiple objects through a hierarchical formulation, we can successfully recover missing boundaries by borrowing structural information from similar objects at the appropriate scale. Furthermore, the model’s latent parameters help interpret the population, indicating dimensions of significant structural variability and also specifying a ‘central curve’ that summarizes the collection. Theoretical properties of our prior are studied in specific cases and efficient Markov chain Monte Carlo methods are developed, evaluated through simulation examples and applied to panorex teeth images for modeling teeth contours and also to a brain tumor contour detection problem. PMID:25544786
Ensemble Bayesian model averaging using Markov chain Monte Carlo sampling
Vrugt, Jasper A; Diks, Cees G H; Clark, Martyn P
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA, however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper, Raftery et al. (Mon Weather Rev 133: 1155-1174, 2005) recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
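The EM training step referred to in this abstract can be sketched as below. This is a simplified illustration, not the paper's implementation and not DREAM: it assumes Gaussian predictive kernels centred on each member's forecast, a single variance shared by all members, no bias correction, and synthetic data. The names `bma_em` and `normal_pdf` are hypothetical.

```python
import math
import random


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bma_em(forecasts, observations, n_iter=200):
    """EM estimates of BMA weights and a common variance (simplified sketch).

    forecasts[t][k] is the forecast of ensemble member k at time t.
    """
    n_times = len(observations)
    n_members = len(forecasts[0])
    weights = [1.0 / n_members] * n_members
    # Start from the pooled mean squared error of all members.
    sigma2 = sum(
        (observations[t] - forecasts[t][k]) ** 2
        for t in range(n_times) for k in range(n_members)
    ) / (n_times * n_members)
    for _ in range(n_iter):
        sigma = math.sqrt(sigma2)
        # E-step: responsibility of member k for observation t.
        resp = []
        for t in range(n_times):
            dens = [weights[k] * normal_pdf(observations[t], forecasts[t][k], sigma)
                    for k in range(n_members)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update the weights and the shared variance.
        weights = [sum(resp[t][k] for t in range(n_times)) / n_times
                   for k in range(n_members)]
        sigma2 = sum(
            resp[t][k] * (observations[t] - forecasts[t][k]) ** 2
            for t in range(n_times) for k in range(n_members)
        ) / n_times
    return weights, sigma2


# Synthetic training data: member 0 is accurate, member 1 is noisy.
rng = random.Random(1)
truth = [rng.gauss(15.0, 5.0) for _ in range(200)]
forecasts = [[y + rng.gauss(0.0, 0.5), y + rng.gauss(0.0, 3.0)] for y in truth]
weights, sigma2 = bma_em(forecasts, truth)
```

On this synthetic ensemble the more accurate member receives the larger weight. EM returns point estimates only, whereas an MCMC sampler of the kind compared in the abstract would also describe the uncertainty in the weights and variances.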
A Bayesian model of category-specific emotional brain responses.
Wager, Tor D; Kang, Jian; Johnson, Timothy D; Nichols, Thomas E; Satpute, Ajay B; Barrett, Lisa Feldman
2015-04-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories--fear, anger, disgust, sadness, or happiness--is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
A Bayesian Model of Category-Specific Emotional Brain Responses
Wager, Tor D.; Kang, Jian; Johnson, Timothy D.; Nichols, Thomas E.; Satpute, Ajay B.; Barrett, Lisa Feldman
2015-01-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories—fear, anger, disgust, sadness, or happiness—is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
Bayesian calibration of the Community Land Model using surrogates
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Swiler, Laura Painton
2014-02-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditional on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that surrogate models can be created for CLM in most cases. The posterior distributions are more predictive than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters' distributions significantly. The structural error model reveals a correlation time-scale which can be used to identify the physical process that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Bayesian Calibration of the Community Land Model using Surrogates
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Sargsyan, K.; Swiler, Laura P.
2015-01-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditioned on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that accurate surrogate models can be created for CLM in most cases. The posterior distributions lead to better prediction than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters’ distributions significantly. The structural error model reveals a correlation time-scale which can potentially be used to identify physical processes that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Application Bayesian Model Averaging method for ensemble system for Poland
NASA Astrophysics Data System (ADS)
Guzikowski, Jakub; Czerwinska, Agnieszka
2014-05-01
The aim of the project is to evaluate methods for generating numerical ensemble weather predictions using meteorological data from the Weather Research & Forecasting (WRF) Model and calibrating these data by means of the Bayesian Model Averaging (WRF BMA) approach. We construct high-resolution short-range ensemble forecasts using meteorological data (temperature) generated by nine WRF models. The WRF models have 35 vertical levels and 2.5 km x 2.5 km horizontal resolution. The main point is that the ensemble members use different parameterizations of the physical phenomena occurring in the boundary layer. To calibrate an ensemble forecast we use the Bayesian Model Averaging (BMA) approach. The BMA predictive probability density function (PDF) is a weighted average of the predictive PDFs associated with each individual ensemble member, with weights that reflect the member's relative skill. As a test we chose a case with heat wave and convective weather conditions over Poland from 23 July to 1 August 2013. From 23 July to 29 July 2013 the temperature oscillated around 30 degrees Celsius at many meteorological stations and new temperature records were set. During this time an increase in the number of patients hospitalized with cardiovascular problems was registered. On 29 July 2013 an advection of moist tropical air masses was recorded over Poland, causing a strong convection event with a mesoscale convective system (MCS). The MCS caused local flooding, damage to transport infrastructure, destroyed buildings and trees, injuries, and direct threat to life. The meteorological data from the ensemble system are compared with data recorded at 74 weather stations located in Poland. We prepare a set of model-observation pairs. Then, the data obtained from single ensemble members and the median from the WRF BMA system are evaluated on the basis of deterministic statistical errors: Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). To evaluation
Bayesian Belief Networks Approach for Modeling Irrigation Behavior
NASA Astrophysics Data System (ADS)
Andriyas, S.; McKee, M.
2012-12-01
Canal operators need information to manage water deliveries to irrigators. Short-term irrigation demand forecasts can potentially provide valuable information for a canal operator who must manage an on-demand system. Such forecasts could be generated by using information about the decision-making processes of irrigators. Bayesian models of irrigation behavior can provide insight into the likely criteria which farmers use to make irrigation decisions. This paper develops a Bayesian belief network (BBN) to learn irrigation decision-making behavior of farmers and utilizes the resulting model to make forecasts of future irrigation decisions based on factor interaction and posterior probabilities. Models for studying irrigation behavior have rarely been explored in the past. The model discussed here was built from a combination of data about biotic, climatic, and edaphic conditions under which observed irrigation decisions were made. The paper includes a case study using data collected from the Canal B region of the Sevier River, near Delta, Utah. Alfalfa, barley and corn are the main crops of the location. The model has been tested with a portion of the data to affirm the model's predictive capabilities. Irrigation rules were deduced in the process of learning and verified in the testing phase. It was found that most of the farmers used consistent rules throughout all years and across different types of crops. Soil moisture stress, which indicates the level of water available to the plant in the soil profile, was found to be one of the most significant likely driving forces for irrigation. Irrigations appeared to be triggered by a farmer's perception of soil stress, or by a perception of combined factors such as information about a neighbor irrigating or an apparent preference to irrigate on a weekend. Soil stress resulted in irrigation probabilities of 94.4% for alfalfa. With additional factors like weekend and irrigating when a neighbor irrigates, alfalfa irrigation
Quantifying Uncertainty in Velocity Models using Bayesian Methods
NASA Astrophysics Data System (ADS)
Hobbs, R.; Caiado, C.; Majdański, M.
2008-12-01
Quantifying uncertainty in models derived from observed data is a major issue. Public and political understanding of uncertainty is poor, and for industry inadequate assessment of risk costs money. In this talk we will examine the geological structure of the subsurface; however, our principal exploration tool, controlled source seismology, gives its data in time. Inversion tools exist to map these data into a depth model but a full exploration of the uncertainty of the model is rarely done because robust strategies do not exist for large non-linear complex systems. There are two principal sources of uncertainty: the first comes from the input data which is noisy and bandlimited; the second, and more sinister, is from the model parameterisation and forward algorithms themselves, which approximate the physics to make the problem tractable. To address these issues we propose a Bayesian approach. One philosophy is to estimate the uncertainty in a possible model derived using standard inversion tools. During the inversion stage we can use our geological prejudice to derive an acceptable model. Then we use a local random walk using the Metropolis-Hastings algorithm to explore the model space immediately around a possible solution. For models with a limited number of parameters we can use the forward modeling step from the inversion code. However, as the number of parameters increases and/or the cost of the forward modeling step becomes significant, we need to use fast emulators to act as proxies so a sufficient number of iterations can be performed on which to base our statistical measures of uncertainty. In this presentation we show examples of uncertainty estimation using both pre- and post-critical seismic data. In particular, we will demonstrate uncertainty introduced by the approximation of the physics by using a tomographic inversion of bandlimited data and show that uncertainty increases as the central frequency of the data decreases. This is consistent with the
Using Bayesian Stable Isotope Mixing Models to Enhance Marine Ecosystem Models
The use of stable isotopes in food web studies has proven to be a valuable tool for ecologists. We investigated the use of Bayesian stable isotope mixing models as constraints for an ecosystem model of a temperate seagrass system on the Atlantic coast of France. δ13C and δ15N i...
Technology Transfer Automated Retrieval System (TEKTRAN)
In this paper, the Genetic Algorithms (GA) and Bayesian model averaging (BMA) were combined to simultaneously conduct calibration and uncertainty analysis for the Soil and Water Assessment Tool (SWAT). In this hybrid method, several SWAT models with different structures are first selected; next GA i...
The Survival Kit: software to analyze survival data including possibly correlated random effects.
Mészáros, G; Sölkner, J; Ducrocq, V
2013-06-01
The Survival Kit is Fortran 90 software intended for survival analysis using proportional hazards models and their extension to frailty models with a single response time. The hazard function is described as the product of a baseline hazard function and a positive (exponential) function of possibly time-dependent fixed and random covariates. Stratified Cox, grouped data and Weibull models can be used. Random effects can be either log-gamma or normally distributed and can account for a pedigree structure. Variance parameters are estimated in a Bayesian context. It is possible to account for the correlated nature of two random effects either by specifying a known correlation coefficient or by estimating it from the data. An R interface to the Survival Kit provides a user-friendly way to run the software. PMID:23399103
Forecasting unconventional resource productivity - A spatial Bayesian model
NASA Astrophysics Data System (ADS)
Montgomery, J.; O'sullivan, F.
2015-12-01
Today's low prices mean that unconventional oil and gas development requires ever greater efficiency and better development decision-making. Inter- and intra-field variability in well productivity, which is a major contemporary driver of uncertainty regarding resource size and its economics, is driven by factors including geological conditions, well and completion design (which companies vary as they seek to optimize their performance), and uncertainty about the nature of fracture propagation. Geological conditions are often not well understood early on in development campaigns, but nevertheless critical assessments and decisions must be made regarding the value of drilling an area and the placement of wells. In these situations, location provides a reasonable proxy for geology and the "rock quality." We propose a spatial Bayesian model for forecasting acreage quality, which improves decision-making by leveraging available production data and provides a framework for statistically studying the influence of different parameters on well productivity. Our approach consists of subdividing a field into sections and forming prior distributions for productivity in each section based on knowledge about the overall field. Production data from wells are used to update these estimates in a Bayesian fashion, improving model accuracy far more rapidly and with less sensitivity to outliers than a model that simply establishes an "average" productivity in each section. Additionally, forecasts using this model capture the importance of uncertainty, whether due to a lack of information or for areas that demonstrate greater geological risk. We demonstrate the forecasting utility of this method using public data and also provide examples of how information from this model can be combined with knowledge about a field's geology or changes in technology to better quantify development risk. This approach represents an important shift in the way that production data is used to guide
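The section-by-section updating described above can be sketched with a conjugate normal model. This is an illustrative assumption only: the abstract does not state the distributions used, and the function name `update_section`, the field-wide prior, and the noise variance are hypothetical.

```python
def update_section(prior_mean, prior_var, well_values, noise_var):
    """Posterior mean and variance of a section's productivity.

    Assumes a normal prior (taken from the overall field) and normally
    distributed well-to-well noise with known variance. Both are
    illustrative assumptions, not the published model.
    """
    n = len(well_values)
    if n == 0:
        # No wells yet: the section inherits the field-wide prior.
        return prior_mean, prior_var
    sample_mean = sum(well_values) / n
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + n * sample_mean / noise_var)
    return post_mean, post_var

if __name__ == "__main__":
    field_prior = (100.0, 400.0)   # hypothetical field-wide mean and variance
    # One unusually strong well in a section: the estimate moves toward it,
    # but much less than a plain section average would.
    print(update_section(*field_prior, [160.0], noise_var=900.0))
    # More wells in the same section pull the estimate further and shrink
    # the remaining uncertainty.
    print(update_section(*field_prior, [160.0, 150.0, 155.0, 165.0], noise_var=900.0))
```

The shrinkage toward the field-wide prior is what limits the influence of a single outlying well, and the posterior variance stays large for sections with little data.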
Improving default risk prediction using Bayesian model uncertainty techniques.
Kazemi, Reza; Mosleh, Ali
2012-11-01
Credit risk is the potential exposure of a creditor to an obligor's failure or refusal to repay the debt in principal or interest. The potential of exposure is measured in terms of probability of default. Many models have been developed to estimate credit risk, with rating agencies dating back to the 19th century. They provide their assessment of probability of default and transition probabilities of various firms in their annual reports. Regulatory capital requirements for credit risk outlined by the Basel Committee on Banking Supervision have made it essential for banks and financial institutions to develop sophisticated models in an attempt to measure credit risk with higher accuracy. The Bayesian framework proposed in this article uses the techniques developed in physical sciences and engineering for dealing with model uncertainty and expert accuracy to obtain improved estimates of credit risk and associated uncertainties. The approach uses estimates from one or more rating agencies and incorporates their historical accuracy (past performance data) in estimating future default risk and transition probabilities. Several examples demonstrate that the proposed methodology can assess default probability with accuracy exceeding the estimations of all the individual models. Moreover, the methodology accounts for potentially significant departures from "nominal predictions" due to "upsetting events" such as the 2008 global banking crisis. PMID:23163724
A Bayesian Model for the Analysis of Transgenerational Epigenetic Variation
Varona, Luis; Munilla, Sebastián; Mouresan, Elena Flavia; González-Rodríguez, Aldemar; Moreno, Carlos; Altarriba, Juan
2015-01-01
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes still remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a T matrix of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix (T⁻¹) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not include it was the most plausible. PMID:25617408
Chu, Haitao; Nie, Lei; Chen, Yong; Huang, Yi; Sun, Wei
2012-12-01
Multivariate meta-analysis is increasingly utilised in biomedical research to combine data of multiple comparative clinical studies for evaluating drug efficacy and safety profile. When the probability of the event of interest is rare, or when the individual study sample sizes are small, a substantial proportion of studies may not have any event of interest. Conventional meta-analysis methods either exclude such studies or include them through ad hoc continuity correction by adding an arbitrary positive value to each cell of the corresponding 2 × 2 tables, which may result in less accurate conclusions. Furthermore, different continuity corrections may result in inconsistent conclusions. In this article, we discuss a bivariate Beta-binomial model derived from the Sarmanov family of bivariate distributions and a bivariate generalised linear mixed effects model for binary clustered data to make valid inferences. These bivariate random effects models use all available data without ad hoc continuity corrections, and account for the potential correlation between treatment (or exposure) and control groups within studies naturally. We then utilise the bivariate random effects models to reanalyse two recent meta-analysis data sets. PMID:21177306
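The sensitivity to the choice of continuity correction noted above is easy to see on a single zero-event study. The counts below are invented for illustration and the helper `odds_ratio` is hypothetical; it shows the ad hoc correction the abstract criticises, not the bivariate models it proposes.

```python
def odds_ratio(a, b, c, d, correction=0.0):
    """Odds ratio for a 2 x 2 table after adding `correction` to every cell.

    a, b: events and non-events in the treatment arm
    c, d: events and non-events in the control arm
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)

if __name__ == "__main__":
    # Hypothetical study: 0/50 events under treatment, 3/50 under control.
    table = (0, 50, 3, 47)
    for k in (0.5, 0.1, 0.01):
        print(f"correction={k}: OR={odds_ratio(*table, correction=k):.4f}")
```

The estimated odds ratio changes by more than an order of magnitude across these three arbitrary corrections, which is the inconsistency that motivates models using the raw counts directly.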
ERIC Educational Resources Information Center
Meyer, Donald L.
Bayesian statistical methodology and its possible uses in the behavioral sciences are discussed in relation to the solution of problems in both the use and teaching of fundamental statistical methods, including confidence intervals, significance tests, and sampling. The Bayesian model explains these statistical methods and offers a consistent…
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an a priori distribution with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation analyses to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and maximum likelihood analyses. The simulation results show that changing the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range
CRAFFT: An Activity Prediction Model based on Bayesian Networks
Nazerfard, Ehsan; Cook, Diane J.
2014-01-01
Recent advances in the areas of pervasive computing, data mining, and machine learning offer unique opportunities to provide health monitoring and assistance for individuals who have difficulty living independently in their homes. Several components have to work together to provide health monitoring for smart home residents including, but not limited to, activity recognition, activity discovery, activity prediction, and prompting systems. Compared to the significant research done to discover and recognize activities, less attention has been given to predicting the future activities that the resident is likely to perform. Activity prediction components can play a major role in the design of a smart home. For instance, by taking advantage of an activity prediction module, a smart home can learn context-aware rules to prompt individuals to initiate important activities. In this paper, we propose an activity prediction model using Bayesian networks together with a novel two-step inference process to predict both the next activity features and the next activity label. We also propose an approach to predict the start time of the next activity which is based on modeling the relative start time of the predicted activity using the continuous normal distribution and outlier detection. To validate our proposed models, we used real data collected from physical smart environments. PMID:25937847
Bayesian network model of crowd emotion and negative behavior
NASA Astrophysics Data System (ADS)
Ramli, Nurulhuda; Ghani, Noraida Abdul; Hatta, Zulkarnain Ahmad; Hashim, Intan Hashimah Mohd; Sulong, Jasni; Mahudin, Nor Diana Mohd; Rahman, Shukran Abd; Saad, Zarina Mat
2014-12-01
The effects of overcrowding have become a major concern for event organizers. One aspect of this concern has been the idea that overcrowding can enhance the occurrence of serious incidents during events. As one of the largest Muslim religious gatherings, attended by pilgrims from all over the world, Hajj has become extremely overcrowded, with many incidents being reported. The purpose of this study is to analyze the nature of human emotion and negative behavior resulting from overcrowding during Hajj events from data gathered in the Malaysian Hajj Experience Survey in 2013. The sample comprised 147 Malaysian pilgrims (70 males and 77 females). Utilizing a probabilistic model called a Bayesian network, this paper models the dependence structure between different emotions and negative behaviors of pilgrims in the crowd. The model included the following variables of emotion: negative, negative comfortable, positive, positive comfortable and positive spiritual; and the following variables of negative behavior: aggressive and hazardous acts. The study demonstrated that emotions of negative, negative comfortable, positive spiritual and positive emotion have a direct influence on aggressive behavior whereas emotions of negative comfortable, positive spiritual and positive have a direct influence on hazardous acts behavior. The sensitivity analysis showed that a low level of negative and negative comfortable emotions leads to a lower level of aggressive and hazardous behavior. Findings of the study can be further improved to identify the exact cause and risk factors of crowd-related incidents, to help prevent crowd disasters during mass gathering events.
A Bayesian generative model for learning semantic hierarchies
Mittelman, Roni; Sun, Min; Kuipers, Benjamin; Savarese, Silvio
2014-01-01
Building fine-grained visual recognition systems that are capable of recognizing tens of thousands of categories has received much attention in recent years. The well-known semantic hierarchical structure of categories and concepts has been shown to provide a key prior which allows for optimal predictions. The hierarchical organization of various domains and concepts has been subject to extensive research, and led to the development of the WordNet domains hierarchy (Fellbaum, 1998), which was also used to organize the images in the ImageNet (Deng et al., 2009) dataset, in which the category count approaches the human capacity. Still, for the human visual system, the form of the hierarchy must be discovered with minimal use of supervision or innate knowledge. In this work, we propose a new Bayesian generative model for learning such domain hierarchies, based on semantic input. Our model is motivated by the super-subordinate organization of domain labels and concepts that characterizes WordNet, and accounts for several important challenges: maintaining context information when progressing deeper into the hierarchy, learning a coherent semantic concept for each node, and modeling uncertainty in the perception process. PMID:24904452
A Bayesian Semiparametric Model for Radiation Dose-Response Estimation.
Furukawa, Kyoji; Misumi, Munechika; Cologne, John B; Cullings, Harry M
2016-06-01
In evaluating the risk of exposure to health hazards, characterizing the dose-response relationship and estimating acceptable exposure levels are the primary goals. In analyses of health risks associated with exposure to ionizing radiation, while there is a clear agreement that moderate to high radiation doses cause harmful effects in humans, little has been known about the possible biological effects at low doses, for example, below 0.1 Gy, which is the dose range relevant to most radiation exposures of concern today. A conventional approach to radiation dose-response estimation based on simple parametric forms, such as the linear nonthreshold model, can be misleading in evaluating the risk and, in particular, its uncertainty at low doses. As an alternative approach, we consider a Bayesian semiparametric model that has a connected piece-wise-linear dose-response function with prior distributions having an autoregressive structure among the random slope coefficients defined over closely spaced dose categories. With a simulation study and application to analysis of cancer incidence data among Japanese atomic bomb survivors, we show that this approach can produce smooth and flexible dose-response estimation while reasonably handling the risk uncertainty at low doses and elsewhere. With relatively few assumptions and modeling options to be made by the analyst, the method can be particularly useful in assessing risks associated with low-dose radiation exposures. PMID:26581473
Bayesian parametrization of coarse-grain dissipative dynamics models
NASA Astrophysics Data System (ADS)
Dequidt, Alain; Solano Canchaya, Jose G.
2015-08-01
We introduce a new bottom-up method for the optimization of dissipative coarse-grain models. The method is based on Bayesian optimization of the likelihood to reproduce a coarse-grained reference trajectory obtained from analysis of a higher resolution molecular dynamics trajectory. This new method is related to force matching techniques, but uses the total force on each grain averaged over a coarse time step instead of instantaneous forces. It has the advantage of not being limited to pairwise short-range interactions in the coarse-grain model and also yields an estimation of the friction parameter controlling the dynamics. The theory supporting the method is presented from a practical perspective, with an analytical solution for the optimal set of parameters. The method was first validated by using it on a system with a known optimum. The new method was then tested on a simple system: n-pentane. The local molecular structure of the optimized model is in excellent agreement with the reference system. An extension of the method also yields excellent agreement for the equilibrium density. The dynamic properties are also very satisfactory, but more sensitive to the choice of the coarse-grain representation. The quality of the final force field depends on the definition of the coarse-grain degrees of freedom and interactions. We consider this method a serious alternative to other methods such as iterative Boltzmann inversion, force matching, and Green-Kubo formulae.
Tran, Van; Liu, Danping; Pradhan, Anuj K.; Li, Kaigang; Bingham, C. Raymond; Simons-Morton, Bruce G.; Albert, Paul S.
2016-01-01
Signalized intersection management is a common measure of risky driving in simulator studies. In a recent randomized trial, investigators were interested in whether teenage males exposed to a risk-accepting passenger took more intersection risks in a driving simulator compared with those exposed to a risk-averse peer passenger. Analyses in this trial are complicated by the longitudinal or repeated measures that are semi-continuous with clumping at zero. Specifically, the dependent variable in a randomized trial looking at the effect of risk-accepting versus risk-averse peer passengers on teenage simulator driving comprises two components. The discrete component measures whether the teen driver stops for a yellow light, and the continuous component measures the time the teen driver, who does not stop, spends in the intersection during a red light. To convey both components of this measure, we apply a two-part regression with correlated random effects model (CREM), consisting of a logistic regression to model whether the driver stops for a yellow light and a linear regression to model the time spent in the intersection during a red light. These two components are related through the correlation of their random effects. Using this novel analysis, we found that those exposed to a risk-averse passenger have a higher proportion of stopping at yellow lights and a longer mean time in the intersection during a red light when they did not stop at the light compared to those exposed to a risk-accepting passenger, consistent with the study hypotheses and previous analyses. Examining the statistical properties of the CREM approach through simulations, we found that in most situations, the CREM achieves greater power than competing approaches. We also examined whether the treatment effect changes across the length of the drive and provided a sample size recommendation for detecting such a phenomenon in subsequent trials. Our findings suggest that CREM provides an efficient
Enhancing the Modeling of PFOA Pharmacokinetics with Bayesian Analysis
The detail sufficient to describe the pharmacokinetics (PK) for perfluorooctanoic acid (PFOA) and the methods necessary to combine information from multiple data sets are both subjects of ongoing investigation. Bayesian analysis provides tools to accommodate these goals. We exa...
Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim
2014-02-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines the Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using the Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs; the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems.
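The role of the constrained sampling step inside nested sampling can be shown on a one-dimensional toy problem. This sketch is not the HNS algorithm: the uniform prior on [-5, 5], the standard normal likelihood, and the function names are assumptions, and the constrained draw is done exactly (possible here because the likelihood decreases with |x|) where the paper uses HMC with SEM gradient estimates.

```python
import math
import random

def log_likelihood(x):
    """Standard normal density, used as a toy likelihood."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def nested_sampling(n_live=200, n_iter=1500, seed=0):
    """Estimate the evidence Z = integral of L(x) * prior(x) dx.

    Prior: uniform on [-5, 5] (an assumption for this toy problem).
    """
    rng = random.Random(seed)
    live = [rng.uniform(-5.0, 5.0) for _ in range(n_live)]
    evidence = 0.0
    mass_prev = 1.0                       # prior mass still enclosed
    for i in range(1, n_iter + 1):
        worst = min(live, key=log_likelihood)
        mass_now = math.exp(-i / n_live)  # expected shrinkage per iteration
        evidence += math.exp(log_likelihood(worst)) * (mass_prev - mass_now)
        mass_prev = mass_now
        # Constrained step: draw from the prior restricted to L > L(worst).
        # Here that region is simply |x| < |worst|, so it can be sampled
        # exactly; in general this is where an MCMC or HMC move is needed.
        bound = abs(worst)
        live[live.index(worst)] = rng.uniform(-bound, bound)
    # Contribution of the remaining live points.
    evidence += mass_prev * sum(math.exp(log_likelihood(x)) for x in live) / n_live
    return evidence

if __name__ == "__main__":
    # The exact evidence is very close to 0.1 (normal mass on [-5, 5] / 10).
    print(nested_sampling())
```

Comparing evidences computed this way under different priors is the basis for the prior model selection mentioned in the abstract.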
A Bayesian model of context-sensitive value attribution
Rigoli, Francesco; Friston, Karl J; Martinelli, Cristina; Selaković, Mirjana; Shergill, Sukhwinder S; Dolan, Raymond J
2016-01-01
Substantial evidence indicates that incentive value depends on an anticipation of rewards within a given context. However, the computations underlying this context sensitivity remain unknown. To address this question, we introduce a normative (Bayesian) account of how rewards map to incentive values. This assumes that the brain inverts a model of how rewards are generated. Key features of our account include (i) an influence of prior beliefs about the context in which rewards are delivered (weighted by their reliability in a Bayes-optimal fashion), (ii) the notion that incentive values correspond to precision-weighted prediction errors, and (iii) contextual information unfolding at different hierarchical levels. This formulation implies that incentive value is intrinsically context-dependent. We provide empirical support for this model by showing that incentive value is influenced by context variability and by hierarchically nested contexts. The perspective we introduce generates new empirical predictions that might help explain psychopathologies, such as addiction. DOI: http://dx.doi.org/10.7554/eLife.16127.001 PMID:27328323
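One simple reading of "precision-weighted prediction error" is the shift in a Gaussian belief about the context's average reward after one reward is observed. The sketch below uses that reading; the Gaussian form, the single hierarchical level, and the function name `incentive_value` are assumptions made for illustration and may differ from the published model.

```python
def incentive_value(reward, context_mean, context_precision, reward_precision):
    """Precision-weighted prediction error under Gaussian assumptions.

    reward            : the observed reward
    context_mean      : prior expectation of reward in this context
    context_precision : 1 / variance of the prior belief about the context
    reward_precision  : 1 / variance of rewards around the context mean
    """
    prediction_error = reward - context_mean
    weight = reward_precision / (reward_precision + context_precision)
    return weight * prediction_error

if __name__ == "__main__":
    # The same reward is attractive in a poor context, aversive in a rich one.
    print(incentive_value(5.0, context_mean=2.0, context_precision=1.0, reward_precision=1.0))
    print(incentive_value(5.0, context_mean=8.0, context_precision=1.0, reward_precision=1.0))
    # A more variable reward distribution (lower reward precision) damps the value.
    print(incentive_value(5.0, context_mean=2.0, context_precision=1.0, reward_precision=0.25))
```

Under these assumptions the value of a fixed reward depends on both the expected reward in the context and the context's variability, which matches the qualitative claims in the abstract.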
Bayesian Modeling and Chronological Precision for Polynesian Settlement of Tonga
Weisler, Marshall; Zhao, Jian-xin
2015-01-01
First settlement of Polynesia and population expansion throughout the ancestral Polynesian homeland are foundation events for global history. A precise chronology is paramount to informed archaeological interpretation of these events and their consequences. Recently applied chronometric hygiene protocols excluding radiocarbon dates on wood charcoal without species identification all but eliminate this chronology as it has been built for the Kingdom of Tonga, the initial islands to be settled in Polynesia. In this paper we re-examine and redevelop this chronology through application of Bayesian models to the questioned suite of radiocarbon dates, while also incorporating short-lived wood charcoal dates from archived samples and high precision U/Th dates on coral artifacts. These models provide generation level precision allowing us to track population migration from first Lapita occupation on the island of Tongatapu through Tonga’s central and northern island groups. They further illustrate an exceptionally short duration for the initial colonizing Lapita phase and a somewhat abrupt transition to ancestral Polynesian society as it is currently defined. PMID:25799460
Point source moment tensor inversion through a Bayesian hierarchical model
NASA Astrophysics Data System (ADS)
Mustać, Marija; Tkalčić, Hrvoje
2016-01-01
Characterization of seismic sources is an important aspect of seismology. Parameter uncertainties in such inversions are essential for estimating solution robustness, but are rarely available. We have developed a non-linear moment tensor inversion method in a probabilistic Bayesian framework that also accounts for noise in the data. The method is designed for point source inversion using waveform data of moderate-size earthquakes and explosions at regional distances. This probabilistic approach results in an ensemble of models, whose density is proportional to parameter probability distribution and quantifies parameter uncertainties. Furthermore, we invert for noise in the data, allowing it to determine the model complexity. We implement an empirical noise covariance matrix that accounts for interdependence of observational errors present in waveform data. After we demonstrate the feasibility of the approach on synthetic data, we apply it to a Long Valley Caldera, CA, earthquake with a well-documented anomalous (non-double-couple) radiation from previous studies. We confirm a statistically significant isotropic component in the source without a trade-off with the compensated linear vector dipoles component.
Bayesian Modeling of Haplotype Effects in Multiparent Populations
Zhang, Zhaojun; Wang, Wei; Valdar, William
2014-01-01
A general Bayesian model, Diploffect, is described for estimating the effects of founder haplotypes at quantitative trait loci (QTL) detected in multiparental genetic populations; such populations include the Collaborative Cross (CC), Heterogeneous Stocks (HS), and many others for which local genetic variation is well described by an underlying, usually probabilistically inferred, haplotype mosaic. Our aim is to provide a framework for coherent estimation of haplotype and diplotype (haplotype pair) effects that takes into account the following: uncertainty in haplotype composition for each individual; uncertainty arising from small sample sizes and infrequently observed haplotype combinations; possible effects of dominance (for noninbred subjects); and genetic background; and that provides a means to incorporate data that may be incomplete or have a hierarchical structure. Using the results of a probabilistic haplotype reconstruction as prior information, we obtain posterior distributions at the QTL for both haplotype effects and haplotype composition. Two alternative computational approaches are supplied: a Markov chain Monte Carlo sampler and a procedure based on importance sampling of integrated nested Laplace approximations. Using simulations of QTL in the incipient CC (pre-CC) and Northport HS populations, we compare the accuracy of Diploffect, approximations to it, and more commonly used approaches based on Haley–Knott regression, describing trade-offs between these methods. We also estimate effects for three QTL previously identified in those populations, obtaining posterior intervals that describe how the phenotype might be affected by diplotype substitutions at the modeled locus. PMID:25236455
Chow, Sy-Miin; Lu, Zhaohua; Sherwood, Andrew; Zhu, Hongtu
2016-03-01
The past decade has evidenced the increased prevalence of irregularly spaced longitudinal data in social sciences. Clearly lacking, however, are modeling tools that allow researchers to fit dynamic models to irregularly spaced data, particularly data that show nonlinearity and heterogeneity in dynamical structures. We consider the issue of fitting multivariate nonlinear differential equation models with random effects and unknown initial conditions to irregularly spaced data. A stochastic approximation expectation-maximization algorithm is proposed and its performance is evaluated using a benchmark nonlinear dynamical systems model, namely, the Van der Pol oscillator equations. The empirical utility of the proposed technique is illustrated using a set of 24-h ambulatory cardiovascular data from 168 men and women. Pertinent methodological challenges and unresolved issues are discussed. PMID:25416456
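The benchmark named above, the Van der Pol oscillator, can be evaluated at irregularly spaced observation times with a fixed-order integrator that sub-steps between observations. This is only a sketch of the data-generating side; the function names, the damping parameter `mu`, and the sub-step size are assumptions, and the stochastic approximation EM estimation itself is not shown.

```python
import math
import random

def van_der_pol(state, mu):
    """Right-hand side of x'' - mu * (1 - x^2) * x' + x = 0 as a first-order system."""
    x, v = state
    return (v, mu * (1.0 - x * x) * v - x)

def rk4_step(state, dt, mu):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = van_der_pol(state, mu)
    k2 = van_der_pol(shifted(k1, 0.5 * dt), mu)
    k3 = van_der_pol(shifted(k2, 0.5 * dt), mu)
    k4 = van_der_pol(shifted(k3, dt), mu)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(times, initial_state, mu=1.0, max_dt=0.01):
    """Return the state at each (sorted, non-negative) observation time."""
    state = tuple(initial_state)
    t = 0.0
    trajectory = []
    for target in times:
        # Split each irregular gap into equal sub-steps no longer than max_dt.
        n_sub = max(1, math.ceil((target - t) / max_dt))
        dt = (target - t) / n_sub
        for _ in range(n_sub):
            state = rk4_step(state, dt, mu)
        t = target
        trajectory.append(state)
    return trajectory

if __name__ == "__main__":
    rng = random.Random(42)
    # Irregular sampling: 20 observation times scattered over 24 time units.
    times = sorted(rng.uniform(0.0, 24.0) for _ in range(20))
    for t, (x, v) in zip(times, simulate(times, (0.5, 0.0))):
        print(f"t={t:6.2f}  x={x:7.3f}  dx/dt={v:7.3f}")
```

In a random-effects setting, `mu` and the initial state would vary by person, and a fitting routine would call a solver like this once per candidate parameter value.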
Emulation Modeling with Bayesian Networks for Efficient Decision Support
NASA Astrophysics Data System (ADS)
Fienen, M. N.; Masterson, J.; Plant, N. G.; Gutierrez, B. T.; Thieler, E. R.
2012-12-01
Bayesian decision networks (BDNs) have long been used to provide decision support in systems that require explicit consideration of uncertainty; applications range from ecology to medical diagnostics and terrorism threat assessments. Until recently, however, few studies have applied BDNs to the study of groundwater systems. BDNs are particularly useful for representing real-world system variability by synthesizing a range of hydrogeologic situations within a single simulation. Because BDN output is cast in terms of probability (an output desired by decision makers), they explicitly incorporate the uncertainty of a system. BDNs can thus serve as a more efficient alternative to other uncertainty characterization methods such as computationally demanding Monte Carlo analyses and other methods restricted to linear model analyses. We present a unique application of a BDN to a groundwater modeling analysis of the hydrologic response of Assateague Island, Maryland to sea-level rise. Using both input and output variables of the modeled groundwater response to different sea-level rise (SLR) scenarios, the BDN predicts the probability of changes in the depth to fresh water, which exerts an important influence on physical and biological island evolution. Input variables included barrier-island width, maximum island elevation, and aquifer recharge. The variability of these inputs and their corresponding outputs is sampled along cross sections in a single model run to form an ensemble of input/output pairs. The BDN outputs, which are the posterior distributions of water table conditions for the sea-level rise scenarios, are evaluated through error analysis and cross-validation to assess both fit to training data and predictive power. The key benefit of using BDNs in groundwater modeling analyses is that they provide a method for distilling complex model results into predictions with associated uncertainty, which is useful to decision makers. Future efforts incorporate
Learning Bayesian Networks from Correlated Data
NASA Astrophysics Data System (ADS)
Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola
2016-05-01
Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster, and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.
Using Bayesian statistical methods to quantify uncertainty and variability in human physiologically-based pharmacokinetic (PBPK) model predictions for use in risk assessments requires prior distributions (priors), which characterize what is known or believed about parameters’ val...
Using Bayesian statistical methods to quantify uncertainty and variability in human PBPK model predictions for use in risk assessments requires prior distributions (priors), which characterize what is known or believed about parameters’ values before observing in vivo data. Expe...
A Bayesian view on acoustic model-based techniques for robust speech recognition
NASA Astrophysics Data System (ADS)
Maas, Roland; Huemmer, Christian; Sehr, Armin; Kellermann, Walter
2015-12-01
This article provides a unifying Bayesian view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well-known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated from an underlying observation model that relates clean and distorted feature vectors. By identifying and converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules. We thus summarize the various approaches as approximations or modifications of the same Bayesian decoding rule leading to a unified view on known derivations as well as to new formulations for certain approaches.
Lee, Sik-Yum; Song, Xin-Yuan
2004-05-01
Missing data are very common in behavioural and psychological research. In this paper, we develop a Bayesian approach in the context of a general nonlinear structural equation model with missing continuous and ordinal categorical data. In the development, the missing data are treated as latent quantities, and provision for the incompleteness of the data is made by a hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm. We show by means of a simulation study that the Bayesian estimates are accurate. A Bayesian model comparison procedure based on the Bayes factor and path sampling is proposed. The required observations from the posterior distribution for computing the Bayes factor are simulated by the hybrid algorithm in Bayesian estimation. Our simulation results indicate that the correct model is selected more frequently when the incomplete records are used in the analysis than when they are ignored. The methodology is further illustrated with a real data set from a study concerned with an AIDS preventative intervention for Filipina sex workers. PMID:15171804
Martinez, Josue G; Bohn, Kirsten M; Carroll, Raymond J; Morris, Jeffrey S
2013-06-01
We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the region-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are of the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible. PMID:23997376
A Bayesian hierarchical surrogate outcome model for multiple sclerosis.
Pozzi, Luca; Schmidli, Heinz; Ohlssen, David I
2016-07-01
The development of novel therapies in multiple sclerosis (MS) is one area where a range of surrogate outcomes are used in various stages of clinical research. While the aim of treatments in MS is to prevent disability, a clinical trial for evaluating a drug's effect on disability progression would require a large sample of patients with many years of follow-up. The early stage of MS is characterized by relapses. To reduce study size and duration, clinical relapses are accepted as primary endpoints in phase III trials. For phase II studies, the primary outcomes are typically lesion counts based on magnetic resonance imaging (MRI), as these are considerably more sensitive than clinical measures for detecting MS activity. Recently, Sormani and colleagues in 'Surrogate endpoints for EDSS worsening in multiple sclerosis' provided a systematic review and used weighted regression analyses to examine the role of either MRI lesions or relapses as trial level surrogate outcomes for disability. We build on this work by developing a Bayesian three-level model, accommodating the two surrogates and the disability endpoint, and properly taking into account that treatment effects are estimated with errors. Specifically, a combination of treatment effects based on MRI lesion count outcomes and clinical relapse was used to develop a study-level surrogate outcome model for the corresponding treatment effects based on disability progression. While the primary aim for developing this model was to support decision-making in drug development, the proposed model may also be considered for future validation. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27061897
Bayesian Safety Risk Modeling of Human-Flightdeck Automation Interaction
NASA Technical Reports Server (NTRS)
Ancel, Ersin; Shih, Ann T.
2015-01-01
Usage of automatic systems in airliners has increased fuel efficiency, added extra capabilities, enhanced safety and reliability, and improved passenger comfort since their introduction in the late 1980s. However, original automation benefits, including reduced flight crew workload, human errors or training requirements, were not achieved as originally expected. Instead, automation introduced new failure modes, redistributed and sometimes increased workload, brought in new cognitive and attention demands, and increased training requirements. Modern airliners have numerous flight modes, providing more flexibility (and inherently more complexity) to the flight crew. However, the price to pay for the increased flexibility is the need for increased mode awareness, as well as the need to supervise, understand, and predict automated system behavior. Also, over-reliance on automation is linked to manual flight skill degradation and complacency in commercial pilots. As a result, recent accidents involving human errors are often caused by the interactions between humans and the automated systems (e.g., the breakdown in man-machine coordination), deteriorated manual flying skills, and/or loss of situational awareness due to heavy dependence on automated systems. This paper describes the development of the increased complexity and reliance on automation baseline model, named FLAP for FLightdeck Automation Problems. The model development process starts with a comprehensive literature review followed by the construction of a framework comprised of high-level causal factors leading to an automation-related flight anomaly. The framework was then converted into a Bayesian Belief Network (BBN) using the Hugin Software v7.8. The effects of automation on flight crew are incorporated into the model, including flight skill degradation, increased cognitive demand and training requirements along with their interactions. Besides flight crew deficiencies, automation system
Estimating seabed scattering mechanisms via Bayesian model selection.
Steininger, Gavin; Dosso, Stan E; Holland, Charles W; Dettmer, Jan
2014-10-01
A quantitative inversion procedure is developed and applied to determine the dominant scattering mechanism (surface roughness and/or volume scattering) from seabed scattering-strength data. The classification system is based on trans-dimensional Bayesian inversion with the deviance information criterion used to select the dominant scattering mechanism. Scattering is modeled using first-order perturbation theory as due to one of three mechanisms: Interface scattering from a rough seafloor, volume scattering from a heterogeneous sediment layer, or mixed scattering combining both interface and volume scattering. The classification system is applied to six simulated test cases where it correctly identifies the true dominant scattering mechanism as having greater support from the data in five cases; the remaining case is indecisive. The approach is also applied to measured backscatter-strength data where volume scattering is determined as the dominant scattering mechanism. Comparison of inversion results with core data indicates the method yields both a reasonable volume heterogeneity size distribution and a good estimate of the sub-bottom depths at which scatterers occur. PMID:25324059
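The deviance information criterion used above for model selection can be illustrated with a minimal sketch. It uses the standard definition DIC = D(theta_bar) + 2 p_D, with p_D = D_bar - D(theta_bar), where D is the deviance (minus twice the log-likelihood). The Gaussian likelihood with known standard deviation, the function names, and the simulated data are all assumptions made for illustration; they are not the trans-dimensional scattering inversion of the abstract.

```python
import numpy as np

def deviance(mu, y, sigma=1.0):
    """Deviance (-2 * log-likelihood) of a Gaussian mean model, sigma known."""
    n = y.size
    return np.sum((y - mu) ** 2) / sigma**2 + n * np.log(2 * np.pi * sigma**2)

def dic(posterior_mu, y, sigma=1.0):
    """Return (DIC, p_D) from posterior samples of the mean."""
    d_bar = np.mean([deviance(m, y, sigma) for m in posterior_mu])  # posterior mean deviance
    d_hat = deviance(np.mean(posterior_mu), y, sigma)               # deviance at posterior mean
    p_d = d_bar - d_hat                                             # effective number of parameters
    return d_hat + 2.0 * p_d, p_d

# Hypothetical data and posterior samples (flat prior gives N(ybar, sigma^2 / n)).
rng = np.random.default_rng(0)
y = rng.normal(loc=1.5, scale=1.0, size=50)
samples = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(y.size), size=4000)
dic_value, p_d = dic(samples, y)
```

When comparing candidate models fitted to the same data, the model with the lower DIC has greater support; here p_D should come out close to 1, since a single parameter is estimated.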
NASA Astrophysics Data System (ADS)
Zhu, Qi; Burzykowski, Tomasz
2011-03-01
To reduce the influence of the between-spectra variability on the results of peptide quantification, one can consider the 18O-labeling approach. Ideally, with such a labeling technique, a mass shift of 4 Da of the isotopic distributions of peptides from the labeled sample is induced, which allows one to distinguish the two samples and to quantify the relative abundance of the peptides. It is worth noting, however, that the presence of small quantities of 16O and 17O atoms during the labeling step can cause incomplete labeling. In practice, ignoring incomplete labeling may result in biased estimation of the relative abundance of the peptide in the compared samples. A Markov model was developed to address this issue (Zhu, Valkenborg, Burzykowski. J. Proteome Res. 9, 2669-2677, 2010). The model assumed that the peak intensities were normally distributed with heteroscedasticity using a power-of-the-mean variance function. Such a dependence has been observed in practice. Alternatively, we formulate the model within the Bayesian framework. This opens the possibility to further extend the model by the inclusion of random effects that can be used to capture the biological/technical variability of the peptide abundance. The operational characteristics of the model were investigated by applications to real-life mass-spectrometry data sets and a simulation study.
Bayesian Model Averaging of Artificial Intelligence Models for Hydraulic Conductivity Estimation
NASA Astrophysics Data System (ADS)
Nadiri, A.; Chitsazan, N.; Tsai, F. T.; Asghari Moghaddam, A.
2012-12-01
This research presents a Bayesian artificial intelligence model averaging (BAIMA) method that incorporates multiple artificial intelligence (AI) models to estimate hydraulic conductivity and evaluate estimation uncertainties. Uncertainty in the AI model outputs stems from error in model input as well as non-uniqueness in selecting different AI methods. Using one single AI model tends to bias the estimation and underestimate uncertainty. BAIMA employs the Bayesian model averaging (BMA) technique to address the issue of using one single AI model for estimation. BAIMA estimates hydraulic conductivity by averaging the outputs of AI models according to their model weights. In this study, the model weights were determined using the Bayesian information criterion (BIC) that follows the parsimony principle. BAIMA calculates the within-model variances to account for uncertainty propagation from input data to AI model output. Between-model variances are evaluated to account for uncertainty due to model non-uniqueness. We employed Takagi-Sugeno fuzzy logic (TS-FL), artificial neural network (ANN) and neurofuzzy (NF) to estimate hydraulic conductivity for the Tasuj plain aquifer, Iran. BAIMA combined three AI models and produced a better fit than individual models. While NF was expected to be the best AI model owing to its utilization of both TS-FL and ANN models, the NF model is nearly discarded by the parsimony principle. The TS-FL model and the ANN model showed equal importance although their hydraulic conductivity estimates were quite different. This resulted in significant between-model variances that are normally ignored by using one AI model.
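The averaging step described above can be sketched as follows. This is a minimal illustration that assumes equal prior model probabilities and the common approximation in which each weight is proportional to exp(-0.5 * (BIC - min BIC)); the exact weighting used in the study may differ. The function name and all numbers are hypothetical and were chosen only to mimic the reported pattern: two equally weighted models with different estimates and a third that is nearly discarded.

```python
import numpy as np

def bma_from_bic(means, variances, bics):
    """Combine per-model estimates with BIC-derived weights (equal priors assumed)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    bics = np.asarray(bics, dtype=float)
    # Subtract the minimum BIC for numerical stability before exponentiating.
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()
    bma_mean = np.sum(w * means)
    within = np.sum(w * variances)                # propagated input uncertainty
    between = np.sum(w * (means - bma_mean) ** 2)  # model non-uniqueness
    return w, bma_mean, within, between

# Hypothetical estimates from three models (say TS-FL, ANN, NF).
weights, k_mean, var_within, var_between = bma_from_bic(
    means=[2.0, 4.0, 3.0],
    variances=[0.5, 0.5, 0.2],
    bics=[100.0, 100.0, 120.0],
)
total_variance = var_within + var_between
```

In this toy case the between-model term is larger than the within-model term, which is the component lost entirely when only a single model is retained.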
Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Datta, Susmita; Payne, Samuel H.; Kang, Jiyun; Bramer, Lisa M.; Nicora, Carrie D.; Shukla, Anil K.; Metz, Thomas O.; Rodland, Karin D.; Smith, Richard D.; Tardiff, Mark F.; McDermott, Jason E.; Pounds, Joel G.; Waters, Katrina M.
2014-12-01
As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, protein quantification estimation from the native measured peptides has become a computational task. A limitation to existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptide signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves accuracy similar to that of the current state-of-the-art methods at proteoform identification with significantly better specificity. BP-Quant is available as MATLAB® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.
Inherently irrational? A computational model of escalation of commitment as Bayesian Updating.
Gilroy, Shawn P; Hantula, Donald A
2016-06-01
Monte Carlo simulations were performed to analyze the degree to which two-, three- and four-step learning histories of losses and gains correlated with escalation and persistence in extended extinction (continuous loss) conditions. Simulated learning histories were randomly generated at varying lengths and compositions and warranted probabilities were determined using Bayesian Updating methods. Bayesian Updating predicted instances where particular learning sequences were more likely to engender escalation and persistence under extinction conditions. All simulations revealed greater rates of escalation and persistence in the presence of heterogeneous (e.g., both Wins and Losses) lag sequences, with substantially increased rates of escalation when lags comprised predominantly of losses were followed by wins. These methods were then applied to human investment choices in earlier experiments. The Bayesian Updating models corresponded with data obtained from these experiments. These findings suggest that Bayesian Updating can be utilized as a model for understanding how and when individual commitment may escalate and persist despite continued failures. PMID:26945510
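As a rough illustration of how a short history of wins and losses can be turned into an updated probability, the sketch below applies conjugate Beta-Bernoulli updating. This is an assumed stand-in: the abstract does not specify the authors' updating scheme, and the function name, the uniform Beta(1, 1) prior, and the example history are hypothetical.

```python
def update_win_probability(history, prior_wins=1.0, prior_losses=1.0):
    """Posterior mean probability of a win after each outcome in `history`.

    `history` is a sequence of "W" (win) and "L" (loss); the prior is
    Beta(prior_wins, prior_losses), which is an assumption of this sketch.
    """
    a, b = prior_wins, prior_losses
    trajectory = []
    for outcome in history:
        if outcome == "W":
            a += 1.0
        else:
            b += 1.0
        trajectory.append(a / (a + b))  # mean of Beta(a, b)
    return trajectory

# A three-step history in which losses are followed by a win.
traj = update_win_probability(["L", "L", "W"])
```

The final win raises the estimated probability again after two losses, which is the kind of heterogeneous sequence the abstract associates with escalation.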
Ice Shelf Modeling: A Cross-Polar Bayesian Statistical Approach
NASA Astrophysics Data System (ADS)
Kirchner, N.; Furrer, R.; Jakobsson, M.; Zwally, H. J.
2010-12-01
Ice streams interlink glacial terrestrial and marine environments: embedded in a grounded inland ice such as the Antarctic Ice Sheet or the paleo ice sheets covering extensive parts of the Eurasian and Amerasian Arctic respectively, ice streams are major drainage agents facilitating the discharge of substantial portions of continental ice into the ocean. At their seaward side, ice streams can either extend onto the ocean as floating ice tongues (such as the Drygalski Ice Tongue/East Antarctica), or feed large ice shelves (as is the case for e.g. the Siple Coast and the Ross Ice Shelf/West Antarctica). The flow behavior of ice streams has been recognized to be intimately linked with configurational changes in their attached ice shelves; in particular, ice shelf disintegration is associated with rapid ice stream retreat and increased mass discharge from the continental ice mass, contributing eventually to sea level rise. Investigations of ice stream retreat mechanisms are however incomplete if based on terrestrial records only: rather, the dynamics of ice shelves (and, eventually, the impact of the ocean on the latter) must be accounted for. However, since floating ice shelves leave hardly any traces behind when melting, uncertainty regarding the spatio-temporal distribution and evolution of ice shelves in times prior to instrumented and recorded observation is high, calling thus for a statistical modeling approach. Complementing ongoing large-scale numerical modeling efforts (Pollard & DeConto, 2009), we model the configuration of ice shelves by using a Bayesian Hierarchical Modeling (BHM) approach. We adopt a cross-polar perspective accounting for the fact that currently, ice shelves exist mainly along the coastline of Antarctica (and are virtually non-existent in the Arctic), while Arctic Ocean ice shelves repeatedly impacted the Arctic Ocean basin during former glacial periods. Modeled Arctic Ocean ice shelf configurations are compared with geological spatial
Bayesian Model Comparison for the Order Restricted RC Association Model
ERIC Educational Resources Information Center
Iliopoulos, G.; Kateri, M.; Ntzoufras, I.
2009-01-01
Association models constitute an attractive alternative to the usual log-linear models for modeling the dependence between classification variables. They impose special structure on the underlying association by assigning scores on the levels of each classification variable, which can be fixed or parametric. Under the general row-column (RC)…
Hierarchical Bayesian Model Averaging for Chance Constrained Remediation Designs
NASA Astrophysics Data System (ADS)
Chitsazan, N.; Tsai, F. T.
2012-12-01
Groundwater remediation designs rely heavily on simulation models, which are subject to various sources of uncertainty in their predictions. To develop a robust remediation design, it is crucial to understand the effect of uncertainty sources. In this research, we introduce a hierarchical Bayesian model averaging (HBMA) framework to segregate and prioritize sources of uncertainty in a multi-layer frame, where each layer targets a source of uncertainty. The HBMA framework provides insight into uncertainty priorities and propagation. In addition, HBMA allows evaluating model weights in different hierarchy levels and assessing the relative importance of models in each level. To account for uncertainty, we employ chance-constrained (CC) programming for stochastic remediation design. Chance-constrained programming was implemented traditionally to account for parameter uncertainty. Recently, many studies suggested that model structure uncertainty is not negligible compared to parameter uncertainty. Using chance-constrained programming along with HBMA can provide a rigorous tool for groundwater remediation designs under uncertainty. In this research, the HBMA-CC was applied to a remediation design in a synthetic aquifer. The design was to develop a scavenger well approach to mitigate saltwater intrusion toward production wells. HBMA was employed to assess uncertainties from model structure, parameter estimation and kriging interpolation. An improved harmony search optimization method was used to find the optimal location of the scavenger well. We evaluated prediction variances of chloride concentration at the production wells through the HBMA framework. The results showed that choosing the single best model may lead to a significant error in evaluating prediction variances for two reasons. First, considering the single best model, variances that stem from uncertainty in the model structure will be ignored. Second, considering the best model with non
Bayesian estimation of regularization parameters for deformable surface models
Cunningham, G.S.; Lehovich, A.; Hanson, K.M.
1999-02-20
In this article the authors build on their past attempts to reconstruct a 3D, time-varying bolus of radiotracer from first-pass data obtained by the dynamic SPECT imager, FASTSPECT, built by the University of Arizona. The object imaged is a CardioWest total artificial heart. The bolus is entirely contained in one ventricle and its associated inlet and outlet tubes. The model for the radiotracer distribution at a given time is a closed surface parameterized by 482 vertices that are connected to make 960 triangles, with nonuniform intensity variations of radiotracer allowed inside the surface on a voxel-to-voxel basis. The total curvature of the surface is minimized through the use of a weighted prior in the Bayesian framework, as is the weighted norm of the gradient of the voxellated grid. MAP estimates for the vertices, interior intensity voxels and background count level are produced. The strengths of the priors (the hyperparameters) are determined by maximizing the probability of the data given the hyperparameters, called the evidence. The evidence is calculated by first assuming that the posterior is approximately normal in the values of the vertices and voxels, and then by evaluating the integral of the multi-dimensional normal distribution. This integral (which requires evaluating the determinant of a covariance matrix) is computed by applying a recent algorithm from Bai et al. that calculates the needed determinant efficiently. They demonstrate that the radiotracer is highly inhomogeneous in early time frames, as suspected in earlier reconstruction attempts that assumed a uniform intensity of radiotracer within the closed surface, and that the optimal choice of hyperparameters is substantially different for different time frames.
Yi, Nengjun; Shriner, Daniel; Banerjee, Samprit; Mehta, Tapan; Pomp, Daniel; Yandell, Brian S.
2007-01-01
We extend our Bayesian model selection framework for mapping epistatic QTL in experimental crosses to include environmental effects and gene–environment interactions. We propose a new, fast Markov chain Monte Carlo algorithm to explore the posterior distribution of unknowns. In addition, we take advantage of any prior knowledge about genetic architecture to increase posterior probability on more probable models. These enhancements have significant computational advantages in models with many effects. We illustrate the proposed method by detecting new epistatic and gene–sex interactions for obesity-related traits in two real data sets of mice. Our method has been implemented in the freely available package R/qtlbim (http://www.qtlbim.org) to facilitate the general usage of the Bayesian methodology for genomewide interacting QTL analysis. PMID:17483424
A Bayesian Model for Determining Levels of Student Mastery.
ERIC Educational Resources Information Center
Schmalz, Steve W.; Cartledge, Carolyn M.
During the last decade the use of Bayesian statistical methods has become quite prevalent in the educational community. Yet, like most statistical techniques, little has been written concerning the application of these methods to the classroom setting. The purpose of this paper is to help correct such a deficiency in the literature by developing a…
Bayesian Modeling in Institutional Research: An Example of Nonlinear Classification
ERIC Educational Resources Information Center
Xu, Yonghong Jade; Ishitani, Terry T.
2008-01-01
In recent years, rapid advancement has taken place in computing technology that allows institutional researchers to efficiently and effectively address data of increasing volume and structural complexity (Luan, 2002). In this chapter, the authors propose a new data analytical technique, Bayesian belief networks (BBN), to add to the toolbox for…
Model Criticism of Bayesian Networks with Latent Variables.
ERIC Educational Resources Information Center
Williamson, David M.; Mislevy, Robert J.; Almond, Russell G.
This study investigated statistical methods for identifying errors in Bayesian networks (BN) with latent variables, as found in intelligent cognitive assessments. BN, commonly used in artificial intelligence systems, are promising mechanisms for scoring constructed-response examinations. The success of an intelligent assessment or tutoring system…
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Open Source Bayesian Models. 3. Composite Models for Prediction of Binned Responses
2016-01-01
Bayesian models constructed from structure-derived fingerprints have been a popular and useful method for drug discovery research when applied to bioactivity measurements that can be effectively classified as active or inactive. The results can be used to rank candidate structures according to their probability of activity, and this ranking benefits from the high degree of interpretability when structure-based fingerprints are used, making the results chemically intuitive. Besides selecting an activity threshold, building a Bayesian model is fast and requires few or no parameters or user intervention. The method also does not suffer from such acute overtraining problems as quantitative structure–activity relationships or quantitative structure–property relationships (QSAR/QSPR). This makes it an approach highly suitable for automated workflows that are independent of user expertise or prior knowledge of the training data. We now describe a new method for creating a composite group of Bayesian models to extend the method to work with multiple states, rather than just binary. Incoming activities are divided into bins, each covering a mutually exclusive range of activities. For each of these bins, a Bayesian model is created to model whether or not the compound belongs in the bin. Analyzing putative molecules using the composite model involves making a prediction for each bin and examining the relative likelihood for each assignment, for example, highest value wins. The method has been evaluated on a collection of hundreds of data sets extracted from ChEMBL v20 and validated data sets for ADME/Tox and bioactivity. PMID:26750305
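The composite scheme described above (divide activities into bins, fit one "in this bin or not" model per bin, then take the highest-scoring bin) can be sketched as below. This is an illustrative stand-in only: it uses a plain Bernoulli naive Bayes classifier with add-one smoothing rather than whatever Bayesian variant the authors implemented, and the three-bit "fingerprints", activity values, bin edges, and function names are all hypothetical.

```python
import numpy as np

def fit_bin_model(X, in_bin, alpha=1.0):
    """Bernoulli naive Bayes for 'in this bin' versus 'not in this bin'."""
    params = {}
    for label in (True, False):
        rows = X[in_bin == label]
        # Smoothed probability that each fingerprint bit is set in this class.
        p_bit = (rows.sum(axis=0) + alpha) / (rows.shape[0] + 2.0 * alpha)
        prior = (rows.shape[0] + alpha) / (X.shape[0] + 2.0 * alpha)
        params[label] = (np.log(prior), np.log(p_bit), np.log1p(-p_bit))
    return params

def bin_score(params, x):
    """Log-odds that fingerprint x belongs to the bin."""
    def log_joint(label):
        log_prior, log_p, log_q = params[label]
        return log_prior + np.sum(np.where(x == 1, log_p, log_q))
    return log_joint(True) - log_joint(False)

def predict_bin(models, x):
    """Score x against every bin model; the highest value wins."""
    scores = [bin_score(m, x) for m in models]
    return int(np.argmax(scores)), scores

# Hypothetical toy data: six compounds, three fingerprint bits.
X = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
activity = np.array([0.5, 0.8, 5.0, 6.0, 50.0, 60.0])
edges = [1.0, 10.0]                    # three mutually exclusive bins
bins = np.digitize(activity, edges)    # bin index per compound
models = [fit_bin_model(X, bins == b) for b in range(len(edges) + 1)]
best, scores = predict_bin(models, np.array([0, 1, 0]))
```

Comparing the per-bin scores, rather than only reading off the winner, gives a sense of how decisive the assignment is.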
A Bayesian Model for Pooling Gene Expression Studies That Incorporates Co-Regulation Information
Conlon, Erin M.; Postier, Bradley L.; Methé, Barbara A.; Nevin, Kelly P.; Lovley, Derek R.
2012-01-01
Current Bayesian microarray models that pool multiple studies assume gene expression is independent of other genes. However, in prokaryotic organisms, genes are arranged in units that are co-regulated (called operons). Here, we introduce a new Bayesian model for pooling gene expression studies that incorporates operon information into the model. Our Bayesian model borrows information from other genes within the same operon to improve estimation of gene expression. The model produces the gene-specific posterior probability of differential expression, which is the basis for inference. We found in simulations and in biological studies that incorporating co-regulation information improves upon the independence model. We assume that each study contains two experimental conditions: a treatment and control. We note that there exist environmental conditions for which genes that are supposed to be transcribed together lose their operon structure, and that our model is best carried out for known operon structures. PMID:23284902
ERIC Educational Resources Information Center
Karl, Andrew T.; Yang, Yan; Lohr, Sharon L.
2013-01-01
Value-added models have been widely used to assess the contributions of individual teachers and schools to students' academic growth based on longitudinal student achievement outcomes. There is concern, however, that ignoring the presence of missing values, which are common in longitudinal studies, can bias teachers' value-added scores.…
Ksantini, Riadh; Ziou, Djemel; Colin, Bernard; Dubeau, François
2008-02-01
In this paper, we investigate the effectiveness of a Bayesian logistic regression model to compute the weights of a pseudo-metric, in order to improve its discriminatory capacity and thereby increase image retrieval accuracy. In the proposed Bayesian model, the prior knowledge of the observations is incorporated and the posterior distribution is approximated by a tractable Gaussian form using variational transformation and Jensen's inequality, which allow a fast and straightforward computation of the weights. The pseudo-metric makes use of the compressed and quantized versions of wavelet decomposed feature vectors, and in our previous work, the weights were adjusted by classical logistic regression model. A comparative evaluation of the Bayesian and classical logistic regression models is performed for content-based image retrieval as well as for other classification tasks, in a decontextualized evaluation framework. In this same framework, we compare the Bayesian logistic regression model to some relevant state-of-the-art classification algorithms. Experimental results show that the Bayesian logistic regression model outperforms these linear classification algorithms, and is a significantly better tool than the classical logistic regression model to compute the pseudo-metric weights and improve retrieval and classification performance. Finally, we perform a comparison with results obtained by other retrieval methods. PMID:18084057
Parameterizing Bayesian network Representations of Social-Behavioral Models by Expert Elicitation
Walsh, Stephen J.; Dalton, Angela C.; Whitney, Paul D.; White, Amanda M.
2010-05-23
Bayesian networks provide a general framework with which to model many natural phenomena. The mathematical nature of Bayesian networks enables a plethora of model validation and calibration techniques, e.g., parameter estimation, goodness-of-fit tests, and diagnostic checking of the model assumptions. However, they are not free of shortcomings. Parameter estimation from relevant extant data is a common approach to calibrating the model parameters. In practice, it is not uncommon to lack adequate data to reliably estimate all model parameters. In this paper we present the early development of a novel application of conjoint analysis as a method for eliciting and modeling expert opinions and using the results in a methodology for calibrating the parameters of a Bayesian network.
Placek, Ben; Knuth, Kevin H.; Angerhausen, Daniel
2014-11-10
EXONEST is an algorithm dedicated to detecting and characterizing the photometric signatures of exoplanets, which include reflection and thermal emission, Doppler boosting, and ellipsoidal variations. Using Bayesian inference, we can test between competing models that describe the data as well as estimate model parameters. We demonstrate this approach by testing circular versus eccentric planetary orbital models, as well as testing for the presence or absence of four photometric effects. In addition to using Bayesian model selection, a unique aspect of EXONEST is the potential capability to distinguish between reflective and thermal contributions to the light curve. A case study is presented using Kepler data recorded from the transiting planet KOI-13b. By considering only the nontransiting portions of the light curve, we demonstrate that it is possible to estimate the photometrically relevant model parameters of KOI-13b. Furthermore, Bayesian model testing confirms that the orbit of KOI-13b has a detectable eccentricity.
A Bayesian hierarchical diffusion model decomposition of performance in Approach–Avoidance Tasks
Krypotos, Angelos-Miltiadis; Beckers, Tom; Kindt, Merel; Wagenmakers, Eric-Jan
2015-01-01
Common methods for analysing response time (RT) tasks, frequently used across different disciplines of psychology, suffer from a number of limitations such as the failure to directly measure the underlying latent processes of interest and the inability to take into account the uncertainty associated with each individual's point estimate of performance. Here, we discuss a Bayesian hierarchical diffusion model and apply it to RT data. This model allows researchers to decompose performance into meaningful psychological processes and to account optimally for individual differences and commonalities, even with relatively sparse data. We highlight the advantages of the Bayesian hierarchical diffusion model decomposition by applying it to performance on Approach–Avoidance Tasks, widely used in the emotion and psychopathology literature. Model fits for two experimental data-sets demonstrate that the model performs well. The Bayesian hierarchical diffusion model overcomes important limitations of current analysis procedures and provides deeper insight into latent psychological processes of interest. PMID:25491372
Number-Knower Levels in Young Children: Insights from Bayesian Modeling
ERIC Educational Resources Information Center
Lee, Michael D.; Sarnecka, Barbara W.
2011-01-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One,…
Carter, Barry; Clarke, William; Ardery, Gail; Weber, Cynthia; James, Paul; Weg, Mark Vander; Chrischilles, Elizabeth; Vaughn, Thomas; Egan, Brent
2010-01-01
Background Numerous studies have demonstrated the value of team-based care to improve blood pressure (BP) control but there is limited information on whether these models would be adopted in diverse populations. The purpose of this study is to evaluate whether a collaborative model between physicians and pharmacists can improve BP control in multiple primary care medical offices with diverse geographic and patient characteristics and whether long-term BP control can be sustained. Methods This study is a randomized prospective trial in 27 primary care offices first stratified by the percent of under-represented minorities and the level of clinical pharmacy services within the office. Each medical office was then randomized to either a 9 or 24 month intervention or to a control group. Patients will be enrolled in this study until 2012. Conclusions The results of this study should provide information on whether this model can be implemented in large numbers of diverse offices, if it is effective in diverse populations and whether BP control can be sustained long-term. PMID:20647575
A Bayesian approach to model structural error and input variability in groundwater modeling
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.
2015-12-01
Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative priors for the error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has a substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how best to direct resources towards reducing predictive uncertainty.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
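To make the contrast between the parameter-free posterior and a distorted variant concrete, here is a small sketch for a two-urn task. The one-parameter linear-in-log-odds weighting function, the function names, and all numbers are assumptions chosen for illustration; they are not the five models fitted in the study.

```python
import math

def posterior_urn_a(prior_a, p_red_a, p_red_b, n_red, n_blue):
    """P(urn A | draws) for draws with replacement from one of two urns."""
    log_a = (math.log(prior_a)
             + n_red * math.log(p_red_a) + n_blue * math.log(1 - p_red_a))
    log_b = (math.log(1 - prior_a)
             + n_red * math.log(p_red_b) + n_blue * math.log(1 - p_red_b))
    return 1.0 / (1.0 + math.exp(log_b - log_a))

def weight(p, gamma):
    """Scale the log-odds of p by gamma.

    gamma = 1 is the identity, gamma < 1 gives an inverted S-shape,
    and gamma = 0 maps every probability to 0.5.
    """
    return p ** gamma / (p ** gamma + (1 - p) ** gamma)

def distorted_posterior(prior_a, p_red_a, p_red_b, n_red, n_blue,
                        g_prior, g_like):
    """Bayes' rule applied to subjectively distorted priors and likelihoods."""
    return posterior_urn_a(weight(prior_a, g_prior),
                           weight(p_red_a, g_like),
                           weight(p_red_b, g_like),
                           n_red, n_blue)

# Hypothetical trial: prior 0.9 for urn A, 70% vs 30% red balls, one red draw.
bayes = posterior_urn_a(0.9, 0.7, 0.3, 1, 0)
neglect = distorted_posterior(0.9, 0.7, 0.3, 1, 0, g_prior=0.0, g_like=1.0)
```

In this parameterization, setting `g_prior` to 0 flattens the prior and so reproduces base rate neglect, while letting the two exponents vary by person is one simple way to express individual differences in distortion.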
Bayesian Nonparametric Inference – Why and How
Müller, Peter; Mitra, Riten
2013-01-01
We review inference under models with nonparametric Bayesian (BNP) priors. The discussion follows a set of examples for some common inference problems. The examples are chosen to highlight problems that are challenging for standard parametric inference. We discuss inference for density estimation, clustering, regression and for mixed effects models with random effects distributions. While we focus on arguing for the need for the flexibility of BNP models, we also review some of the more commonly used BNP models, thus hopefully answering a bit of both questions, why and how to use BNP. PMID:24368932
On Numerical Aspects of Bayesian Model Selection in High and Ultrahigh-dimensional Settings
Johnson, Valen E.
2014-01-01
This article examines the convergence properties of a Bayesian model selection procedure based on a non-local prior density in ultrahigh-dimensional settings. The performance of the model selection procedure is also compared to popular penalized likelihood methods. Coupling diagnostics are used to bound the total variation distance between iterates in a Markov chain Monte Carlo (MCMC) algorithm and the posterior distribution on the model space. In several simulation scenarios in which the number of observations exceeds 100, rapid convergence and high accuracy of the Bayesian procedure are demonstrated. Conversely, the coupling diagnostics are successful in diagnosing lack of convergence in several scenarios for which the number of observations is less than 100. The accuracy of the Bayesian model selection procedure in identifying high probability models is shown to be comparable to commonly used penalized likelihood methods, including extensions of smoothly clipped absolute deviations (SCAD) and least absolute shrinkage and selection operator (LASSO) procedures. PMID:24683431
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-12-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible.
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-01-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible. PMID:25745272
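The three routes to Bayesian model evidence named in this abstract can be compared on a toy problem. The sketch below assumes a normal model with known noise level and a normal prior on the mean, for which the evidence has a closed form. It then estimates the same quantity by brute-force Monte Carlo over prior draws and by a BIC-style approximation. The data, prior settings, and function names are hypothetical and unrelated to the hydrological models in the study.

```python
import math
import random

def log_likelihood(y, mu, sigma):
    """Gaussian log-likelihood of data y for mean mu and known sigma."""
    n = len(y)
    ss = sum((v - mu) ** 2 for v in y)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)

def log_bme_exact(y, sigma, m0, s0):
    """Closed-form log evidence for the prior mu ~ N(m0, s0^2)."""
    n = len(y)
    ybar = sum(y) / n
    s = sum((v - ybar) ** 2 for v in y)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - 0.5 * math.log(1 + n * s0 ** 2 / sigma ** 2)
            - s / (2 * sigma ** 2)
            - n * (ybar - m0) ** 2 / (2 * (sigma ** 2 + n * s0 ** 2)))

def log_bme_monte_carlo(y, sigma, m0, s0, n_draws, rng):
    """Average the likelihood over prior draws, using log-sum-exp for stability."""
    logs = [log_likelihood(y, rng.gauss(m0, s0), sigma) for _ in range(n_draws)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs) / n_draws)

def log_bme_bic(y, sigma):
    """BIC-style approximation: maximized log-likelihood minus (k/2) log n, k = 1."""
    n = len(y)
    mu_hat = sum(y) / n
    return log_likelihood(y, mu_hat, sigma) - 0.5 * math.log(n)

y = [0.8, 1.3, 0.4, 1.9, 1.1]  # hypothetical observations
exact = log_bme_exact(y, 1.0, 0.0, 2.0)
approx = log_bme_monte_carlo(y, 1.0, 0.0, 2.0, 20000, random.Random(0))
bic = log_bme_bic(y, 1.0)
```

The BIC-style value ignores the prior entirely, which is one reason information criteria can disagree with the evidence they are meant to approximate. The Monte Carlo estimate converges to the exact value but needs many likelihood evaluations.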
Bayesian shared frailty models for regional inference about wildlife survival
Heisey, D.M.
2012-01-01
One can joke that 'exciting statistics' is an oxymoron, but it is neither a joke nor an exaggeration to say that these are exciting times to be involved in statistical ecology. As Halstead et al.'s (2012) paper nicely exemplifies, recently developed Bayesian analyses can now be used to extract insights from data using techniques that would have been unavailable to the ecological researcher just a decade ago. Some object to this, implying that the subjective priors of the Bayesian approach are the pathway to perdition (e.g. Lele & Dennis, 2009). It is reasonable to ask whether these new approaches are really giving us anything that we could not obtain with traditional tried-and-true frequentist approaches. I believe the answer is a clear yes.
ERIC Educational Resources Information Center
Natesan, Prathiba; Limbers, Christine; Varni, James W.
2010-01-01
The present study presents the formulation of graded response models in the multilevel framework (as nonlinear mixed models) and demonstrates their use in estimating item parameters and investigating the group-level effects for specific covariates using Bayesian estimation. The graded response multilevel model (GRMM) combines the formulation of…
Bayesian Comparison of Alternative Graded Response Models for Performance Assessment Applications
ERIC Educational Resources Information Center
Zhu, Xiaowen; Stone, Clement A.
2012-01-01
This study examined the relative effectiveness of Bayesian model comparison methods in selecting an appropriate graded response (GR) model for performance assessment applications. Three popular methods were considered: deviance information criterion (DIC), conditional predictive ordinate (CPO), and posterior predictive model checking (PPMC). Using…
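As a reminder of how the first of these criteria is computed from MCMC output, the sketch below evaluates DIC as the posterior mean deviance plus the effective number of parameters p_D, where p_D is the mean deviance minus the deviance at the posterior mean. The one-parameter normal model and the draws are hypothetical stand-ins, not a graded response model.

```python
import math

def deviance(y, mu, sigma=1.0):
    """Deviance D = -2 log L for a normal model with known sigma."""
    n = len(y)
    return (n * math.log(2 * math.pi * sigma ** 2)
            + sum((v - mu) ** 2 for v in y) / sigma ** 2)

def dic(y, draws):
    """Return (DIC, p_D) from posterior draws of the mean parameter."""
    d_bar = sum(deviance(y, m) for m in draws) / len(draws)
    mu_bar = sum(draws) / len(draws)
    p_d = d_bar - deviance(y, mu_bar)  # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical data and two posterior draws, kept tiny so the result is easy to check.
y = [1.0, 1.0]
value, p_d = dic(y, [0.9, 1.1])
```

Lower DIC is preferred when comparing models fitted to the same data; CPO and PPMC need the full posterior predictive distribution and are not sketched here.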
NASA Astrophysics Data System (ADS)
Kim, Tae-Jeong; Kim, Ki-Young; Shin, Dong-Hoon; Kwon, Hyun-Han
2015-04-01
It has been widely acknowledged that appropriate simulation of natural streamflow at ungauged sites is one of the fundamental challenges facing the hydrology community. In particular, the key to reliable runoff simulation in ungauged basins is a reliable rainfall-runoff model and reliable parameter estimation. In general, parameter estimation in rainfall-runoff models is a complex issue because of insufficient hydrologic data. This study aims to regionalize the parameters of a continuous rainfall-runoff model in conjunction with Bayesian statistical techniques to facilitate uncertainty analysis. First, this study applies a Bayesian Markov chain Monte Carlo scheme to the Sacramento rainfall-runoff model, which has been widely used around the world. The Sacramento model is calibrated against daily runoff observations; thirteen parameters of the model are optimized, and posterior distributions for each parameter are derived. Second, we apply a Bayesian generalized linear regression model to the set of parameters and basin characteristics (e.g., area and slope) to obtain a functional relationship between pairs of variables. The proposed model was validated in two gauged watersheds in accordance with efficiency criteria such as the Nash-Sutcliffe efficiency, coefficient of efficiency, index of agreement, and coefficient of correlation. Future work will focus on uncertainty analysis to fully incorporate propagation of the uncertainty into the regionalization framework. KEYWORDS: Ungauged, Parameter, Sacramento, Generalized linear model, Regionalization. Acknowledgement: This research was supported by a Grant (13SCIPA01) from the Smart Civil Infrastructure Research Program funded by the Ministry of Land, Infrastructure and Transport (MOLIT) of the Korean government and the Korea Agency for Infrastructure Technology Advancement (KAIA).
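Two of the efficiency criteria named above have simple closed forms. The sketch below computes the Nash-Sutcliffe efficiency and Willmott's index of agreement from paired observed and simulated flows. The function names and flow values are hypothetical, and the exact variants used in the study are not stated in the abstract.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((o - s)^2) / sum((o - mean(o))^2); 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def index_of_agreement(observed, simulated):
    """Willmott's d = 1 - sum((o - s)^2) / sum((|s - mean(o)| + |o - mean(o)|)^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
              for o, s in zip(observed, simulated))
    return 1.0 - num / den

obs = [1.0, 2.0, 3.0, 4.0]   # hypothetical daily runoff
sim = [1.5, 2.5, 3.5, 4.5]   # hypothetical simulation with a constant bias
nse = nash_sutcliffe(obs, sim)
d = index_of_agreement(obs, sim)
```

An NSE of 0 means the simulation is no better than predicting the observed mean, which makes it a natural baseline when judging regionalized parameters at a gauged validation site.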
D. L. Kelly
2007-06-01
Markov chain Monte Carlo (MCMC) techniques represent an extremely flexible and powerful approach to Bayesian modeling. This work illustrates the application of such techniques to time-dependent reliability of components with repair. The WinBUGS package is used to illustrate, via examples, how Bayesian techniques can be used for parametric statistical modeling of time-dependent component reliability. Additionally, the crucial but often overlooked subject of model validation is discussed, and summary statistics for judging the model’s ability to replicate the observed data are developed, based on the posterior predictive distribution for the parameters of interest.
NASA Astrophysics Data System (ADS)
Werner, J. P.; Tingley, M. P.
2014-12-01
Reconstructions of late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurement on tree rings, ice cores, and varved lake sediments. Considerable advances may be achievable if time uncertain proxies could be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches to accounting for time uncertainty are generally limited to repeating the reconstruction using each of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here we demonstrate how Bayesian Hierarchical climate reconstruction models can be augmented to account for time uncertain proxies. Critically, while a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age-model probabilities decreases uncertainty in the climate reconstruction, as compared with the current de-facto standard of sampling over all age models, provided there is sufficient information from other data sources in the region of the time-uncertain proxy. This approach can readily be generalized to non-layer counted proxies, such as those derived from marine sediments.
NASA Astrophysics Data System (ADS)
Werner, J. P.; Tingley, M. P.
2015-03-01
Reconstructions of the late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurements of tree rings, ice cores, and varved lake sediments. Considerable advances could be achieved if time-uncertain proxies were able to be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches for accounting for time uncertainty are generally limited to repeating the reconstruction using each one of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here, we demonstrate how Bayesian hierarchical climate reconstruction models can be augmented to account for time-uncertain proxies. Critically, although a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the resulting reconstructions, as compared with the current de facto standard of sampling over all age models, provided there is sufficient information from other data sources in the spatial region of the time-uncertain proxy. This approach can readily be generalized to non-layer-counted proxies, such as those derived from marine sediments.
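The re-weighting step can be illustrated with a deliberately small example. The sketch assumes a discrete ensemble of candidate age models, each a whole-step offset that aligns a short proxy series with a reference climate series. It starts from equal prior probabilities and updates them with a Gaussian likelihood. The series, the noise level, and the function names are hypothetical; the full hierarchical model described in the abstract is not reproduced.

```python
import math

def gaussian_log_likelihood(proxy, reference, offset, sigma):
    """Log-likelihood of the proxy when proxy[i] is dated to reference[i + offset]."""
    return sum(-0.5 * ((p - reference[i + offset]) / sigma) ** 2
               - math.log(sigma * math.sqrt(2 * math.pi))
               for i, p in enumerate(proxy))

def posterior_age_model_weights(log_likelihoods, prior=None):
    """Posterior probability of each age model: prior times likelihood, normalized."""
    k = len(log_likelihoods)
    prior = prior or [1.0 / k] * k           # equal weighting a priori
    logs = [math.log(p) + ll for p, ll in zip(prior, log_likelihoods)]
    top = max(logs)                           # subtract the maximum for stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

reference = [0.0, 0.2, 0.9, 0.1, -0.4, -0.8, 0.3, 0.5]  # hypothetical climate signal
proxy = [0.85, 0.15, -0.35, -0.75]                      # resembles reference[2:6]
offsets = [0, 1, 2, 3, 4]                               # candidate age models
lls = [gaussian_log_likelihood(proxy, reference, k, 0.3) for k in offsets]
weights = posterior_age_model_weights(lls)
```

When the reference carries little information, the log-likelihoods are nearly equal and the weights stay close to uniform, which matches the abstract's caveat that the gain depends on sufficient information from other data sources.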
NASA Astrophysics Data System (ADS)
Werner, Johannes; Tingley, Martin
2015-04-01
Reconstructions of late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurement on tree rings, ice cores, and varved lake sediments. Considerable advances may be achievable if time uncertain proxies could be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches to accounting for time uncertainty are generally limited to repeating the reconstruction using each of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here we demonstrate how Bayesian Hierarchical climate reconstruction models can be augmented to account for time uncertain proxies. Critically, while a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the climate reconstruction, as compared with the current de-facto standard of sampling over all age models, provided there is sufficient information from other data sources in the region of the time-uncertain proxy. This approach can readily be generalized to non-layer counted proxies, such as those derived from marine sediments. Werner and Tingley, Climate of the Past Discussions (2014)
Bayesian model selection for a finite element model of a large civil aircraft
Hemez, F. M.; Rutherford, A. C.
2004-01-01
Nine aircraft stiffness parameters have been varied and used as inputs to a finite element model of an aircraft to generate natural frequency and deflection features (Goge, 2003). This data set (147 input parameter configurations and associated outputs) is now used to generate a metamodel, or a fast running surrogate model, using Bayesian model selection methods. Once a forward relationship is defined, the metamodel may be used in an inverse sense. That is, knowing the measured output frequencies and deflections, what were the input stiffness parameters that caused them?
Kapur, Kush; Li, Xue; Blood, Emily A; Hedeker, Donald
2015-02-20
In the statistical literature, methods for understanding the effect of explanatory variables on each individual outcome variable are well developed and widely applied. However, in most health-related studies, given technological advances and sophisticated methods of obtaining and storing data, there is a growing need to analyze multivariate outcomes jointly while explaining the impact of predictors simultaneously and accounting for all the correlations. In this manuscript, we propose a generalized approach within a Bayesian framework that models the changes in the variation in terms of explanatory variables and captures the correlations between the multivariate continuous outcomes by the inclusion of random effects at both the location and scale levels. We describe the use of a spherical transformation for the correlations between the random location and scale effects in order to apply a separation strategy for prior elicitation while ensuring positive semi-definiteness of the covariance matrix. We present the details of our approach using an example from an ecological momentary assessment study on adolescents. PMID:25409923
Bayesian model selection applied to artificial neural networks used for water resources modeling
NASA Astrophysics Data System (ADS)
Kingston, Greer B.; Maier, Holger R.; Lambert, Martin F.
2008-04-01
Artificial neural networks (ANNs) have proven to be extremely valuable tools in the field of water resources engineering. However, one of the most difficult tasks in developing an ANN is determining the optimum level of complexity required to model a given problem, as there is no formal systematic model selection method. This paper presents a Bayesian model selection (BMS) method for ANNs that provides an objective approach for comparing models of varying complexity in order to select the most appropriate ANN structure. The approach uses Markov Chain Monte Carlo posterior simulations to estimate the evidence in favor of competing models and, in this study, three known methods for doing this are compared in terms of their suitability for being incorporated into the proposed BMS framework for ANNs. However, it is acknowledged that it can be particularly difficult to accurately estimate the evidence of ANN models. Therefore, the proposed BMS approach for ANNs incorporates a further check of the evidence results by inspecting the marginal posterior distributions of the hidden-to-output layer weights, which unambiguously indicate any redundancies in the hidden layer nodes. The fact that this check is available is one of the greatest advantages of the proposed approach over conventional model selection methods, which do not provide such a test and instead rely on the modeler's subjective choice of selection criterion. The advantages of a total Bayesian approach to ANN development, including training and model selection, are demonstrated on two synthetic and one real world water resources case study.
Naznin, Farhana; Currie, Graham; Logan, David; Sarvi, Majid
2016-07-01
Safety is a key concern in the design, operation and development of light rail systems including trams or streetcars as they impose crash risks on road users in terms of crash frequency and severity. The aim of this study is to identify key traffic, transit and route factors that influence tram-involved crash frequencies along tram route sections in Melbourne. A random effects negative binomial (RENB) regression model was developed to analyze crash frequency data obtained from Yarra Trams, the tram operator in Melbourne. The RENB modelling approach can account for spatial and temporal variations within observation groups in panel count data structures by assuming that group specific effects are randomly distributed across locations. The results identify many significant factors affecting tram-involved crash frequency including tram service frequency (2.71), tram stop spacing (-0.42), tram route section length (0.31), tram signal priority (-0.25), general traffic volume (0.18), tram lane priority (-0.15) and ratio of platform tram stops (-0.09). Findings provide useful insights on route section level tram-involved crashes in an urban tram or streetcar operating environment. The method described represents a useful planning tool for transit agencies hoping to improve safety performance. PMID:27035395
Bayesian conditional-independence modeling of the AIDS epidemic in England and Wales
NASA Astrophysics Data System (ADS)
Gilks, Walter R.; De Angelis, Daniela; Day, Nicholas E.
We describe the use of conditional-independence modeling, Bayesian inference and Markov chain Monte Carlo to model and project the HIV-AIDS epidemic in homosexual/bisexual males in England and Wales. Complexity in this analysis arises through selectively missing data, indirectly observed underlying processes, and measurement error. Our emphasis is on presentation and discussion of the concepts, not on the technicalities of this analysis, which can be found elsewhere [D. De Angelis, W.R. Gilks, N.E. Day, Bayesian projection of the acquired immune deficiency syndrome epidemic (with discussion), Applied Statistics, in press].
Simplifying Probability Elicitation and Uncertainty Modeling in Bayesian Networks
Paulson, Patrick R; Carroll, Thomas E; Sivaraman, Chitra; Neorr, Peter A; Unwin, Stephen D; Hossain, Shamina S
2011-04-16
In this paper we contribute two methods that simplify the demands of knowledge elicitation for particular types of Bayesian networks. The first method simplifies the task of providing probabilities when the states that a random variable takes can be described by a new, fully ordered state set in which a state implies all the preceding states. The second method leverages Dempster-Shafer theory of evidence to provide a way for the expert to express the degree of ignorance that they feel about the estimates being provided.
ERIC Educational Resources Information Center
Kessler, Lawrence M.
2013-01-01
In this paper I propose Bayesian estimation of a nonlinear panel data model with a fractional dependent variable (bounded between 0 and 1). Specifically, I estimate a panel data fractional probit model which takes into account the bounded nature of the fractional response variable. I outline estimation under the assumption of strict exogeneity as…
A Bayesian Multi-Level Factor Analytic Model of Consumer Price Sensitivities across Categories
ERIC Educational Resources Information Center
Duvvuri, Sri Devi; Gruca, Thomas S.
2010-01-01
Identifying price sensitive consumers is an important problem in marketing. We develop a Bayesian multi-level factor analytic model of the covariation among household-level price sensitivities across product categories that are substitutes. Based on a multivariate probit model of category incidence, this framework also allows the researcher to…
ERIC Educational Resources Information Center
Story, Roger E.
1996-01-01
Discussion of the use of Latent Semantic Indexing to determine relevancy in information retrieval focuses on statistical regression and Bayesian methods. Topics include keyword searching; a multiple regression model; how the regression model can aid search methods; and limitations of this approach, including complexity, linearity, and…
Reconstructing Constructivism: Causal Models, Bayesian Learning Mechanisms, and the Theory Theory
ERIC Educational Resources Information Center
Gopnik, Alison; Wellman, Henry M.
2012-01-01
We propose a new version of the "theory theory" grounded in the computational framework of probabilistic causal models and Bayesian learning. Probabilistic models allow a constructivist but rigorous and detailed approach to cognitive development. They also explain the learning of both more specific causal hypotheses and more abstract framework…
Model Reduction of a Transient Groundwater-Flow Model for Bayesian Inverse Problems
NASA Astrophysics Data System (ADS)
Boyce, S. E.; Yeh, W. W.
2011-12-01
A Bayesian inverse problem requires many repeated model simulations to characterize an unknown parameter's posterior probability distribution. It is computationally infeasible to solve a Bayesian inverse problem of a discretized groundwater flow model with a high dimension parameter and state space. Model reduction has been shown to reduce the dimension of a groundwater model by several orders of magnitude and is well suited for Bayesian inverse problems. A projection-based model reduction approach is proposed to reduce the parameter and state dimensions of a groundwater model. Previous work has done this by using a greedy algorithm for the selection of parameter vectors that make up a basis and their corresponding steady-state solutions for a state basis. The proposed method extends this idea to include transient models by assembling sequentially through the greedy algorithm the parameter and state projection bases. The method begins with the parameter basis being a single vector that is equal to one or an accepted series of values. A set of state vectors that are solutions to the groundwater model using this parameter vector at appropriate times is called the parameter snapshot set. The appropriate times for the parameter snapshot set are determined by maximizing the set's minimum singular value. This optimization is similar to those used in experimental design for maximizing information. The two bases are made orthonormal by a QR decomposition and applied to the full groundwater model to form a reduced model. The parameter basis is increased with a new parameter vector that maximizes the error between the full model and the reduced model at a set of observation times. The new parameter vector represents where the reduced model is least accurate in representing the original full model. The corresponding parameter snapshot set's appropriate times are found using a greedy algorithm. This sequentially chooses times that have maximum error between the full and
2014-01-01
Background Transmission models can aid understanding of disease dynamics and are useful in testing the efficiency of control measures. The aim of this study was to formulate an appropriate stochastic Susceptible-Infectious-Resistant/Carrier (SIR) model for Salmonella Typhimurium in pigs and thus estimate the transmission parameters between states. Results The transmission parameters were estimated using data from a longitudinal study of three Danish farrow-to-finish pig herds known to be infected. A Bayesian model framework was proposed, which comprised Binomial components for the transition from susceptible to infectious and from infectious to carrier; and a Poisson component for carrier to infectious. Cohort random effects were incorporated into these models to allow for unobserved cohort-specific variables as well as unobserved sources of transmission, thus enabling a more realistic estimation of the transmission parameters. In the case of the transition from susceptible to infectious, the cohort random effects were also time varying. The number of infectious pigs not detected by the parallel testing was treated as unknown, and the probability of non-detection was estimated using information about the sensitivity and specificity of the bacteriological and serological tests. The estimate of the transmission rate from susceptible to infectious was 0.33 [0.06, 1.52], from infectious to carrier was 0.18 [0.14, 0.23] and from carrier to infectious was 0.01 [0.0001, 0.04]. The estimate for the basic reproduction ratio (R0) was 1.91 [0.78, 5.24]. The probability of non-detection was estimated to be 0.18 [0.12, 0.25]. Conclusions The proposed framework for stochastic SIR models was successfully implemented to estimate transmission rate parameters for Salmonella Typhimurium in swine field data. R0 was 1.91, implying that there was dissemination of the infection within pigs of the same cohort. There was significant temporal-cohort variability, especially at the
Modelling the presence of disease under spatial misalignment using Bayesian latent Gaussian models.
Barber, Xavier; Conesa, David; Lladosa, Silvia; López-Quílez, Antonio
2016-01-01
Modelling patterns of the spatial incidence of diseases using local environmental factors has been a growing problem in the last few years. Geostatistical models have become popular lately because they allow estimating and predicting the underlying disease risk and relating it with possible risk factors. Our approach to these models is based on the fact that the presence/absence of a disease can be expressed with a hierarchical Bayesian spatial model that incorporates the information provided by the geographical and environmental characteristics of the region of interest. Nevertheless, our main interest here is to tackle the misalignment problem arising when the locations at which possible covariates are recorded are partially (or totally) different from the observed locations and from those at which we want to predict. As a result, we present two different models depending on whether or not there is uncertainty in the covariates. In both cases, Bayesian inference on the parameters and prediction of presence/absence in new locations are made by considering the model as a latent Gaussian model, which allows the use of the integrated nested Laplace approximation. In particular, the spatial effect is implemented with the stochastic partial differential equation approach. The methodology is evaluated on the presence of Fasciola hepatica in Galicia, a North-West region of Spain. PMID:27087038
NASA Astrophysics Data System (ADS)
Xu, Tianfang; Valocchi, Albert J.
2015-11-01
Numerical groundwater flow and solute transport models are usually subject to model structural error due to simplification and/or misrepresentation of the real system, which raises questions regarding the suitability of conventional least squares regression-based (LSR) calibration. We present a new framework that explicitly describes the model structural error statistically in an inductive, data-driven way. We adopt a fully Bayesian approach that integrates Gaussian process error models into the calibration, prediction, and uncertainty analysis of groundwater flow models. We test the usefulness of the fully Bayesian approach with a synthetic case study of the impact of pumping on surface-ground water interaction. We illustrate through this example that the Bayesian parameter posterior distributions differ significantly from parameters estimated by conventional LSR, which does not account for model structural error. For the latter method, parameter compensation for model structural error leads to biased, overconfident prediction under changing pumping condition. In contrast, integrating Gaussian process error models significantly reduces predictive bias and leads to prediction intervals that are more consistent with validation data. Finally, we carry out a generalized LSR recalibration step to assimilate the Bayesian prediction while preserving mass conservation and other physical constraints, using a full error covariance matrix obtained from Bayesian results. It is found that the recalibrated model achieved lower predictive bias compared to the model calibrated using conventional LSR. The results highlight the importance of explicit treatment of model structural error especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification.
Improving Local and Regional Flood Quantile Estimates Using a Hierarchical Bayesian GEV Model
NASA Astrophysics Data System (ADS)
Ribeiro Lima, C. H.; Lall, U.; Devineni, N.; Troy, T.
2013-12-01
Flood risk management usually relies on local and regional flood frequency analysis, which tends to suffer from lack of data and parameter uncertainties. Here we estimate local and regional Generalized Extreme Value (GEV) distribution parameters in a hierarchical Bayesian framework, which helps reduce uncertainties by pooling more information in the estimation process and provides a simple topology to propagate model and parameter uncertainties to flood quantile estimates. As prior information for the Bayesian model, it is assumed for each site that the GEV location and scale parameters come from independent log-normal distributions, whose mean parameter follows the well known log-log scaling law with the drainage area. The shape parameter for each site is shrunk towards a common mean. Non-informative prior distributions are assumed for the hyperparameters and the MCMC method is used to sample from the posterior distributions. The model is tested using annual maximum series from 20 streamflow gauges located in an 83,000 km² basin in southeastern Brazil. The results show a significant improvement of flood quantile estimates over the traditional GEV model, particularly for sites with few data. For return periods within the range of the data (around 50 years), the Bayesian credible intervals for the flood quantiles are narrower than the classical confidence limits based on the delta method. As the return period increases beyond the range of the data, the confidence limits from the delta method become unreliable and the Bayesian credible intervals provide a way to estimate satisfactory confidence bands for the flood quantiles considering the parameter uncertainties. In order to evaluate the applicability of the proposed hierarchical Bayesian model for flood frequency regional analysis, we estimate flood quantiles for three randomly chosen out-of-sample sites and compare with classical estimates using the index flood method. The posterior distributions of the scaling
Balfer, Jenny; Bajorath, Jürgen
2014-09-22
Supervised machine learning models are widely used in chemoinformatics, especially for the prediction of new active compounds or targets of known actives. Bayesian classification methods are among the most popular machine learning approaches for the prediction of activity from chemical structure. Much work has focused on predicting structure-activity relationships (SARs) on the basis of experimental training data. By contrast, only a few efforts have thus far been made to rationalize the performance of Bayesian or other supervised machine learning models and better understand why they might succeed or fail. In this study, we introduce an intuitive approach for the visualization and graphical interpretation of naïve Bayesian classification models. Parameters derived during supervised learning are visualized and interactively analyzed to gain insights into model performance and identify features that determine predictions. The methodology is introduced in detail and applied to assess Bayesian modeling efforts and predictions on compound data sets of varying structural complexity. Different classification models and features determining their performance are characterized in detail. A prototypic implementation of the approach is provided. PMID:25137527
Berhane, Kiros; Molitor, Nuoo-Ting
2008-01-01
Flexible multilevel models are proposed to allow for cluster-specific smooth estimation of growth curves in a mixed-effects modeling format that includes subject-specific random effects on the growth parameters. Attention is then focused on models that examine between-cluster comparisons of the effects of an ecologic covariate of interest (e.g. air pollution) on nonlinear functionals of growth curves (e.g. maximum rate of growth). A Gibbs sampling approach is used to get posterior mean estimates of nonlinear functionals along with their uncertainty estimates. A second-stage ecologic random-effects model is used to examine the association between a covariate of interest (e.g. air pollution) and the nonlinear functionals. A unified estimation procedure is presented along with its computational and theoretical details. The models are motivated by, and illustrated with, lung function and air pollution data from the Southern California Children's Health Study. PMID:18349036
NASA Astrophysics Data System (ADS)
Tsai, Frank T.-C.; Li, Xiaobao
2008-09-01
This study proposes a Bayesian model averaging (BMA) method to address parameter estimation uncertainty arising from nonuniqueness in parameterization methods. BMA is able to incorporate multiple parameterization methods for prediction through the law of total probability and to obtain an ensemble average of hydraulic conductivity estimates. Two major issues in applying BMA to hydraulic conductivity estimation are discussed. The first problem is using Occam's window in usual BMA applications to measure approximated posterior model probabilities. Occam's window only accepts models in a very narrow range, tending to single out the best method and discard other good methods. We propose a variance window to replace Occam's window to cope with this problem. The second problem is the Kashyap information criterion (KIC) in the approximated posterior model probabilities, which tends to prefer highly uncertain parameterization methods by considering the Fisher information matrix. With sufficient amounts of observation data, the Bayesian information criterion (BIC) is a good approximation and is able to avoid controversial results from using KIC. This study adopts multiple generalized parameterization (GP) methods such as the BMA models to estimate spatially correlated hydraulic conductivity. Numerical examples illustrate the issues of using KIC and Occam's window and show the advantages of using BIC and the variance window in BMA application. Finally, we apply BMA to the hydraulic conductivity estimation of the "1500-foot" sand in East Baton Rouge Parish, Louisiana.
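The averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes equal prior model probabilities, a BIC defined on the -2 log-likelihood scale, and made-up BIC values and per-model estimates. The function name `bma_from_bic` is hypothetical, and the variance window proposed in the paper is not included.

```python
import math


def bma_from_bic(bics, means, variances):
    """Combine per-model estimates with BIC-approximated posterior model weights.

    Assumes equal prior model probabilities, so each weight is proportional
    to exp(-0.5 * (BIC_k - BIC_min)).
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Law of total probability: the BMA mean is the weighted mean of model means.
    mean = sum(w * m for w, m in zip(weights, means))
    # Total variance = within-model variance + between-model variance.
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return weights, mean, within + between


# Illustrative numbers only: three hypothetical parameterization methods.
weights, bma_mean, bma_var = bma_from_bic(
    bics=[100.0, 102.0, 110.0],
    means=[1.0, 2.0, 3.0],
    variances=[0.1, 0.2, 0.3],
)
```

The between-model term is what carries the uncertainty due to nonuniqueness of the parameterization; it vanishes only when all weighted models agree on the estimate.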
Automated parameter estimation for biological models using Bayesian statistical model checking
2015-01-01
Background Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Domain experts usually estimate the values of these parameters by fitting the model to experimental data. Model fitting is usually expressed as an optimization problem that requires minimizing a cost-function which measures some notion of distance between the model and the data. This optimization problem is often solved by combining local and global search methods that tend to perform well for the specific application domain. When some prior information about parameters is available, methods such as Bayesian inference are commonly used for parameter learning. Choosing the appropriate parameter search technique requires detailed domain knowledge and insight into the underlying system. Results Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model. Conclusions We have developed a new algorithmic technique for discovering parameters in complex stochastic models of
NASA Astrophysics Data System (ADS)
Pham, Hai V.; Tsai, Frank T.-C.
2015-09-01
The lack of hydrogeological data and knowledge often results in different propositions (or alternatives) to represent uncertain model components and creates many candidate groundwater models using the same data. Uncertainty of groundwater head prediction may become unnecessarily high. This study introduces an experimental design to identify propositions in each uncertain model component and decrease the prediction uncertainty by reducing conceptual model uncertainty. A discrimination criterion is developed based on posterior model probability that directly uses data to evaluate model importance. Bayesian model averaging (BMA) is used to predict future observation data. The experimental design aims to find the optimal number and location of future observations and the number of sampling rounds such that the desired discrimination criterion is met. Hierarchical Bayesian model averaging (HBMA) is adopted to assess if highly probable propositions can be identified and the conceptual model uncertainty can be reduced by the experimental design. The experimental design is applied to a groundwater study in the Baton Rouge area, Louisiana. We design a new groundwater head observation network based on existing USGS observation wells. The sources of uncertainty that create multiple groundwater models are geological architecture, boundary condition, and fault permeability architecture. All possible design solutions are enumerated using a multi-core supercomputer. Several design solutions are found to achieve an 80%-identifiable groundwater model in 5 years by using six or more existing USGS wells. The HBMA result shows that each highly probable proposition can be identified for each uncertain model component once the discrimination criterion is achieved. The variances of groundwater head predictions are significantly decreased by reducing posterior model probabilities of unimportant propositions.
Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models
NASA Astrophysics Data System (ADS)
Xia, Wei; Dai, Xiao-Xia; Feng, Yuan
2015-12-01
When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated via directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistency with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).
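To make the idea of estimating lognormal parameters by MCMC concrete, here is a minimal random-walk Metropolis sketch. It is a generic illustration under stated assumptions, not the paper's algorithm: the priors are flat on mu and log(sigma), the data are synthetic draws rather than RCS measurements, and the names `log_post` and `metropolis` as well as the step size and burn-in length are choices made for this example.

```python
import math
import random


def log_post(mu, log_sigma, logs):
    """Log posterior (up to a constant) of a lognormal model.

    Assumes flat priors on mu and log(sigma); `logs` holds log(data).
    """
    sigma = math.exp(log_sigma)
    ss = sum((x - mu) ** 2 for x in logs)
    return -len(logs) * log_sigma - ss / (2.0 * sigma ** 2)


def metropolis(data, n_iter=5000, step=0.05, seed=0):
    """Random-walk Metropolis over (mu, log sigma); returns (mu, sigma) draws."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in data]
    mu, ls = 0.0, 0.0
    lp = log_post(mu, ls, logs)
    chain = []
    for _ in range(n_iter):
        mu_new = mu + rng.gauss(0.0, step)
        ls_new = ls + rng.gauss(0.0, step)
        lp_new = log_post(mu_new, ls_new, logs)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            mu, ls, lp = mu_new, ls_new, lp_new
        chain.append((mu, math.exp(ls)))
    return chain


# Synthetic data standing in for measurements (true mu = 1.0, sigma = 0.5).
data_rng = random.Random(42)
data = [data_rng.lognormvariate(1.0, 0.5) for _ in range(200)]

chain = metropolis(data)
kept = chain[1000:]  # discard burn-in draws
mu_hat = sum(m for m, _ in kept) / len(kept)
sigma_hat = sum(s for _, s in kept) / len(kept)
```

Unlike estimates computed directly from sample statistics, the retained draws also give the spread of each parameter, for example through empirical quantiles of `kept`.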
fMRI data analysis with nonstationary noise models: a Bayesian approach.
Luo, Huaien; Puthusserypady, Sadasivan
2007-09-01
The assumption of noise stationarity in the functional magnetic resonance imaging (fMRI) data analysis may lead to the loss of crucial dynamic features of the data and thus result in inaccurate activation detection. In this paper, a Bayesian approach is proposed to analyze the fMRI data with two nonstationary noise models (the time-varying variance noise model and the fractional noise model). The covariance matrices of the time-varying variance noise and the fractional noise after wavelet transform are diagonal matrices. This property is investigated under the Bayesian framework. The Bayesian estimator not only gives an accurate estimate of the weights in the general linear model, but also provides the posterior probability of activation in a voxel and, hence, avoids the limitations (i.e., using only hypothesis testing) of the classical methods. The performance of the proposed Bayesian methods (under the assumption of different noise models) is compared with the ordinary least squares (OLS) and the weighted least squares (WLS) methods. Results from the simulation studies validate the superiority of the proposed approach to the OLS and WLS methods considering the complex noise structures in the fMRI data. PMID:17867354
A Dynamic Bayesian Network Model for the Production and Inventory Control
NASA Astrophysics Data System (ADS)
Shin, Ji-Sun; Takazaki, Noriyuki; Lee, Tae-Hong; Kim, Jin-Il; Lee, Hee-Hyol
In general, the production quantities and delivered goods are changed randomly and then the total stock is also changed randomly. This paper deals with the production and inventory control using the Dynamic Bayesian Network. Bayesian Network is a probabilistic model which represents the qualitative dependence between two or more random variables by the graph structure, and indicates the quantitative relations between individual variables by the conditional probability. The probabilistic distribution of the total stock is calculated through the propagation of the probability on the network. Moreover, an adjusting rule of the production quantities to maintain the probability of a lower limit and a ceiling of the total stock to certain values is shown.
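The propagation of the stock distribution can be sketched with a small discrete example. This is an assumed toy setup rather than the network from the paper: production and delivery are taken to be independent, the stock follows stock + production - delivery with no floor at zero, and all quantities and probabilities are invented for illustration. The function name `next_stock` is hypothetical.

```python
from collections import defaultdict


def next_stock(stock, production, delivery):
    """Propagate a discrete stock distribution one period forward.

    Each argument maps a quantity to its probability. Production and
    delivery are assumed independent of each other and of the stock.
    """
    out = defaultdict(float)
    for s, ps in stock.items():
        for p, pp in production.items():
            for d, pd in delivery.items():
                out[s + p - d] += ps * pp * pd
    return dict(out)


# Illustrative distributions (not taken from the paper).
production = {4: 0.25, 5: 0.5, 6: 0.25}
delivery = {3: 0.25, 5: 0.5, 7: 0.25}

stock = {10: 1.0}  # stock is known exactly at the start
for _ in range(3):
    stock = next_stock(stock, production, delivery)

# Probability of leaving an assumed band [8, 12] after three periods.
p_below = sum(p for s, p in stock.items() if s < 8)
p_above = sum(p for s, p in stock.items() if s > 12)
```

An adjusting rule of the kind the abstract mentions would shift the production distribution until `p_below` and `p_above` reach their target values.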
A Bayesian approach to the semi-analytic model of galaxy formation
NASA Astrophysics Data System (ADS)
Lu, Yu
It is believed that a wide range of physical processes conspire to shape the observed galaxy population, but their detailed interactions remain uncertain. The semi-analytic model (SAM) of galaxy formation uses multi-dimensional parameterizations of the physical processes of galaxy formation and provides a tool to constrain these underlying physical interactions. Because of the high dimensionality and large uncertainties in the model, the parametric problem of galaxy formation can be profitably tackled with a Bayesian-inference based approach, which allows one to constrain theory with data in a statistically rigorous way. In this thesis, I present a newly developed method to build SAM upon the framework of Bayesian inference. I show that, aided by advanced Markov-Chain Monte-Carlo algorithms, the method has the power to efficiently combine information from diverse data sources, rigorously establish confidence bounds on model parameters, and provide powerful probability-based methods for hypothesis testing. Using various data sets (stellar mass function, conditional stellar mass function, K-band luminosity function, and cold gas mass functions) of galaxies in the local Universe, I carry out a series of Bayesian model inferences. First, the results show that SAM contains huge degeneracies among its parameters, indicating that some of the conclusions drawn previously with the conventional approach may not be truly valid and need to be revisited with the Bayesian approach. Second, some of the degeneracy of the model can be broken by adopting multiple data sets that constrain different aspects of the galaxy population. Third, the inferences reveal that the model has difficulty simultaneously explaining some important observational results, suggesting that some key physics governing the evolution of star formation and feedback may still be missing from the model. These analyses show clearly that the Bayesian inference based SAM can be used to perform systematic and statistically
Likelihood-free Bayesian computation for structural model calibration: a feasibility study
NASA Astrophysics Data System (ADS)
Jin, Seung-Seop; Jung, Hyung-Jo
2016-04-01
Finite element (FE) model updating is often used to associate FE models with corresponding existing structures for condition assessment. FE model updating is an inverse problem and is prone to be ill-posed and ill-conditioned when there are many errors and uncertainties in both an FE model and its corresponding measurements. In this case, it is important to quantify these uncertainties properly. Bayesian FE model updating is one of the well-known methods to quantify parameter uncertainty by updating our prior belief on the parameters with the available measurements. In Bayesian inference, the likelihood plays a central role in summarizing the overall residuals between model predictions and corresponding measurements. Therefore, the likelihood should be carefully chosen to reflect the characteristics of the residuals. It is generally known that very little or no information is available regarding the statistical characteristics of the residuals. In most cases, the likelihood is assumed to be an independent, identically distributed Gaussian distribution with zero mean and constant variance. However, this assumption may cause biased and over- or underestimated parameter estimates, so that the uncertainty quantification and prediction are questionable. To alleviate the potential misuse of an inadequate likelihood, this study introduced approximate Bayesian computation (i.e., likelihood-free Bayesian inference), which relaxes the need for an explicit likelihood by analyzing the behavior similarities between model predictions and measurements. We performed FE model updating based on likelihood-free Markov chain Monte Carlo (MCMC) without using the likelihood. Based on the result of the numerical study, we observed that the likelihood-free Bayesian computation can quantify the updating parameters correctly, and its predictive capability for measurements not used in calibration is also secured.
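The likelihood-free idea can be illustrated with the simplest ABC variant, rejection sampling, which is swapped in here for the likelihood-free MCMC used in the study. Everything in this sketch is an assumption made for illustration: a one-parameter spring (displacement = load / stiffness) stands in for the FE model, the prior on the stiffness is uniform on [5, 20], and the tolerance `eps`, noise level and function names are arbitrary choices.

```python
import random


def simulate(k, loads, rng, noise_sd=0.02):
    """Toy 'structural model': noisy displacements of a spring of stiffness k."""
    return [f / k + rng.gauss(0.0, noise_sd) for f in loads]


def distance(a, b):
    """Root-mean-square discrepancy between simulated and measured responses."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def abc_rejection(observed, loads, n_draws=20000, eps=0.05, seed=0):
    """Keep prior draws whose simulated response lies within eps of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        k = rng.uniform(5.0, 20.0)  # assumed uniform prior on stiffness
        if distance(simulate(k, loads, rng), observed) < eps:
            accepted.append(k)
    return accepted


loads = [1.0, 2.0, 3.0, 4.0]
true_k = 10.0
observed = [f / true_k for f in loads]  # synthetic, noise-free measurements

posterior = abc_rejection(observed, loads)
post_mean = sum(posterior) / len(posterior)
```

No likelihood is ever evaluated: the accepted draws approximate the posterior only through the similarity between simulated and measured responses, and a smaller `eps` tightens the approximation at the cost of fewer accepted draws.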
de los Campos, Gustavo; Gianola, Daniel
2007-01-01
Multivariate linear models are increasingly important in quantitative genetics. In high dimensional specifications, factor analysis (FA) may provide an avenue for structuring (co)variance matrices, thus reducing the number of parameters needed for describing (co)dispersion. We describe how FA can be used to model genetic effects in the context of a multivariate linear mixed model. An orthogonal common factor structure is used to model genetic effects under Gaussian assumption, so that the marginal likelihood is multivariate normal with a structured genetic (co)variance matrix. Under standard prior assumptions, all fully conditional distributions have closed form, and samples from the joint posterior distribution can be obtained via Gibbs sampling. The model and the algorithm developed for its Bayesian implementation were used to describe five repeated records of milk yield in dairy cattle, and a one common FA model was compared with a standard multiple trait model. The Bayesian Information Criterion favored the FA model. PMID:17897592
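The parameter saving from a factor analytic structure can be written out briefly. The notation below is assumed for illustration (G for the genetic covariance matrix of t traits, Lambda for the loadings on q common factors, Psi for the diagonal matrix of trait-specific variances); it is the standard orthogonal factor decomposition, not a formula quoted from the paper.

```latex
\[
  \mathbf{G} = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi},
  \qquad
  \boldsymbol{\Lambda} \in \mathbb{R}^{t \times q},\quad
  \boldsymbol{\Psi} = \operatorname{diag}(\psi_1, \dots, \psi_t)
\]
\[
  \underbrace{\tfrac{t(t+1)}{2}}_{\text{unstructured}} = 15
  \quad\text{versus}\quad
  \underbrace{tq + t}_{\text{one-factor FA}} = 10
  \qquad (t = 5,\ q = 1)
\]
```

With five repeated records and one common factor, as in the milk yield example, the structured matrix needs 10 parameters instead of the 15 of an unstructured multiple trait model.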
Bayesian Modeling of Population Variability -- Practical Guidance and Pitfalls
Dana L. Kelly; Corwin L. Atwood
2008-05-01
With the advent of easy-to-use open-source software for Markov chain Monte Carlo (MCMC) simulation, hierarchical Bayesian analysis is gaining in popularity. This paper presents practical guidance for hierarchical Bayes analysis of typical problems in probabilistic safety assessment (PSA). The guidance is related to choosing parameterizations that accelerate convergence of the MCMC sampling and to illustrating the potential sensitivity of the results to the functional form chosen for the first-stage prior. This latter issue has significant ramifications because the mean of the average population variability curve (PVC) from hierarchical Bayes (or the mean of the point estimate distribution from empirical Bayes) can be very sensitive to this choice in cases where variability is large. Numerical examples are provided to illustrate the issues discussed.
Using Bayesian Model Selection to Characterize Neonatal EEG Recordings
NASA Astrophysics Data System (ADS)
Mitchell, Timothy J.
2009-12-01
The brains of premature infants must undergo significant maturation outside of the womb and are thus particularly susceptible to injury. Electroencephalographic (EEG) recordings are an important diagnostic tool in determining if a newborn's brain is functioning normally or if injury has occurred. However, interpreting the recordings is difficult and requires the skills of a trained electroencephalographer. Because these EEG specialists are rare, an automated interpretation of newborn EEG recordings would increase access to an important diagnostic tool for physicians. To automate this procedure, we employ Bayesian probability theory to compute the posterior probability for the EEG features of interest and use the results in a program designed to mimic EEG specialists. Specifically, we will be identifying waveforms of varying frequency and amplitude, as well as periods of flat recordings where brain activity is minimal.
Bayesian state space models for dynamic genetic network construction across multiple tissues.
Liang, Yulan; Kelemen, Arpad
2016-08-01
Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases, and estimating the dynamic changes in temporal correlations and non-stationarity is key to this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix as time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Markov chain Monte Carlo and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes. PMID:27343475
Bayesian Analysis of Structural Equation Models with Nonlinear Covariates and Latent Variables
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lee, Sik-Yum
2006-01-01
In this article, we formulate a nonlinear structural equation model (SEM) that can accommodate covariates in the measurement equation and nonlinear terms of covariates and exogenous latent variables in the structural equation. The covariates can come from continuous or discrete distributions. A Bayesian approach is developed to analyze the…
Lin, Lin; Chan, Cliburn; West, Mike
2016-01-01
We discuss the evaluation of subsets of variables for the discriminative evidence they provide in multivariate mixture modeling for classification. The novel development of Bayesian classification analysis presented is partly motivated by problems of design and selection of variables in biomolecular studies, particularly involving widely used assays of large-scale single-cell data generated using flow cytometry technology. For such studies and for mixture modeling generally, we define discriminative analysis that overlays fitted mixture models using a natural measure of concordance between mixture component densities, and define an effective and computationally feasible method for assessing and prioritizing subsets of variables according to their roles in discrimination of one or more mixture components. We relate the new discriminative information measures to Bayesian classification probabilities and error rates, and exemplify their use in Bayesian analysis of Dirichlet process mixture models fitted via Markov chain Monte Carlo methods as well as using a novel Bayesian expectation-maximization algorithm. We present a series of theoretical and simulated data examples to fix concepts and exhibit the utility of the approach, and compare with prior approaches. We demonstrate application in the context of automatic classification and discriminative variable selection in high-throughput systems biology using large flow cytometry datasets. PMID:26040910
A General and Flexible Approach to Estimating the Social Relations Model Using Bayesian Methods
ERIC Educational Resources Information Center
Ludtke, Oliver; Robitzsch, Alexander; Kenny, David A.; Trautwein, Ulrich
2013-01-01
The social relations model (SRM) is a conceptual, methodological, and analytical approach that is widely used to examine dyadic behaviors and interpersonal perception within groups. This article introduces a general and flexible approach to estimating the parameters of the SRM that is based on Bayesian methods using Markov chain Monte Carlo…
ERIC Educational Resources Information Center
Tchumtchoua, Sylvie; Dey, Dipak K.
2012-01-01
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
Hierarchical Bayesian Model (HBM) - Derived Estimates of Air Quality for 2007: Annual Report
This report describes EPA's Hierarchical Bayesian model generated (HBM) estimates of ozone (O_{3}) and fine particulate matter (PM_{2.5}, particles with aerodynamic diameter < 2.5 microns) concentrations throughout the continental United States during the 2007 calen...
Bayesian Inference for Growth Mixture Models with Latent Class Dependent Missing Data
ERIC Educational Resources Information Center
Lu, Zhenqiu Laura; Zhang, Zhiyong; Lubke, Gitta
2011-01-01
"Growth mixture models" (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class…
Bayesian Structural Equation Modeling: A More Flexible Representation of Substantive Theory
ERIC Educational Resources Information Center
Muthen, Bengt; Asparouhov, Tihomir
2012-01-01
This article proposes a new approach to factor analysis and structural equation modeling using Bayesian analysis. The new approach replaces parameter specifications of exact zeros with approximate zeros based on informative, small-variance priors. It is argued that this produces an analysis that better reflects substantive theories. The proposed…
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2006 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2006 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2001 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2001 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2003 – Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2003 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2005 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2005 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2002 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2002 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
A Robust Bayesian Approach for Structural Equation Models with Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Xia, Ye-Mao
2008-01-01
In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Cai, Jing-Heng
2010-01-01
Analysis of ordered binary and unordered binary data has received considerable attention in social and psychological research. This article introduces a Bayesian approach, which has several nice features in practical applications, for analyzing nonlinear structural equation models with dichotomous data. We demonstrate how to use the software…
Pretense, Counterfactuals, and Bayesian Causal Models: Why What Is Not Real Really Matters
ERIC Educational Resources Information Center
Weisberg, Deena S.; Gopnik, Alison
2013-01-01
Young children spend a large portion of their time pretending about non-real situations. Why? We answer this question by using the framework of Bayesian causal models to argue that pretending and counterfactual reasoning engage the same component cognitive abilities: disengaging with current reality, making inferences about an alternative…
ERIC Educational Resources Information Center
Bekele, Rahel; McPherson, Maggie
2011-01-01
This research work presents a Bayesian Performance Prediction Model that was created in order to determine the strength of personality traits in predicting the level of mathematics performance of high school students in Addis Ababa. It is an automated tool that can be used to collect information from students for the purpose of effective group…
The Bayesian Evaluation of Categorization Models: Comment on Wills and Pothos (2012)
ERIC Educational Resources Information Center
Vanpaemel, Wolf; Lee, Michael D.
2012-01-01
Wills and Pothos (2012) reviewed approaches to evaluating formal models of categorization, raising a series of worthwhile issues, challenges, and goals. Unfortunately, in discussing these issues and proposing solutions, Wills and Pothos (2012) did not consider Bayesian methods in any detail. This means not only that their review excludes a major…
ERIC Educational Resources Information Center
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.
2013-01-01
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2004 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2004 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM) - Derived Estimates of Air Quality for 2008: Annual Report
This report describes EPA’s Hierarchical Bayesian model generated (HBM) estimates of ozone (O_{3}) and fine particulate matter (PM_{2.5}, particles with aerodynamic diameter < 2.5 microns) concentrations throughout the continental United States during the 2007 ca...
Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik
2016-07-01
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]. PMID:27235697
A Bayesian modification to the Jelinski-Moranda software reliability growth model
NASA Technical Reports Server (NTRS)
Littlewood, B.; Sofer, A.
1983-01-01
The Jelinski-Moranda (JM) model for software reliability was examined. It is suggested that a major reason for the poor results given by this model is the poor performance of the maximum likelihood method (ML) of parameter estimation. A reparameterization and Bayesian analysis, involving a slight modelling change, are proposed. It is shown that this new Bayesian-Jelinski-Moranda model (BJM) is mathematically quite tractable, and several metrics of interest to practitioners are obtained. The BJM and JM models are compared by using several sets of real software failure data collected and in all cases the BJM model gives superior reliability predictions. A change in the assumption which underlay both models to present the debugging process more accurately is discussed.
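To make the maximum likelihood step mentioned above concrete, here is a minimal sketch of the log-likelihood of the standard JM formulation. It assumes the usual textbook form, in which the i-th inter-failure time is exponential with rate phi * (N - i + 1), N being the initial number of faults. The function names are hypothetical, and this is not the reparameterized BJM model proposed in the paper.

```python
import math


def jm_log_likelihood(n_faults, phi, interfailure_times):
    """Log-likelihood of the standard Jelinski-Moranda model.

    Assumption: the i-th inter-failure time (i = 1, 2, ...) is exponential
    with rate phi * (n_faults - i + 1).
    """
    total = 0.0
    for i, t in enumerate(interfailure_times, start=1):
        remaining = n_faults - i + 1
        if remaining <= 0:
            # More failures observed than faults assumed: impossible under JM.
            return -math.inf
        rate = phi * remaining
        total += math.log(rate) - rate * t
    return total


def jm_phi_mle(n_faults, interfailure_times):
    """Closed-form ML estimate of phi when the number of faults is held fixed."""
    exposure = sum(
        (n_faults - i + 1) * t
        for i, t in enumerate(interfailure_times, start=1)
    )
    return len(interfailure_times) / exposure
```

In a full ML fit, N would also be estimated, for example by profiling `jm_log_likelihood` over candidate integer values of N with `phi` set to `jm_phi_mle` for each.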
NASA Astrophysics Data System (ADS)
Chen, X.; Hao, Z.; Devineni, N.; Lall, U.
2013-09-01
A Hierarchical Bayesian model for forecasting regional summer rainfall and streamflow season-ahead using exogenous climate variables for East Central China is presented. The model provides estimates of the posterior forecasted probability distribution for 12 rainfall and 2 streamflow stations considering parameter uncertainty, and cross-site correlation. The model has a multilevel structure with regression coefficients modeled from a common multivariate normal distribution, resulting in partial pooling of information across multiple stations and better representation of parameter and posterior distribution uncertainty. Covariance structure of the residuals across stations is explicitly modeled. Model performance is tested under leave-10-out cross-validation. Frequentist and Bayesian performance metrics used include Receiver Operating Characteristic, Reduction of Error, Coefficient of Efficiency, Rank Probability Skill Scores, and coverage by posterior credible intervals. The ability of the model to reliably forecast regional summer rainfall and streamflow season-ahead offers potential for developing adaptive water risk management strategies.
NASA Astrophysics Data System (ADS)
Chen, X.; Hao, Z.; Devineni, N.; Lall, U.
2014-04-01
A Hierarchical Bayesian model is presented for one season-ahead forecasts of summer rainfall and streamflow using exogenous climate variables for east central China. The model provides estimates of the posterior forecasted probability distribution for 12 rainfall and 2 streamflow stations considering parameter uncertainty, and cross-site correlation. The model has a multi-level structure with regression coefficients modeled from a common multi-variate normal distribution resulting in partial pooling of information across multiple stations and better representation of parameter and posterior distribution uncertainty. Covariance structure of the residuals across stations is explicitly modeled. Model performance is tested under leave-10-out cross-validation. Frequentist and Bayesian performance metrics used include receiver operating characteristic, reduction of error, coefficient of efficiency, rank probability skill scores, and coverage by posterior credible intervals. The ability of the model to reliably forecast season-ahead regional summer rainfall and streamflow offers potential for developing adaptive water risk management strategies.
An Application of Bayesian Approach in Modeling Risk of Death in an Intensive Care Unit
Wong, Rowena Syn Yin; Ismail, Noor Azina
2016-01-01
Background and Objectives: There are not many studies that attempt to model intensive care unit (ICU) risk of death in developing countries, especially in South East Asia. The aim of this study was to propose and describe application of a Bayesian approach in modeling in-ICU deaths in a Malaysian ICU. Methods: This was a prospective study in a mixed medical-surgery ICU in a multidisciplinary tertiary referral hospital in Malaysia. Data collection included variables that were defined in Acute Physiology and Chronic Health Evaluation IV (APACHE IV) model. Bayesian Markov Chain Monte Carlo (MCMC) simulation approach was applied in the development of four multivariate logistic regression predictive models for the ICU, where the main outcome measure was in-ICU mortality risk. The performance of the models was assessed through overall model fit, discrimination and calibration measures. Results from the Bayesian models were also compared against results obtained using frequentist maximum likelihood method. Results: The study involved 1,286 consecutive ICU admissions between January 1, 2009 and June 30, 2010, of which 1,111 met the inclusion criteria. Patients who were admitted to the ICU were generally younger, predominantly male, with low co-morbidity load and mostly under mechanical ventilation. The overall in-ICU mortality rate was 18.5% and the overall mean Acute Physiology Score (APS) was 68.5. All four models exhibited good discrimination, with area under receiver operating characteristic curve (AUC) values approximately 0.8. Calibration was acceptable (Hosmer-Lemeshow p-values > 0.05) for all models, except for model M3. Model M1 was identified as the model with the best overall performance in this study. Conclusion: Four prediction models were proposed, where the best model was chosen based on its overall performance in this study. This study has also demonstrated the promising potential of the Bayesian MCMC approach as an alternative in the analysis and modeling of
Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy
Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre
2013-01-01
Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached 100% versus 67% and specificity 73% versus 95% using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
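For readers unfamiliar with the Youden index used above to pick the operating points, the following sketch applies its standard definition, J = sensitivity + specificity - 1, to the sensitivity and specificity values reported in the abstract. The resulting J values are computed here for illustration and are not stated in the abstract; the variable names are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs as reported in the abstract.
reported = {
    "Bayesian network": (1.00, 0.73),
    "logistic regression": (0.67, 0.95),
}

# J for each method; these derived values are not given in the abstract.
youden = {name: youden_index(se, sp) for name, (se, sp) in reported.items()}
```

On an ROC curve, the threshold with the highest J is the point farthest above the chance diagonal, which is why it is a common choice of cut-off.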
Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling?
Vrugt, Jasper A; Robinson, Bruce A; Ter Braak, Cajo J F; Gupta, Hoshin V
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Bayesian model selection validates a biokinetic model for zirconium processing in humans
2012-01-01
Background In radiation protection, biokinetic models for zirconium processing are of crucial importance in dose estimation and further risk analysis for humans exposed to this radioactive substance. They provide limiting values of detrimental effects and build the basis for applications in internal dosimetry, the prediction for radioactive zirconium retention in various organs as well as retrospective dosimetry. Multi-compartmental models are the tool of choice for simulating the processing of zirconium. Although easily interpretable, determining the exact compartment structure and interaction mechanisms is generally daunting. In the context of observing the dynamics of multiple compartments, Bayesian methods provide efficient tools for model inference and selection. Results We are the first to apply a Markov chain Monte Carlo approach to compute Bayes factors for the evaluation of two competing models for zirconium processing in the human body after ingestion. Based on in vivo measurements of human plasma and urine levels we were able to show that a recently published model is superior to the standard model of the International Commission on Radiological Protection. The Bayes factors were estimated by means of the numerically stable thermodynamic integration in combination with a recently developed copula-based Metropolis-Hastings sampler. Conclusions In contrast to the standard model the novel model predicts lower accretion of zirconium in bones. This results in lower levels of noxious doses for exposed individuals. Moreover, the Bayesian approach allows for retrospective dose assessment, including credible intervals for the initially ingested zirconium, in a significantly more reliable fashion than previously possible. All methods presented here are readily applicable to many modeling tasks in systems biology. PMID:22863152
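The thermodynamic integration idea mentioned above can be illustrated on a toy problem. The sketch below rests on the standard identity that the log marginal likelihood equals the integral, over temperatures t from 0 to 1, of the expected log-likelihood under the power posterior. It assumes a deliberately simple conjugate model (one observation y ~ N(theta, 1) with prior theta ~ N(0, 1)) so that each expectation is available in closed form. In a real analysis such as the one described, those expectations would be estimated from MCMC samples. The model and all function names are illustrative assumptions, not the biokinetic models or the sampler of the paper.

```python
import math


def expected_log_lik(y, t):
    """E[log p(y | theta)] under the power posterior at temperature t.

    Toy model: y ~ N(theta, 1), theta ~ N(0, 1). The power posterior is
    N(t * y / (1 + t), 1 / (1 + t)), so the expectation is analytic.
    """
    mean = t * y / (1.0 + t)
    var = 1.0 / (1.0 + t)
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * ((y - mean) ** 2 + var)


def log_evidence_ti(y, n_temps=201):
    """Trapezoidal thermodynamic integration over an even temperature ladder."""
    temps = [k / (n_temps - 1) for k in range(n_temps)]
    vals = [expected_log_lik(y, t) for t in temps]
    return sum(
        0.5 * (vals[k] + vals[k + 1]) * (temps[k + 1] - temps[k])
        for k in range(n_temps - 1)
    )


def log_evidence_exact(y):
    """Exact log marginal likelihood: marginally, y ~ N(0, 2)."""
    return -0.5 * math.log(2.0 * math.pi * 2.0) - y * y / 4.0
```

A log Bayes factor between two models is then the difference of their log evidences, each obtained by its own temperature integration.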
NASA Astrophysics Data System (ADS)
Robertson, D. E.; Wang, Q. J.; Malano, H.; Etchells, T.
2009-02-01
For models to be useful, they need to adequately describe the systems they represent. The probabilistic nature of Bayesian network models has traditionally meant that model validation is difficult. In this paper we present a process to validate Inteca-Farm, a Bayesian network model of farm irrigation that we described in the first paper of this series. We assessed three aspects of the quality of model predictions, namely, bias, accuracy, and skill, for the two variables for which validation data are available directly or indirectly. We also examined model predictions for any systematic errors. The validation results show that the bias and accuracy of the two validated variables are within acceptable tolerances and that systematic errors are minimal. This suggests that Inteca-Farm is a plausible representation of the farm irrigation system in the Shepparton Irrigation Region of northern Victoria, Australia.
Bayesian spatio-temporal modeling of particulate matter concentrations in Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Manga, Edna; Awang, Norhashidah
2016-06-01
This article presents an application of a Bayesian spatio-temporal Gaussian process (GP) model on particulate matter concentrations from Peninsular Malaysia. We analyze daily PM10 concentration levels from 35 monitoring sites in June and July 2011. The spatiotemporal model set in a Bayesian hierarchical framework allows for inclusion of informative covariates, meteorological variables and spatiotemporal interactions. Posterior density estimates of the model parameters are obtained by Markov chain Monte Carlo methods. Preliminary data analysis indicates that information on PM10 levels at sites classified as industrial locations could explain part of the space-time variations. We include the site-type indicator in our modeling efforts. Results of the parameter estimates for the fitted GP model show significant spatio-temporal structure and a positive effect of the location-type explanatory variable. We also compute some validation criteria for the out-of-sample sites that show the adequacy of the model for predicting PM10 at unmonitored sites.
Bayesian Model Selection in Complex Linear Systems, as Illustrated in Genetic Association Studies
Wen, Xiaoquan
2013-01-01
Summary Motivated by examples from genetic association studies, this paper considers the model selection problem in a general complex linear model system and in a Bayesian framework. We discuss formulating model selection problems and incorporating context-dependent a priori information through different levels of prior specifications. We also derive analytic Bayes factors and their approximations to facilitate model selection and discuss their theoretical and computational properties. We demonstrate our Bayesian approach based on an implemented Markov Chain Monte Carlo (MCMC) algorithm in simulations and a real data application of mapping tissue-specific eQTLs. Our novel results on Bayes factors provide a general framework to perform efficient model comparisons in complex linear model systems. PMID:24350677
Drovandi, Christopher C; McCutchan, Roy A
2016-06-01
In this article we present a new method for performing Bayesian parameter inference and model choice for low-count time series models with intractable likelihoods. The method involves incorporating an alive particle filter within a sequential Monte Carlo (SMC) algorithm to create a novel exact-approximate algorithm, which we refer to as alive SMC^{2}. The advantages of this approach over competing methods are that it is naturally adaptive, it does not involve between-model proposals required in reversible jump Markov chain Monte Carlo, and does not rely on potentially rough approximations. The algorithm is demonstrated on Markov process and integer autoregressive moving average models applied to real biological datasets of hospital-acquired pathogen incidence, animal health time series, and the cumulative number of prion disease cases in mule deer. PMID:26584211
Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets
2015-01-01
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user’s own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery. PMID:25994950
NASA Astrophysics Data System (ADS)
Tsai, Frank T.-C.; Elshall, Ahmed S.
2013-09-01
Analysts are often faced with competing propositions for each uncertain model component. How can we judge that we select a correct proposition(s) for an uncertain model component out of numerous possible propositions? We introduce the hierarchical Bayesian model averaging (HBMA) method as a multimodel framework for uncertainty analysis. The HBMA allows for segregating, prioritizing, and evaluating different sources of uncertainty and their corresponding competing propositions through a hierarchy of BMA models that forms a BMA tree. We apply the HBMA to conduct uncertainty analysis on the reconstructed hydrostratigraphic architectures of the Baton Rouge aquifer-fault system, Louisiana. Due to uncertainty in model data, structure, and parameters, multiple possible hydrostratigraphic models are produced and calibrated as base models. The study considers four sources of uncertainty. With respect to data uncertainty, the study considers two calibration data sets. With respect to model structure, the study considers three different variogram models, two geological stationarity assumptions and two fault conceptualizations. The base models are produced following a combinatorial design to allow for uncertainty segregation. Thus, these four uncertain model components with their corresponding competing model propositions result in 24 base models. The results show that the systematic dissection of the uncertain model components along with their corresponding competing propositions allows for detecting the robust model propositions and the major sources of uncertainty.
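The combinatorial design described above can be sketched by enumerating every combination of the competing propositions: 2 calibration data sets x 3 variogram models x 2 stationarity assumptions x 2 fault conceptualizations = 24 base models. The proposition labels below are hypothetical placeholders, since the abstract does not name the individual propositions.

```python
from itertools import product

# Counts follow the abstract; the labels themselves are placeholders.
components = {
    "calibration_data": ["data_set_1", "data_set_2"],
    "variogram": ["variogram_1", "variogram_2", "variogram_3"],
    "stationarity": ["assumption_1", "assumption_2"],
    "fault": ["conceptualization_1", "conceptualization_2"],
}

# One base model per combination of propositions (full factorial design).
base_models = [
    dict(zip(components, combo)) for combo in product(*components.values())
]
```

Because every proposition appears together with every combination of the others, models can later be grouped by any single component, which is what allows the contribution of each uncertainty source to be segregated.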
Lessons Learned from a Past Series of Bayesian Model Averaging studies for Soil/Plant Models
NASA Astrophysics Data System (ADS)
Nowak, Wolfgang; Wöhling, Thomas; Schöniger, Anneli
2015-04-01
In this study we evaluate the lessons learned about modelling soil/plant systems from analyzing evapotranspiration data, soil moisture and leaf area index. The data were analyzed with advanced tools from the area of Bayesian Model Averaging, model ranking and Bayesian Model Selection. We have generated a large variety of model conceptualizations by sampling random parameter sets from the vegetation components of the CERES, SUCROS, GECROS, and SPASS models and a common model for soil water movement via Monte-Carlo simulations. We used data from one vegetation period of winter wheat at a field site in Nellingen, Germany. The data set includes soil moisture, actual evapotranspiration (ETa) from an eddy covariance tower, and leaf-area index (LAI). The focus of data analysis was on how one can do model ranking and model selection. Further analysis steps included the predictive reliability of different soil/plant models calibrated on different subsets of the available data. Our main conclusion is that model selection between different competing soil-plant models remains a large challenge, because (1) different data types and their combinations favor different models, because competing models are more or less good at simulating the coupling processes between the various compartments and their states, (2) singular events (such as the evolution of LAI during plant senescence) can dominate an entire time series, and long time series can be represented well by the few data values where the models disagree most, (3) the different data types differ in their discriminating power for model selection, (4) the level of noise present in ETa and LAI data, and the level of systematic model bias through simplifications of the complex system (e.g., assuming a few internally homogeneous soil layers) substantially reduce the confidence in model ranking and model selection, (5) none of the models withstands a hypothesis test against the available data, (6) even the assumed level of measurement
A Bayesian approach for inducing sparsity in generalized linear models with multi-category response
2015-01-01
Background: The dimension and complexity of high-throughput gene expression data create many challenges for downstream analysis. Several approaches exist to reduce the number of variables with respect to small sample sizes. In this study, we utilized the Generalized Double Pareto (GDP) prior to induce sparsity in a Bayesian Generalized Linear Model (GLM) setting. The approach was evaluated using a publicly available microarray dataset containing 99 samples corresponding to four different prostate cancer subtypes. Results: A hierarchical Sparse Bayesian GLM using GDP prior (SBGG) was developed to take into account the progressive nature of the response variable. We obtained an average overall classification accuracy between 82.5% and 94%, which was higher than Support Vector Machine, Random Forest or a Sparse Bayesian GLM using double exponential priors. Additionally, SBGG outperforms the other 3 methods in correctly identifying pre-metastatic stages of cancer progression, which can prove extremely valuable for therapeutic and diagnostic purposes. Importantly, using Geneset Cohesion Analysis Tool, we found that the top 100 genes produced by SBGG had an average functional cohesion p-value of 2.0E-4 compared to 0.007 to 0.131 produced by the other methods. Conclusions: Using GDP in a Bayesian GLM model applied to cancer progression data results in better subclass prediction. In particular, the method identifies pre-metastatic stages of prostate cancer with substantially better accuracy and produces more functionally relevant gene sets. PMID:26423345
Dynamic causal modelling of electrographic seizure activity using Bayesian belief updating.
Cooray, Gerald K; Sengupta, Biswa; Douglas, Pamela K; Friston, Karl
2016-01-15
Seizure activity in EEG recordings can persist for hours with seizure dynamics changing rapidly over time and space. To characterise the spatiotemporal evolution of seizure activity, large data sets often need to be analysed. Dynamic causal modelling (DCM) can be used to estimate the synaptic drivers of cortical dynamics during a seizure; however, the requisite (Bayesian) inversion procedure is computationally expensive. In this note, we describe a straightforward procedure, within the DCM framework, that provides efficient inversion of seizure activity measured with non-invasive and invasive physiological recordings; namely, EEG/ECoG. We describe the theoretical background behind a Bayesian belief updating scheme for DCM. The scheme is tested on simulated and empirical seizure activity (recorded both invasively and non-invasively) and compared with standard Bayesian inversion. We show that the Bayesian belief updating scheme provides similar estimates of time-varying synaptic parameters, compared to standard schemes, indicating no significant qualitative change in accuracy. The difference in variance explained was small (less than 5%). The updating method was substantially more efficient, taking approximately 5-10 min compared to approximately 1-2 h. Moreover, the setup of the model under the updating scheme allows for a clear specification of how neuronal variables fluctuate over separable timescales. This method now allows us to investigate the effect of fast (neuronal) activity on slow fluctuations in (synaptic) parameters, paving a way forward to understand how seizure activity is generated. PMID:26220742
Toni, Tina; Welch, David; Strelkowa, Natalja; Ipsen, Andreas; Stumpf, Michael P.H.
2008-01-01
Approximate Bayesian computation (ABC) methods can be used to evaluate posterior distributions without having to calculate likelihoods. In this paper, we discuss and apply an ABC method based on sequential Monte Carlo (SMC) to estimate parameters of dynamical models. We show that ABC SMC provides information about the inferability of parameters and model sensitivity to changes in parameters, and tends to perform better than other ABC approaches. The algorithm is applied to several well-known biological systems, for which parameters and their credible intervals are inferred. Moreover, we develop ABC SMC as a tool for model selection; given a range of different mathematical descriptions, ABC SMC is able to choose the best model using the standard Bayesian model selection apparatus. PMID:19205079
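The core idea of evaluating a posterior without calculating a likelihood can be illustrated with the simplest ABC variant, plain rejection sampling. This is not the sequential Monte Carlo scheme of the paper, only a sketch of the principle it builds on. The toy model (a Gaussian with unknown mean), the flat prior, the summary statistic, and the tolerance are all assumptions chosen for illustration.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary statistic lands near the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from an assumed flat prior
        # Simulate data from the model; the likelihood itself is never evaluated.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        simulated_mean = sum(simulated) / n_obs
        if abs(simulated_mean - observed_mean) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = abc_rejection(observed_mean=2.0, n_obs=20, n_draws=20000, tolerance=0.1, rng=rng)
posterior_mean = sum(samples) / len(samples)
print(len(samples), round(posterior_mean, 2))
```

The accepted draws approximate the posterior of the parameter. ABC SMC improves on this by passing a population of draws through a sequence of decreasing tolerances instead of sampling from the prior every time.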
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models
Daunizeau, J.; Friston, K.J.; Kiebel, S.J.
2009-01-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power. PMID:19862351
Modelling household finances: A Bayesian approach to a multivariate two-part model
Brown, Sarah; Ghosh, Pulak; Su, Li; Taylor, Karl
2016-01-01
We contribute to the empirical literature on household finances by introducing a Bayesian multivariate two-part model, which has been developed to further our understanding of household finances. Our flexible approach allows for the potential interdependence between the holding of assets and liabilities at the household level and also encompasses a two-part process to allow for differences in the influences on asset or liability holding and on the respective amounts held. Furthermore, the framework is dynamic in order to allow for persistence in household finances over time. Our findings endorse the joint modelling approach and provide evidence supporting the importance of dynamics. In addition, we find that certain independent variables exert different influences on the binary and continuous parts of the model thereby highlighting the flexibility of our framework and revealing a detailed picture of the nature of household finances. PMID:27212801
NASA Astrophysics Data System (ADS)
Lu, Dan; Ye, Ming; Curtis, Gary P.
2015-10-01
While Bayesian model averaging (BMA) has been widely used in groundwater modeling, it is infrequently applied to groundwater reactive transport modeling because of multiple sources of uncertainty in the coupled hydrogeochemical processes and because of the long execution time of each model run. To resolve these problems, this study analyzed different levels of uncertainty in a hierarchical way, and used the maximum likelihood version of BMA, i.e., MLBMA, to improve the computational efficiency. This study demonstrates the applicability of MLBMA to groundwater reactive transport modeling in a synthetic case in which twenty-seven reactive transport models were designed to predict the reactive transport of hexavalent uranium (U(VI)) based on observations at a former uranium mill site near Naturita, CO. These reactive transport models contain three uncertain model components, i.e., parameterization of hydraulic conductivity, configuration of model boundary, and surface complexation reactions that simulate U(VI) adsorption. These uncertain model components were aggregated into the alternative models by integrating a hierarchical structure into MLBMA. The modeling results of the individual models and MLBMA were analyzed to investigate their predictive performance. The predictive logscore results show that MLBMA generally outperforms the best model, suggesting that using MLBMA is a sound strategy to achieve more robust model predictions relative to a single model. MLBMA works best when the alternative models are structurally distinct and have diverse model predictions. When correlation in model structure exists, two strategies were used to improve predictive performance by retaining structurally distinct models or assigning smaller prior model probabilities to correlated models. Since the synthetic models were designed using data from the Naturita site, the results of this study are expected to provide guidance for real-world modeling. Limitations of applying MLBMA to the
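The abstract does not give the weighting formula. A form commonly used with maximum likelihood BMA approximates each posterior model probability from an information criterion (such as KIC or BIC) evaluated at the maximum likelihood estimate. The sketch below assumes that form; the criterion values, predictions, and prior probabilities are hypothetical numbers.

```python
import math


def bma_weights(ic_values, priors=None):
    """Approximate posterior model probabilities from information-criterion values."""
    n = len(ic_values)
    if priors is None:
        priors = [1.0 / n] * n  # equal prior model probabilities
    ic_min = min(ic_values)  # subtract the minimum for numerical stability
    unnormalized = [p * math.exp(-0.5 * (ic - ic_min)) for ic, p in zip(ic_values, priors)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def bma_mean(predictions, weights):
    """Model-averaged prediction: weighted sum of the individual model predictions."""
    return sum(w * y for w, y in zip(weights, predictions))


ic = [100.0, 102.0, 110.0]          # hypothetical criterion values, lower is better
predictions = [1.0, 1.5, 3.0]       # hypothetical predictions of the three models
weights = bma_weights(ic)
averaged = bma_mean(predictions, weights)

# Down-weighting a model judged structurally correlated with another one,
# by giving it a smaller prior probability (values are illustrative only).
weights_adjusted = bma_weights(ic, priors=[0.45, 0.10, 0.45])
```

The second call corresponds to one of the two strategies named in the abstract, assigning smaller prior model probabilities to correlated models; the other strategy amounts to dropping such models from the list before computing the weights.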
Lu, Dan; Ye, Ming; Curtis, Gary P.
2015-08-01
While Bayesian model averaging (BMA) has been widely used in groundwater modeling, it is infrequently applied to groundwater reactive transport modeling because of multiple sources of uncertainty in the coupled hydrogeochemical processes and because of the long execution time of each model run. To resolve these problems, this study analyzed different levels of uncertainty in a hierarchical way, and used the maximum likelihood version of BMA, i.e., MLBMA, to improve the computational efficiency. Our study demonstrates the applicability of MLBMA to groundwater reactive transport modeling in a synthetic case in which twenty-seven reactive transport models were designed to predict the reactive transport of hexavalent uranium (U(VI)) based on observations at a former uranium mill site near Naturita, CO. Moreover, these reactive transport models contain three uncertain model components, i.e., parameterization of hydraulic conductivity, configuration of model boundary, and surface complexation reactions that simulate U(VI) adsorption. These uncertain model components were aggregated into the alternative models by integrating a hierarchical structure into MLBMA. The modeling results of the individual models and MLBMA were analyzed to investigate their predictive performance. The predictive logscore results show that MLBMA generally outperforms the best model, suggesting that using MLBMA is a sound strategy to achieve more robust model predictions relative to a single model. MLBMA works best when the alternative models are structurally distinct and have diverse model predictions. When correlation in model structure exists, two strategies were used to improve predictive performance by retaining structurally distinct models or assigning smaller prior model probabilities to correlated models. Since the synthetic models were designed using data from the Naturita site, the results of this study are expected to provide guidance for real-world modeling. Finally, limitations of
Non-stationary Bayesian estimation of parameters from a body cover model of the vocal folds.
Hadwin, Paul J; Galindo, Gabriel E; Daun, Kyle J; Zañartu, Matías; Erath, Byron D; Cataldo, Edson; Peterson, Sean D
2016-05-01
The evolution of reduced-order vocal fold models into clinically useful tools for subject-specific diagnosis and treatment hinges upon successfully and accurately representing an individual patient in the modeling framework. This, in turn, requires inference of model parameters from clinical measurements in order to tune a model to the given individual. Bayesian analysis is a powerful tool for estimating model parameter probabilities based upon a set of observed data. In this work, a Bayesian particle filter sampling technique capable of estimating time-varying model parameters, as occur in complex vocal gestures, is introduced. The technique is compared with time-invariant Bayesian estimation and least squares methods for determining both stationary and non-stationary parameters. The current technique accurately estimates the time-varying unknown model parameter and maintains tight credibility bounds. The credibility bounds are particularly relevant from a clinical perspective, as they provide insight into the confidence a clinician should have in the model predictions. PMID:27250162
Optimal speech motor control and token-to-token variability: a Bayesian modeling approach.
Patri, Jean-François; Diard, Julien; Perrier, Pascal
2015-12-01
The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which enables producing similar acoustical properties with different sets of control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the observed experimental intra-speaker token-to-token variability. The present paper proposes an alternative approach by formulating feedforward optimal control in a probabilistic Bayesian modeling framework. This is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first. From them the Bayesian model is constructed in a progressive way. Performance of the Bayesian model is evaluated based on computer simulations and compared to the optimal control model. This approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way. PMID:26497359
2016-01-01
Background: Regional disparity in suicide rates is a serious problem worldwide. One possible cause is unequal distribution of the health workforce, especially psychiatrists. Research about the association between regional physician numbers and suicide rates is therefore important but studies are rare. The objective of this study was to evaluate the association between physician numbers and suicide rates in Japan, by municipality. Methods: The study included all the municipalities in Japan (n = 1,896). We estimated smoothed standardized mortality ratios of suicide rates for each municipality and evaluated the association between health workforce and suicide rates using a hierarchical Bayesian model accounting for spatially correlated random effects, a conditional autoregressive model. We assumed a Poisson distribution for the observed number of suicides and set the expected number of suicides as the offset variable. The explanatory variables were numbers of physicians, a binary variable for the presence of psychiatrists, and social covariates. Results: After adjustment for socioeconomic factors, suicide rates in municipalities that had at least one psychiatrist were lower than those in the other municipalities. There was, however, a positive and statistically significant association between the number of physicians and suicide rates. Conclusions: Suicide rates in municipalities that had at least one psychiatrist were lower than those in other municipalities, but the number of physicians was positively and significantly related to suicide rates. To reduce the regional disparity in suicide rates, the government should encourage psychiatrists to participate in community-based suicide prevention programs and to settle in municipalities that currently have no psychiatrists. The government and other stakeholders should also construct better networks between psychiatrists and non-psychiatrists to support sharing of information for suicide prevention. PMID:26840389
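The model described in the Methods can be written compactly as follows. This is a sketch under assumptions: the symbols are introduced here for illustration, and the intrinsic conditional autoregressive form shown for the spatial effect is one common choice, since the abstract does not state the exact specification.

```latex
% O_i: observed suicides in municipality i;  E_i: expected suicides (offset)
% \theta_i: smoothed standardized mortality ratio;  x_i: explanatory variables
% u_i: spatially correlated random effect;  \partial i: neighbours of i, n_i = |\partial i|
\begin{aligned}
O_i \mid \theta_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \\
\log(E_i\,\theta_i) &= \log E_i + \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i, \\
u_i \mid u_{j \neq i} &\sim \mathcal{N}\!\left(\frac{1}{n_i}\sum_{j \in \partial i} u_j,\; \frac{\sigma_u^{2}}{n_i}\right).
\end{aligned}
```

Here the term $\log E_i$ enters with a fixed coefficient of one, which is what setting the expected number of suicides as the offset means, so that $\theta_i$ is interpreted as a ratio of observed to expected counts.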
Prediction and assimilation of surf-zone processes using a Bayesian network: Part II: Inverse models
Plant, Nathaniel G.; Holland, K. Todd
2011-01-01
A Bayesian network model has been developed to simulate a relatively simple problem of wave propagation in the surf zone (detailed in Part I). Here, we demonstrate that this Bayesian model can provide both inverse modeling and data-assimilation solutions for predicting offshore wave heights and depth estimates given limited wave-height and depth information from an onshore location. The inverse method is extended to allow data assimilation using observational inputs that are not compatible with deterministic solutions of the problem. These inputs include sand bar positions (instead of bathymetry) and estimates of the intensity of wave breaking (instead of wave-height observations). Our results indicate that wave breaking information is essential to reduce prediction errors. In many practical situations, this information could be provided from a shore-based observer or from remote-sensing systems. We show that various combinations of the assimilated inputs significantly reduce the uncertainty in the estimates of water depths and wave heights in the model domain. Application of the Bayesian network model to new field data demonstrated significant predictive skill (R² = 0.7) for the inverse estimate of a month-long time series of offshore wave heights. The Bayesian inverse results include uncertainty estimates that were shown to be most accurate when given uncertainty in the inputs (e.g., depth and tuning parameters). Furthermore, the inverse modeling was extended to directly estimate tuning parameters associated with the underlying wave-process model. The inverse estimates of the model parameters not only showed an offshore wave height dependence consistent with results of previous studies but the uncertainty estimates of the tuning parameters also explain previously reported variations in the model parameters.
Prediction and assimilation of surf-zone processes using a Bayesian network: Part I: Forward models
Plant, Nathaniel G.; Holland, K. Todd
2011-01-01
Prediction of coastal processes, including waves, currents, and sediment transport, can be obtained from a variety of detailed geophysical-process models with many simulations showing significant skill. This capability supports a wide range of research and applied efforts that can benefit from accurate numerical predictions. However, the predictions are only as accurate as the data used to drive the models and, given the large temporal and spatial variability of the surf zone, inaccuracies in data are unavoidable such that useful predictions require corresponding estimates of uncertainty. We demonstrate how a Bayesian-network model can be used to provide accurate predictions of wave-height evolution in the surf zone given very sparse and/or inaccurate boundary-condition data. The approach is based on a formal treatment of a data-assimilation problem that takes advantage of significant reduction of the dimensionality of the model system. We demonstrate that predictions of a detailed geophysical model of the wave evolution are reproduced accurately using a Bayesian approach. In this surf-zone application, forward prediction skill was 83%, and uncertainties in the model inputs were accurately transferred to uncertainty in output variables. We also demonstrate that if modeling uncertainties were not conveyed to the Bayesian network (i.e., perfect data or model were assumed), then overly optimistic prediction uncertainties were computed. More consistent predictions and uncertainties were obtained by including model-parameter errors as a source of input uncertainty. Improved predictions (skill of 90%) were achieved because the Bayesian network simultaneously estimated optimal parameters while predicting wave heights.
NASA Technical Reports Server (NTRS)
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
Construction of an Improved Bayesian Clutter Suppression Model for Gas Detection
Heasler, Patrick G.; Anderson, Kevin K.; Hylden, Jeffrey L.
2002-10-28
This technical report describes a nonlinear Bayesian regression model that can be used to estimate effluent concentrations from IR hyperspectral data. As the title implies, the model is constructed to account for background clutter more effectively than current estimators. Although the main objective is to account for background clutter, which is the dominant source of variability in IR data, the model could easily be extended to allow for uncertainties in the atmosphere. The term "clutter" refers to the variations that occur in the image spectra because emissivity and background temperature change from pixel to pixel. The Bayesian regression model utilizes a more complete description of background clutter to obtain better estimates. The description is in terms of a "prior distribution" on background radiance.
Models and simulation of 3D neuronal dendritic trees using Bayesian networks.
López-Cruz, Pedro L; Bielza, Concha; Larrañaga, Pedro; Benavides-Piccione, Ruth; DeFelipe, Javier
2011-12-01
Neuron morphology is crucial for neuronal connectivity and brain information processing. Computational models are important tools for studying dendritic morphology and its role in brain function. We applied a class of probabilistic graphical models called Bayesian networks to generate virtual dendrites from layer III pyramidal neurons from three different regions of the neocortex of the mouse. A set of 41 morphological variables were measured from the 3D reconstructions of real dendrites and their probability distributions used in a machine learning algorithm to induce the model from the data. A simulation algorithm is also proposed to obtain new dendrites by sampling values from Bayesian networks. The main advantage of this approach is that it takes into account and automatically locates the relationships between variables in the data instead of using predefined dependencies. Therefore, the methodology can be applied to any neuronal class while at the same time exploiting class-specific properties. Also, a Bayesian network was defined for each part of the dendrite, allowing the relationships to change in the different sections and to model heterogeneous developmental factors or spatial influences. Several univariate statistical tests and a novel multivariate test based on Kullback-Leibler divergence estimation confirmed that virtual dendrites were similar to real ones. The analyses of the models showed relationships that conform to current neuroanatomical knowledge and support model correctness. At the same time, studying the relationships in the models can help to identify new interactions between variables related to dendritic morphology. PMID:21305364
Calibration of complex models through Bayesian evidence synthesis: a demonstration and tutorial.
Jackson, Christopher H; Jit, Mark; Sharples, Linda D; De Angelis, Daniela
2015-02-01
Decision-analytic models must often be informed using data that are only indirectly related to the main model parameters. The authors outline how to implement a Bayesian synthesis of diverse sources of evidence to calibrate the parameters of a complex model. A graphical model is built to represent how observed data are generated from statistical models with unknown parameters and how those parameters are related to quantities of interest for decision making. This forms the basis of an algorithm to estimate a posterior probability distribution, which represents the updated state of evidence for all unknowns given all data and prior beliefs. This process calibrates the quantities of interest against data and, at the same time, propagates all parameter uncertainties to the results used for decision making. To illustrate these methods, the authors demonstrate how a previously developed Markov model for the progression of human papillomavirus (HPV-16) infection was rebuilt in a Bayesian framework. Transition probabilities between states of disease severity are inferred indirectly from cross-sectional observations of prevalence of HPV-16 and HPV-16-related disease by age, cervical cancer incidence, and other published information. Previously, a discrete collection of plausible scenarios was identified but with no further indication of which of these are more plausible. Instead, the authors derive a Bayesian posterior distribution, in which scenarios are implicitly weighted according to how well they are supported by the data. In particular, we emphasize the appropriate choice of prior distributions and checking and comparison of fitted models. PMID:23886677
Bayesian Inference for Growth Mixture Models with Latent Class Dependent Missing Data
Lu, Zhenqiu Laura; Zhang, Zhiyong; Lubke, Gitta
2014-01-01
Growth mixture models (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class probabilities depend on some observed explanatory variables and data missingness depends on both the explanatory variables and a latent class variable. A full Bayesian method is then proposed to estimate the model. Through the data augmentation method, conditional posterior distributions for all model parameters and missing data are obtained. A Gibbs sampling procedure is then used to generate Markov chains of model parameters for statistical inference. The application of the model and the method is first demonstrated through the analysis of mathematical ability growth data from the National Longitudinal Survey of Youth 1997 (Bureau of Labor Statistics, U.S. Department of Labor, 1997). A simulation study considering 3 main factors (the sample size, the class probability, and the missing data mechanism) is then conducted and the results show that the proposed Bayesian estimation approach performs very well under the studied conditions. Finally, some implications of this study, including the misspecified missingness mechanism, the sample size, the sensitivity of the model, the number of latent classes, the model comparison, and the future directions of the approach, are discussed. PMID:24790248
Bayesian Monte Carlo updating of Hudson River PCB model using water column PCB measurements
Zhang, S.; Toll, J.; Cothern, K.
1995-12-31
The authors have developed prior probability distributions for model parameters and terms describing physico-chemical processes in sediment and water column models of PCB fate in a segment of the lower Hudson River, and performed importance analyses to identify the key uncertainties affecting the models' predictive power. In this work, the authors employ field measurements of the mean total water column PCB concentration from nearby river segments to refine the prior probability distributions for the important parameters and terms in the water column PCB model, using Bayesian Monte Carlo analysis. The principal objectives of the current work are (1) to implement Bayesian Monte Carlo analysis, to demonstrate the technique and evaluate its potential benefits, and (2) to improve the parameterization of the water column PCB model on the basis of site-specific PCB concentration data. The Bayesian updating procedure resulted in improved estimates of PCB mass loading and re-suspension velocity terms, but posteriors for three other key parameters -- settling velocity and particulate PCB fractions in the water column and surface sediments -- were unaffected by the information extracted from the new field data. In addition, the authors found that some of the high posterior probability parameter vectors, though mathematically plausible, were physically implausible, as a consequence of the unrealistic (but common) Monte Carlo assumption that the model's parameters are independently distributed. The implications of this and other findings are discussed.
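The updating step described in this abstract can be sketched as importance weighting of prior draws. The sketch below is only an illustration under stated assumptions: the toy model (concentration = loading / flow), the uniform priors, the observation, and every function name are hypothetical and are not taken from the Hudson River study.

```python
import math
import random

def bayesian_monte_carlo(prior_sampler, model, observed, sigma, n_draws=5000, seed=0):
    """Draw parameter vectors from the prior and weight each by a Gaussian
    likelihood of a single observation (normalised importance weights)."""
    rng = random.Random(seed)
    draws = [prior_sampler(rng) for _ in range(n_draws)]
    weights = [math.exp(-0.5 * ((observed - model(d)) / sigma) ** 2) for d in draws]
    total = sum(weights)
    weights = [w / total for w in weights]
    return draws, weights

def posterior_mean(draws, weights, key):
    """Weighted mean of one parameter over the prior draws."""
    return sum(w * d[key] for d, w in zip(draws, weights))

# Hypothetical priors: both parameters are sampled independently.
def sample_prior(rng):
    return {"loading": rng.uniform(0.0, 10.0), "settling": rng.uniform(0.0, 1.0)}

# Hypothetical toy model: flow fixed at 2.0; "settling" does not affect the output.
def toy_model(params):
    return params["loading"] / 2.0

draws, weights = bayesian_monte_carlo(sample_prior, toy_model, observed=3.0, sigma=0.2)
loading_post = posterior_mean(draws, weights, "loading")
settling_post = posterior_mean(draws, weights, "settling")
```

In this toy setting the posterior mean of `loading` moves toward the value implied by the observation, while `settling`, which the observation carries no information about, stays near its prior mean. That mirrors the abstract's remark that some posteriors were unaffected by the new data.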
Genome scans for detecting footprints of local adaptation using a Bayesian factor model.
Duforet-Frebourg, Nicolas; Bazin, Eric; Blum, Michael G B
2014-09-01
There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as FST. However, there are important caveats with approaches related to FST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. In order to identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that it can achieve a 2-fold or more reduction of false discovery rate compared with the software BayeScan or with an FST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software. PMID:24899666
NASA Astrophysics Data System (ADS)
Li, Lianlin; Jafarpour, Behnam
2010-09-01
We present a Bayesian framework for reconstructing hydraulic properties of rock formations from nonlinear dynamic flow data by imposing sparsity on the distribution of the parameters in a sparse transform basis through a Laplace prior distribution. Sparse representation of the subsurface flow properties in a compression transform basis (where a compact representation is often possible) lends itself to a natural regularization approach, i.e. sparsity regularization, which has recently been exploited in solving ill-posed subsurface flow inverse problems. The Bayesian estimation approach presented here allows for a probabilistic treatment of the sparse reconstruction problem and has its roots in machine learning and the recently introduced relevance vector machine algorithm for linear inverse problems. We formulate the Bayesian sparse reconstruction algorithm and apply it to nonlinear subsurface inverse problems where solution sparsity in a discrete cosine transform is assumed. The probabilistic description of solution sparsity, as opposed to deterministic regularization, allows for quantification of the estimation uncertainty and avoids the need for specifying a regularization parameter. Several numerical experiments from multiphase subsurface flow applications are presented to illustrate the performance of the proposed method and compare it with the regular Bayesian estimation approach that does not impose solution sparsity. While the examples are derived from subsurface flow modeling, the proposed framework can be applied to nonlinear inverse problems in other imaging applications, including geophysical and medical imaging and electromagnetic inverse problems.
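The link between a Laplace prior and sparsity regularization can be made explicit with a short worked equation. The notation here is assumed for illustration and is not from the paper: v denotes the transform-domain coefficients, Φ the inverse transform, g the nonlinear flow model, d the observed data, σ² the Gaussian noise variance, and λ the Laplace rate. With these assumptions the maximum a posteriori estimate reduces to an ℓ1-penalized least-squares problem:

```latex
\begin{align}
p(v \mid d) &\propto
  \exp\!\left(-\frac{1}{2\sigma^{2}} \lVert d - g(\Phi v) \rVert_2^{2}\right)
  \exp\!\left(-\lambda \lVert v \rVert_1\right), \\
\hat{v}_{\mathrm{MAP}} &= \arg\min_{v}\;
  \frac{1}{2\sigma^{2}} \lVert d - g(\Phi v) \rVert_2^{2} + \lambda \lVert v \rVert_1 .
\end{align}
```

This fixed-λ form is the deterministic counterpart. The abstract's point is that a fully probabilistic treatment infers the degree of sparsity from the data instead of requiring λ to be chosen by hand.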
NASA Astrophysics Data System (ADS)
Caiado, C. C. S.; Goldstein, M.
2015-09-01
In this paper we present and illustrate basic Bayesian techniques for the uncertainty analysis of complex physical systems modelled by computer simulators. We focus on emulation and history matching and also discuss the treatment of observational errors and structural discrepancies in time series. We exemplify such methods using a four-box model for the thermohaline circulation. We show how these methods may be applied to systems containing tipping points and how to treat possible discontinuities using multiple emulators.
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Fusing Continuous-Valued Medical Labels Using a Bayesian Model.
Zhu, Tingting; Dunkley, Nic; Behar, Joachim; Clifton, David A; Clifford, Gari D
2015-12-01
With the rapid increase in volume of time series medical data available through wearable devices, there is a need to employ automated algorithms to label data. Examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable, resulting in lower quality care. Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable estimation of label aggregation while accurately inferring the precision and bias of each algorithm. The BCLA was applied to QT interval (pro-arrhythmic indicator) estimation from the electrocardiogram using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared to the mean, median, and a previously proposed Expectation Maximization (EM) label aggregation approach. While accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the BCLA was 11.78 ± 0.63 ms, significantly outperforming the best Challenge entry (15.37 ± 2.13 ms) as well as the EM, mean, and median voting strategies (14.76 ± 0.52, 17.61 ± 0.55, and 14.43 ± 0.57 ms respectively with p < 0.0001). The BCLA could therefore provide accurate estimation for medical continuous-valued label tasks in an unsupervised manner even when the ground truth is not available. PMID:26036335
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach
Duarte, Belmiro P. M.; Wong, Weng Kee
2014-01-01
This paper uses semidefinite programming (SDP) to construct Bayesian optimal designs for nonlinear regression models. The setup here extends the formulation of the optimal design problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare the results with those in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159
Gilet, Estelle; Diard, Julien; Bessière, Pierre
2011-01-01
In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Being a model of both perception and action processes, the purpose of this model is to study the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments. PMID:21674043
NASA Astrophysics Data System (ADS)
Stockton, T.; Black, P.; Tauxe, J.; Catlett, K.
2004-12-01
Bayesian decision analysis provides a unified framework for coherent decision-making. Two key components of Bayesian decision analysis are probability distributions and utility functions. Calculating posterior distributions and performing decision analysis can be computationally challenging, especially for complex environmental models. In addition, probability distributions and utility functions for environmental models must be specified through expert elicitation, stakeholder consensus, or data collection, all of which have their own set of technical and political challenges. Nevertheless, a grand appeal of the Bayesian approach for environmental decision-making is the explicit treatment of uncertainty, including expert judgment. The impact of expert judgment on the environmental decision process, though integral, goes largely unassessed. Regulations and orders of the Environmental Protection Agency, Department of Energy, and Nuclear Regulatory Agency require assessing the impact on human health of radioactive waste contamination over periods of up to ten thousand years. Towards this end, complex environmental simulation models are used to assess "risk" to human and ecological health from migration of radioactive waste. As the computational burden of environmental modeling is continually reduced, probabilistic process modeling using Monte Carlo simulation is becoming routinely used to propagate uncertainty from model inputs through model predictions. The utility of a Bayesian approach to environmental decision-making is discussed within the context of a buried radioactive waste example. This example highlights the desirability and difficulties of merging the cost of monitoring, the cost of the decision analysis, the cost and viability of clean up, and the probability of human health impacts within a rigorous decision framework.
Bayesian Analysis of Multivariate Probit Models with Surrogate Outcome Data
ERIC Educational Resources Information Center
Poon, Wai-Yin; Wang, Hai-Bin
2010-01-01
A new class of parametric models that generalize the multivariate probit model and the errors-in-variables model is developed to model and analyze ordinal data. A general model structure is assumed to accommodate the information that is obtained via surrogate variables. A hybrid Gibbs sampler is developed to estimate the model parameters. To…
Bayesian approaches to spatial inference: Modelling and computational challenges and solutions
NASA Astrophysics Data System (ADS)
Moores, Matthew; Mengersen, Kerrie
2014-12-01
We discuss a range of Bayesian modelling approaches for spatial data and investigate some of the associated computational challenges. This paper commences with a brief review of Bayesian mixture models and Markov random fields, with enabling computational algorithms including Markov chain Monte Carlo (MCMC) and integrated nested Laplace approximation (INLA). Following this, we focus on the Potts model as a canonical approach, and discuss the challenge of estimating the inverse temperature parameter that controls the degree of spatial smoothing. We compare three approaches to addressing the doubly intractable nature of the likelihood, namely pseudo-likelihood, path sampling and the exchange algorithm. These techniques are applied to satellite data used to analyse water quality in the Great Barrier Reef.
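Of the three approaches named above, pseudo-likelihood is the simplest to sketch: the intractable Potts likelihood is replaced by a product of each pixel's conditional probability given its neighbours. The code below is a minimal illustration, assuming a 4-neighbour lattice and integer labels; the function names are hypothetical and it is not the authors' implementation.

```python
import math

def neighbours(i, j, rows, cols):
    """Yield the 4-connected neighbours of pixel (i, j) that lie inside the grid."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        a, b = i + di, j + dj
        if 0 <= a < rows and 0 <= b < cols:
            yield a, b

def log_pseudo_likelihood(labels, beta, n_labels):
    """Sum over pixels of log p(z_i | neighbours), where
    p(z_i = k | neighbours) is proportional to exp(beta * #neighbours with label k)."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            counts = [0] * n_labels
            for a, b in neighbours(i, j, rows, cols):
                counts[labels[a][b]] += 1
            log_norm = math.log(sum(math.exp(beta * c) for c in counts))
            total += beta * counts[labels[i][j]] - log_norm
    return total

# A perfectly smooth 3x3 image with two possible labels.
uniform_image = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
lpl_flat = log_pseudo_likelihood(uniform_image, beta=0.0, n_labels=2)
lpl_smooth = log_pseudo_likelihood(uniform_image, beta=1.0, n_labels=2)
```

Because each term only needs a local normalising sum, the function can be evaluated for any candidate inverse temperature `beta` without computing the global partition function. That is what makes it a cheap, though approximate, route to estimating the smoothing parameter.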
NASA Astrophysics Data System (ADS)
Freni, Gabriele; Mannina, Giorgio
In urban drainage modelling, uncertainty analysis is of undoubted necessity. However, uncertainty analysis in urban water-quality modelling is still in its infancy and only a few studies have been carried out. Therefore, several methodological aspects still need to be tested and clarified, especially regarding water quality modelling. The use of the Bayesian approach for uncertainty analysis has been stimulated by its rigorous theoretical framework and by the possibility of evaluating the impact of new knowledge on the modelling predictions. Nevertheless, the Bayesian approach relies on some restrictive hypotheses that are not present in less formal methods like the Generalised Likelihood Uncertainty Estimation (GLUE). One crucial point in the application of the Bayesian method is the formulation of a likelihood function that is conditioned by the hypotheses made regarding model residuals. Statistical transformations, such as the use of the Box-Cox equation, are generally used to ensure the homoscedasticity of residuals. However, this practice may affect the reliability of the analysis, leading to a wrong uncertainty estimation. The present paper aims to explore the influence of the Box-Cox equation for environmental water quality models. To this end, five cases were considered, one of which was the “real” residual distribution (i.e. drawn from available data). The analysis was applied to the Nocella experimental catchment (Italy), which is an agricultural and semi-urbanised basin where two sewer systems, two wastewater treatment plants and a river reach were monitored during both dry and wet weather periods. The results show that the uncertainty estimation is greatly affected by residual transformation and a wrong assumption may also affect the evaluation of model uncertainty. The use of less formal methods always provides an overestimation of modelling uncertainty with respect to the Bayesian method, but such effect is reduced if a wrong assumption is made regarding the
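For reference, the Box-Cox transformation mentioned in this abstract has a simple closed form. The sketch below applies the standard one-parameter version to observed and simulated values and returns the residuals in transformed space; the helper names and the example numbers are illustrative assumptions, not values from the Nocella study.

```python
import math

def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a strictly positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def transformed_residuals(observed, simulated, lam):
    """Residuals computed after transforming both series with the same lambda."""
    return [box_cox(o, lam) - box_cox(s, lam) for o, s in zip(observed, simulated)]

# Hypothetical concentrations: the same absolute error of 1.0 at low and high levels.
residuals = transformed_residuals([4.0, 101.0], [3.0, 100.0], lam=0.0)
```

With `lam=0` (the log transform) the same absolute error produces a much smaller transformed residual at high concentrations. This is how the transform stabilises variance, and also why the choice of `lam` changes the resulting uncertainty estimate.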
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and a multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.
Bayesian model selection framework for identifying growth patterns in filamentous fungi.
Lin, Xiao; Terejanu, Gabriel; Shrestha, Sajan; Banerjee, Sourav; Chanda, Anindya
2016-06-01
This paper describes a rigorous methodology for quantification of model errors in fungal growth models. This is essential to choose the model that best describes the data and guide modeling efforts. Mathematical modeling of growth of filamentous fungi is necessary in fungal biology for gaining systems level understanding of hyphal and colony behaviors in different environments. A critical challenge in the development of these mathematical models arises from the indeterminate nature of their colony architecture, which is a result of processing diverse intracellular signals induced in response to a heterogeneous set of physical and nutritional factors. There exists a practical gap in connecting fungal growth models with measurement data. Here, we address this gap by introducing the first unified computational framework based on Bayesian inference that can quantify individual model errors and rank the statistical models based on their descriptive power against data. We show that this Bayesian model comparison is just a natural formalization of Occam's razor. The application of this framework is discussed in comparing three models in the context of synthetic data generated from a known true fungal growth model. This framework of model comparison achieves a trade-off between data fitness and model complexity and the quantified model error not only helps in calibrating and comparing the models, but also in making better predictions and guiding model refinements. PMID:27000772
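The Occam's razor remark can be read off the standard definition of the model evidence. The symbols below are assumed for illustration and are not from the paper: D is the data, M_k a candidate model, and θ_k its parameters.

```latex
\begin{align}
p(D \mid M_k) &= \int p(D \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, \mathrm{d}\theta_k , \\
p(M_k \mid D) &= \frac{p(D \mid M_k)\, p(M_k)}{\sum_{j} p(D \mid M_j)\, p(M_j)} .
\end{align}
```

Because the evidence averages the likelihood over the whole prior, a model with many parameters or wide priors spreads its probability over many possible data sets. It is therefore penalised unless the extra flexibility is needed to fit D. This is the trade-off between data fit and model complexity that the abstract refers to.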
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.
2014-12-01
Effective water resource management typically relies on numerical models to analyse groundwater flow and solute transport processes. These models are usually subject to model structure error due to simplification and/or misrepresentation of the real system. As a result, the model outputs may systematically deviate from measurements, thus violating a key assumption for traditional regression-based calibration and uncertainty analysis. On the other hand, model structure error induced bias can be described statistically in an inductive, data-driven way based on historical model-to-measurement misfit. We adopt a fully Bayesian approach that integrates a Gaussian process error model, which accounts for model structure error, into the calibration, prediction and uncertainty analysis of groundwater models. The posterior distributions of parameters of the groundwater model and the Gaussian process error model are jointly inferred using DREAM, an efficient Markov chain Monte Carlo sampler. We test the usefulness of the fully Bayesian approach on a synthetic case study of surface-ground water interaction under changing pumping conditions. We first illustrate through this example that traditional least squares regression without accounting for model structure error yields biased parameter estimates due to parameter compensation as well as biased predictions. In contrast, the Bayesian approach gives less biased parameter estimates. Moreover, the integration of a Gaussian process error model significantly reduces predictive bias and leads to prediction intervals that are more consistent with observations. The results highlight the importance of explicit treatment of model structure error, especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification. In addition, the data-driven error modelling approach is capable of extracting more information from observation data than using a groundwater model alone.
A Bayesian method for construction of Markov models to describe dynamics on various time-scales
NASA Astrophysics Data System (ADS)
Rains, Emily K.; Andersen, Hans C.
2010-10-01
The dynamics of many biological processes of interest, such as the folding of a protein, are slow and complicated enough that a single molecular dynamics simulation trajectory of the entire process is difficult to obtain in any reasonable amount of time. Moreover, one such simulation may not be sufficient to develop an understanding of the mechanism of the process, and multiple simulations may be necessary. One approach to circumvent this computational barrier is the use of Markov state models. These models are useful because they can be constructed using data from a large number of shorter simulations instead of a single long simulation. This paper presents a new Bayesian method for the construction of Markov models from simulation data. A Markov model is specified by (τ,P,T), where τ is the mesoscopic time step, P is a partition of configuration space into mesostates, and T is an N_P × N_P transition rate matrix for transitions between the mesostates in one mesoscopic time step, where N_P is the number of mesostates in P. The method presented here is different from previous Bayesian methods in several ways. (1) The method uses Bayesian analysis to determine the partition as well as the transition probabilities. (2) The method allows the construction of a Markov model for any chosen mesoscopic time-scale τ. (3) It constructs Markov models for which the diagonal elements of T are all equal to or greater than 0.5. Such a model will be called a "consistent mesoscopic Markov model" (CMMM). Such models have important advantages for providing an understanding of the dynamics on a mesoscopic time-scale. The Bayesian method uses simulation data to find a posterior probability distribution for (P,T) for any chosen τ. This distribution can be regarded as the Bayesian probability that the kinetics observed in the atomistic simulation data on the mesoscopic time-scale τ was generated by the CMMM specified by (P,T). An optimization algorithm is used to find the most probable
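One ingredient of such a construction, estimating T for a fixed partition from many short trajectories, can be sketched as follows. This is not the method of the paper, which also infers the partition P. The independent Dirichlet prior on each row, the parameter `alpha`, the function names, and the example trajectories are all assumptions made for illustration.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions at the given lag (lag >= 1), pooled over
    several short trajectories of mesostate indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for a, b in zip(traj[:-lag], traj[lag:]):
            counts[a][b] += 1
    return counts

def posterior_mean_transition_matrix(counts, alpha=1.0):
    """Posterior mean of each row under an independent symmetric Dirichlet(alpha) prior."""
    n = len(counts)
    return [[(c + alpha) / (sum(row) + alpha * n) for c in row] for row in counts]

def is_consistent(T):
    """Check the CMMM condition quoted above: every diagonal element is at least 0.5."""
    return all(T[i][i] >= 0.5 for i in range(len(T)))

# Two short hypothetical trajectories over two mesostates.
trajectories = [[0, 0, 0, 1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0]]
counts = count_transitions(trajectories, n_states=2)
T = posterior_mean_transition_matrix(counts)
consistent = is_consistent(T)
```

Pooling counts across trajectories is what allows many short simulations to stand in for a single long one. The `lag` argument plays the role of the mesoscopic time step τ, measured in saved frames.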
NASA Astrophysics Data System (ADS)
Wöhling, Thomas; Vrugt, Jasper A.
2008-12-01
Most studies in vadose zone hydrology use a single conceptual model for predictive inference and analysis. Focusing on the outcome of a single model is prone to statistical bias and underestimation of uncertainty. In this study, we combine multiobjective optimization and Bayesian model averaging (BMA) to generate forecast ensembles of soil hydraulic models. To illustrate our method, we use observed tensiometric pressure head data at three different depths in a layered vadose zone of volcanic origin in New Zealand. A set of seven different soil hydraulic models is calibrated using a multiobjective formulation with three different objective functions that each measure the mismatch between observed and predicted soil water pressure head at one specific depth. The Pareto solution space corresponding to these three objectives is estimated with AMALGAM and used to generate four different model ensembles. These ensembles are postprocessed with BMA and used for predictive analysis and uncertainty estimation. Our most important conclusions for the vadose zone under consideration are (1) the mean BMA forecast exhibits predictive capabilities similar to those of the best-performing individual soil hydraulic model, (2) the size of the BMA uncertainty ranges increases with increasing depth and dryness in the soil profile, (3) the best performing ensemble corresponds to the compromise (or balanced) solution of the three-objective Pareto surface, and (4) the combined multiobjective optimization and BMA framework proposed in this paper is very useful to generate forecast ensembles of soil hydraulic models.
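The BMA postprocessing step combines member forecasts into a weighted mixture. The sketch below computes the mixture mean and variance, assuming each member contributes a Gaussian forecast with a known variance and that the weights have already been estimated. The function name and the numbers are hypothetical; this is not the study's code.

```python
def bma_forecast(means, variances, weights):
    """Mean and variance of a weighted mixture of member forecasts.
    Total variance = between-model spread + weighted within-model variance."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalise in case weights do not sum to 1
    mean = sum(wi * m for wi, m in zip(w, means))
    between = sum(wi * (m - mean) ** 2 for wi, m in zip(w, means))
    within = sum(wi * v for wi, v in zip(w, variances))
    return mean, between + within

# Two hypothetical member forecasts of pressure head with equal weights.
bma_mean, bma_var = bma_forecast(means=[1.0, 3.0], variances=[0.5, 0.5], weights=[0.5, 0.5])
```

The between-model term is the part a single-model analysis leaves out, which is one way to see why relying on one conceptual model tends to understate predictive uncertainty.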
Lifting a veil on diversity: a Bayesian approach to fitting relative-abundance models.
Golicher, Duncan J; O'Hara, Robert B; Ruíz-Montoya, Lorena; Cayuela, Luis
2006-02-01
Bayesian methods incorporate prior knowledge into a statistical analysis. This prior knowledge is usually restricted to assumptions regarding the form of probability distributions of the parameters of interest, leaving their values to be determined mainly through the data. Here we show how a Bayesian approach can be applied to the problem of drawing inference regarding species abundance distributions and comparing diversity indices between sites. The classic log series and the lognormal models of relative-abundance distribution are apparently quite different in form. The first is a sampling distribution while the other is a model of abundance of the underlying population. Bayesian methods help unite these two models in a common framework. Markov chain Monte Carlo simulation can be used to fit both distributions as small hierarchical models with shared common assumptions. Sampling error can be assumed to follow a Poisson distribution. Species not found in a sample, but suspected to be present in the region or community of interest, can be given zero abundance. This not only simplifies the process of model fitting, but also provides a convenient way of calculating confidence intervals for diversity indices. The method is especially useful when a comparison of species diversity between sites with different sample sizes is the key motivation behind the research. We illustrate the potential of the approach using data on fruit-feeding butterflies in southern Mexico. We conclude that, once all assumptions have been made transparent, a single data set may provide support for the belief that diversity is negatively affected by anthropogenic forest disturbance. Bayesian methods help to apply theory regarding the distribution of abundance in ecological communities to applied conservation. PMID:16705973
Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C
2016-02-01
A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam. PMID:26931843
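The analytic step described above (Gaussian prior, Gaussian likelihood, linear forward model) can be sketched for a two-parameter case. This is only the conjugate linear-Gaussian core; the Gaussian-process components for the instrument function and filter curve are not reproduced. The line shape, noise level, prior widths and the function name are hypothetical choices for illustration.

```python
def linear_gaussian_posterior(basis, data, noise_sd, prior_mean, prior_sd):
    """Posterior mean and covariance of x = (x0, x1) for data ~ x0*basis[0] + x1*basis[1].

    Assumes independent Gaussian noise and independent Gaussian priors.
    """
    inv_var = 1.0 / noise_sd ** 2
    # Posterior precision: A^T A / noise_sd^2 + diag(1 / prior_sd^2)
    h00 = inv_var * sum(u * u for u in basis[0]) + 1.0 / prior_sd[0] ** 2
    h01 = inv_var * sum(u * v for u, v in zip(basis[0], basis[1]))
    h11 = inv_var * sum(v * v for v in basis[1]) + 1.0 / prior_sd[1] ** 2
    # Right-hand side: A^T d / noise_sd^2 + prior precision times prior mean
    g0 = inv_var * sum(u * d for u, d in zip(basis[0], data)) + prior_mean[0] / prior_sd[0] ** 2
    g1 = inv_var * sum(v * d for v, d in zip(basis[1], data)) + prior_mean[1] / prior_sd[1] ** 2
    # Invert the 2x2 precision matrix in closed form.
    det = h00 * h11 - h01 * h01
    cov = [[h11 / det, -h01 / det], [-h01 / det, h00 / det]]
    mean = [cov[0][0] * g0 + cov[0][1] * g1, cov[1][0] * g0 + cov[1][1] * g1]
    return mean, cov

# Hypothetical spectrum: line intensity 3.0 on a fixed line shape plus a constant offset 0.5.
line_shape = [0.0, 1.0, 2.0, 1.0, 0.0]
offset_shape = [1.0] * 5
spectrum = [3.0 * s + 0.5 for s in line_shape]
post_mean, post_cov = linear_gaussian_posterior(
    [line_shape, offset_shape], spectrum, noise_sd=0.1,
    prior_mean=[0.0, 0.0], prior_sd=[1e3, 1e3],
)
```

The diagonal of `post_cov` gives the variance of the intensity and of the offset, which is how the uncertainty on the line intensity becomes available without sampling.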
Bayesian estimation of airborne fugitive emissions using a Gaussian plume model
NASA Astrophysics Data System (ADS)
Hosseini, Bamdad; Stockie, John M.
2016-09-01
A new method is proposed for estimating the rate of fugitive emissions of particulate matter from multiple time-dependent sources via measurements of deposition and concentration. We cast this source inversion problem within the Bayesian framework, and use a forward model based on a Gaussian plume solution. We present three alternate models for constructing the prior distribution on the emission rates as functions of time. Next, we present an industrial case study in which our framework is applied to estimate the rate of fugitive emissions of lead particulates from a smelter in Trail, British Columbia, Canada. The Bayesian framework not only provides an approximate solution to the inverse problem, but also quantifies the uncertainty in the solution. Using this information we perform an uncertainty propagation study in order to assess the impact of the estimated sources on the area surrounding the industrial site.
Bayesian approach to color-difference models based on threshold and constant-stimuli methods.
Brusola, Fernando; Tortajada, Ignacio; Lengua, Ismael; Jordá, Begoña; Peris, Guillermo
2015-06-15
An alternative approach based on statistical Bayesian inference is presented to deal with the development of color-difference models and the precision of parameter estimation. The approach was applied to simulated data and real data, the latter published by selected authors involved with the development of color-difference formulae using traditional methods. Our results show very good agreement between the Bayesian and classical approaches. Among other benefits, our proposed methodology allows one to determine the marginal posterior distribution of each random individual parameter of the color-difference model. In this manner, it is possible to analyze the effect of individual parameters on the statistical significance calculation of a color-difference equation. PMID:26193510
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Chain Graph Models to Elicit the Structure of a Bayesian Network
Stefanini, Federico M.
2014-01-01
Bayesian networks are possibly the most successful graphical models to build decision support systems. Building the structure of large networks is still a challenging task, but Bayesian methods are particularly suited to exploit experts' degree of belief in a quantitative way while learning the network structure from data. In this paper details are provided about how to build a prior distribution on the space of network structures by eliciting a chain graph model on structural reference features. Several structural features expected to be often useful during the elicitation are described. The statistical background needed to effectively use this approach is summarized, and some potential pitfalls are illustrated. Finally, a few seminal contributions from the literature are reformulated in terms of structural features. PMID:24688427
A Bayesian non-parametric Potts model with application to pre-surgical FMRI data.
Johnson, Timothy D; Liu, Zhuqing; Bartsch, Andreas J; Nichols, Thomas E
2013-08-01
The Potts model has enjoyed much success as a prior model for image segmentation. Given the individual classes in the model, the data are typically modeled as Gaussian random variates or as random variates from some other parametric distribution. In this article, we present a non-parametric Potts model and apply it to a functional magnetic resonance imaging study for the pre-surgical assessment of peritumoral brain activation. In our model, we assume that the Z-score image from a patient can be segmented into activated, deactivated, and null classes, or states. Conditional on the class, or state, the Z-scores are assumed to come from some generic distribution which we model non-parametrically using a mixture of Dirichlet process priors within the Bayesian framework. The posterior distribution of the model parameters is estimated with a Markov chain Monte Carlo algorithm, and Bayesian decision theory is used to make the final classifications. Our Potts prior model includes two parameters, the standard spatial regularization parameter and a parameter that can be interpreted as the a priori probability that each voxel belongs to the null, or background state, conditional on the lack of spatial regularization. We assume that both of these parameters are unknown, and jointly estimate them along with other model parameters. We show through simulation studies that our model performs on par, in terms of posterior expected loss, with parametric Potts models when the parametric model is correctly specified and outperforms parametric models when the parametric model is misspecified. PMID:22627277
Two levels of Bayesian model averaging for optimal control of stochastic systems
NASA Astrophysics Data System (ADS)
Darwen, Paul J.
2013-02-01
Bayesian model averaging provides the best possible estimate of a model, given the data. This article uses that approach twice: once to get a distribution of plausible models of the world, and again to find a distribution of plausible control functions. The resulting ensemble gives control instructions different from simply taking the single best-fitting model and using it to find a single lowest-error control function for that single model. The only drawback is, of course, the need for more computer time: this article demonstrates that the required computer time is feasible. The test problem here is from flood control and risk management.
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Illman, Walter A.; Wöhling, Thomas; Nowak, Wolfgang
2015-12-01
Groundwater modelers face the challenge of how to assign representative parameter values to the studied aquifer. Several approaches are available to parameterize spatial heterogeneity in aquifer parameters. They differ in their conceptualization and complexity, ranging from homogeneous models to heterogeneous random fields. While it is common practice to invest more effort into data collection for models with a finer resolution of heterogeneities, there is a lack of advice on how much data is required to justify a certain level of model complexity. In this study, we propose to use concepts related to Bayesian model selection to identify this balance. We demonstrate our approach on the characterization of a heterogeneous aquifer via hydraulic tomography in a sandbox experiment (Illman et al., 2010). We consider four increasingly complex parameterizations of hydraulic conductivity: (1) Effective homogeneous medium, (2) geology-based zonation, (3) interpolation by pilot points, and (4) geostatistical random fields. First, we investigate the shift in justified complexity with increasing amount of available data by constructing a model confusion matrix. This matrix indicates the maximum level of complexity that can be justified given a specific experimental setup. Second, we determine which parameterization is most adequate given the observed drawdown data. Third, we test how the different parameterizations perform in a validation setup. The results of our test case indicate that aquifer characterization via hydraulic tomography does not necessarily require (or justify) a geostatistical description. Instead, a zonation-based model might be a more robust choice, but only if the zonation is geologically adequate.
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J.
2015-01-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an ‘intelligent coincidence detector’, which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Cross-validation analysis of bias models in Bayesian multi-model projections of climate
NASA Astrophysics Data System (ADS)
Huttunen, J. M. J.; Räisänen, J.; Nissinen, A.; Lipponen, A.; Kolehmainen, V.
2016-05-01
Climate change projections are commonly based on multi-model ensembles of climate simulations. In this paper we consider the choice of bias models in Bayesian multimodel predictions. Buser et al. (Clim Res 44(2-3):227-241, 2010a) introduced a hybrid bias model which combines commonly used constant bias and constant relation bias assumptions. The hybrid model includes a weighting parameter which balances these bias models. In this study, we use a cross-validation approach to study which bias model or bias parameter leads to, in a specific sense, optimal climate change projections. The analysis is carried out for summer and winter season means of 2 m-temperatures spatially averaged over the IPCC SREX regions, using 19 model runs from the CMIP5 data set. The cross-validation approach is applied to calculate optimal bias parameters (in the specific sense) for projecting the temperature change from the control period (1961-2005) to the scenario period (2046-2090). The results are compared to the results of the Buser et al. (Clim Res 44(2-3):227-241, 2010a) method which includes the bias parameter as one of the unknown parameters to be estimated from the data.
Predictive data-derived Bayesian statistic-transport model and simulator of sunken oil mass
NASA Astrophysics Data System (ADS)
Echavarria Gregory, Maria Angelica
Sunken oil is difficult to locate because remote sensing techniques cannot as yet provide views of sunken oil over large areas. Moreover, the oil may re-suspend and sink with changes in salinity, sediment load, and temperature, making deterministic fate models difficult to deploy and calibrate when even the presence of sunken oil is difficult to assess. For these reasons, together with the expense of field data collection, there is a need for a statistical technique integrating limited data collection with stochastic transport modeling. Predictive Bayesian modeling techniques have been developed and demonstrated for exploiting limited information for decision support in many other applications. These techniques are brought to a multi-modal Lagrangian modeling framework, representing a near-real time approach to locating and tracking sunken oil driven by intrinsic physical properties of field data collected following a spill after oil has begun collecting on a relatively flat bay bottom. Methods include (1) development of the conceptual predictive Bayesian model and multi-modal Gaussian computational approach based on theory and literature review; (2) development of an object-oriented programming and combinatorial structure capable of managing data, integration and computation over an uncertain and highly dimensional parameter space; (3) creating a new bi-dimensional approach of the method of images to account for curved shoreline boundaries; (4) confirmation of model capability for locating sunken oil patches using available (partial) real field data and capability for temporal projections near curved boundaries using simulated field data; and (5) development of a stand-alone open-source computer application with graphical user interface capable of calibrating instantaneous oil spill scenarios, obtaining sets of maps of relative probability profiles at different prediction times and user-selected geographic areas and resolution, and capable of performing post
Estimation of temporal gait parameters using Bayesian models on acceleration signals.
López-Nava, I H; Muñoz-Meléndez, A; Pérez Sanpablo, A I; Alessi Montero, A; Quiñones Urióstegui, I; Núñez Carrera, L
2016-01-01
The purpose of this study is to develop a system capable of performing calculation of temporal gait parameters using two low-cost wireless accelerometers and artificial intelligence-based techniques as part of a larger research project for conducting human gait analysis. Ten healthy subjects of different ages participated in this study and performed controlled walking tests. Two wireless accelerometers were placed on their ankles. Raw acceleration signals were processed in order to obtain gait patterns from characteristic peaks related to steps. A Bayesian model was implemented to classify the characteristic peaks into steps or nonsteps. The acceleration signals were segmented based on gait events, such as heel strike and toe-off, of actual steps. Temporal gait parameters, such as cadence, ambulation time, step time, gait cycle time, stance and swing phase time, simple and double support time, were estimated from segmented acceleration signals. Gait data-sets were divided into two groups of ages to test Bayesian models in order to classify the characteristic peaks. The mean error obtained from calculating the temporal gait parameters was 4.6%. Bayesian models are useful techniques that can be applied to classification of gait data of subjects at different ages with promising results. PMID:25876180
A Bayesian model to predict oil spill consequences of management plans in the Gulf of Mexico
Obie, D.S.; Englehardt, J.
1996-12-31
A Bayesian risk analysis model, comprising a release assessment module and an exposure assessment module for the oil transportation system in the Gulf of Mexico, is described in this paper. The model is used to compute probability distributions for oil spill quantities for 160 grid cells in the Gulf of Mexico, and the volumes of that oil to reach 58 coastline segments over a user-specified planning period. In addition to historical oil spill data, the model can accept subjective information on management alternatives involving changes in the oil transportation system. For example, volumes, tugboat escorts, mechanical equipment and hull design can be altered, and user confidence can be entered concerning how changes will affect spill number and size. The release assessment module uses a predictive Bayesian negative binomial distribution for spill number, and a predictive Bayesian distribution based on the Pareto I distribution for spill size. Conditional transport probabilities developed by the Minerals Management Service and the results of the release assessment module were used in the exposure assessment module. Oil spill data maintained by the US Coast Guard for the years 1991-1995 were analyzed along with two basic oil transportation management scenarios.
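A predictive negative binomial distribution for spill number arises in the standard way when a Poisson count is combined with a gamma prior on its rate. The sketch below shows that conjugate route; whether the paper uses exactly this parameterization is an assumption, and the prior values and spill counts are hypothetical.

```python
import math

def nb_predictive_pmf(y, prior_shape, prior_rate, total_count, n_periods):
    """Posterior predictive P(Y = y) for the next period.

    Assumes Poisson counts per period and a Gamma(prior_shape, prior_rate) prior
    on the Poisson rate, which gives a negative binomial predictive distribution.
    """
    r = prior_shape + total_count                            # posterior gamma shape
    p = (prior_rate + n_periods) / (prior_rate + n_periods + 1.0)
    log_pmf = (
        math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log(1.0 - p)
    )
    return math.exp(log_pmf)

# Hypothetical record: 12 spills over 5 years, with a Gamma(2, 1) prior on the yearly rate.
pmf = [nb_predictive_pmf(y, 2.0, 1.0, 12, 5) for y in range(200)]
predictive_mean = sum(y * prob for y, prob in enumerate(pmf))
```

Subjective information about a management alternative would enter through the prior shape and rate, which is one way to read the "user confidence" input mentioned above.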
Bayesian Analysis of a Reduced-Form Air Quality Model
Numerical air quality models are being used for assessing emission control strategies for improving ambient pollution levels across the globe. This paper applies probabilistic modeling to evaluate the effectiveness of emission reduction scenarios aimed at lowering ground-level oz...
Bayesian Multidimensional IRT Models with a Hierarchical Structure
ERIC Educational Resources Information Center
Sheng, Yanyan; Wikle, Christopher K.
2008-01-01
As item response models gain increased popularity in large-scale educational and measurement testing situations, many studies have been conducted on the development and applications of unidimensional and multidimensional models. Recently, attention has been paid to IRT-based models with an overall ability dimension underlying several ability…
A Bayesian Semiparametric Latent Variable Model for Mixed Responses
ERIC Educational Resources Information Center
Fahrmeir, Ludwig; Raach, Alexander
2007-01-01
In this paper we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric Gaussian regression model. We extend existing LVMs with the usual linear covariate effects by including nonparametric components for nonlinear…
NASA Astrophysics Data System (ADS)
Hobson, Michael P.; Jaffe, Andrew H.; Liddle, Andrew R.; Mukherjee, Pia; Parkinson, David
2014-02-01
Preface; Part I. Methods: 1. Foundations and algorithms John Skilling; 2. Simple applications of Bayesian methods D. S. Sivia and Steve Rawlings; 3. Parameter estimation using Monte Carlo sampling Antony Lewis and Sarah Bridle; 4. Model selection and multi-model inference Andrew R. Liddle, Pia Mukherjee and David Parkinson; 5. Bayesian experimental design and model selection forecasting Roberto Trotta, Martin Kunz, Pia Mukherjee and David Parkinson; 6. Signal separation in cosmology M. P. Hobson, M. A. J. Ashdown and V. Stolyarov; Part II. Applications: 7. Bayesian source extraction M. P. Hobson, Graça Rocha and R. Savage; 8. Flux measurement Daniel Mortlock; 9. Gravitational wave astronomy Neil Cornish; 10. Bayesian analysis of cosmic microwave background data Andrew H. Jaffe; 11. Bayesian multilevel modelling of cosmological populations Thomas J. Loredo and Martin A. Hendry; 12. A Bayesian approach to galaxy evolution studies Stefano Andreon; 13. Photometric redshift estimation: methods and applications Ofer Lahav, Filipe B. Abdalla and Manda Banerji; Index.
A fully Bayesian method for jointly fitting instrumental calibration and X-ray spectral models
Xu, Jin; Yu, Yaming; Van Dyk, David A.; Kashyap, Vinay L.; Siemiginowska, Aneta; Drake, Jeremy; Ratzlaff, Pete; Connors, Alanna; Meng, Xiao-Li
2014-10-20
Owing to a lack of robust principled methods, systematic instrumental uncertainties have generally been ignored in astrophysical data analysis despite wide recognition of the importance of including them. Ignoring calibration uncertainty can cause bias in the estimation of source model parameters and can lead to underestimation of the variance of these estimates. We previously introduced a pragmatic Bayesian method to address this problem. The method is 'pragmatic' in that it introduced an ad hoc technique that simplified computation by neglecting the potential information in the data for narrowing the uncertainty for the calibration product. Following that work, we use a principal component analysis to efficiently represent the uncertainty of the effective area of an X-ray (or γ-ray) telescope. Here, however, we leverage this representation to enable a principled, fully Bayesian method that coherently accounts for the calibration uncertainty in high-energy spectral analysis. In this setting, the method is compared with standard analysis techniques and the pragmatic Bayesian method. The advantage of the fully Bayesian method is that it allows the data to provide information not only for estimation of the source parameters but also for the calibration product—here the effective area, conditional on the adopted spectral model. In this way, it can yield more accurate and efficient estimates of the source parameters along with valid estimates of their uncertainty. Provided that the source spectrum can be accurately described by a parameterized model, this method allows rigorous inference about the effective area by quantifying which possible curves are most consistent with the data.
Bayesian procedure for modeling dependence in generic estimates for component reliability
Lim, T.J.; Hwang, M.J.; Chung, W.D.
1997-12-01
This paper presents a mathematical model for aggregating component reliability data from dependent generic compendia. Our model postulates that generic data are sets of estimates for the parameters of the variability distribution, and the estimates are statistics of failure data from several plants. The same plant data may be utilized in some generic literature sources, which causes dependency among the generic estimates. We propose an estimation procedure based on a parametric empirical Bayesian framework. The proposed model accounts for the relative credibility as well as the dependence among generic estimates. Numerical examples are provided to show the characteristics of the model. 16 refs., 2 figs., 2 tabs.
NASA Astrophysics Data System (ADS)
Walker, David M.; Allingham, David; Lee, Heung Wing Joseph; Small, Michael
2010-02-01
Small world network models have been effective in capturing the variable behaviour of reported case data of the SARS coronavirus outbreak in Hong Kong during 2003. Simulations of these models have previously been realized using informed “guesses” of the proposed model parameters and tested for consistency with the reported data by surrogate analysis. In this paper we attempt to provide statistically rigorous parameter distributions using Approximate Bayesian Computation sampling methods. We find that such sampling schemes are a useful framework for fitting parameters of stochastic small world network models where simulation of the system is straightforward but expressing a likelihood is cumbersome.
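The core idea of Approximate Bayesian Computation, accepting parameter draws whose simulated data resemble the observations, can be shown on a deliberately small problem. The sketch below is plain rejection ABC for a success probability with a uniform prior; it is not the small-world network model or the sampling scheme of the paper, and all names and numbers are hypothetical.

```python
import random

def abc_rejection(observed_successes, n_trials, n_draws, tolerance=0, seed=0):
    """Rejection ABC: keep prior draws whose simulated count is within the tolerance."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()                                     # draw from the Uniform(0, 1) prior
        simulated = sum(rng.random() < p for _ in range(n_trials))
        if abs(simulated - observed_successes) <= tolerance:
            accepted.append(p)                               # no likelihood evaluation needed
    return accepted

# Hypothetical data: 14 successes in 20 trials.
samples = abc_rejection(observed_successes=14, n_trials=20, n_draws=20000)
posterior_mean = sum(samples) / len(samples)
```

With zero tolerance and a sufficient summary statistic the accepted draws follow the exact posterior, here Beta(15, 7); for network epidemic models the summaries and the tolerance have to be chosen and the result is only approximate.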
Zhao, Ningning; Basarab, Adrian; Kouame, Denis; Tourneret, Jean-Yves
2016-08-01
This paper proposes a joint segmentation and deconvolution Bayesian method for medical ultrasound (US) images. Contrary to piecewise homogeneous images, US images exhibit heavy characteristic speckle patterns correlated with the tissue structures. The generalized Gaussian distribution (GGD) has been shown to be one of the most relevant distributions for characterizing the speckle in US images. Thus, we propose a GGD-Potts model defined by a label map coupling US image segmentation and deconvolution. The Bayesian estimators of the unknown model parameters, including the US image, the label map, and all the hyperparameters, are difficult to express in closed form. Thus, we investigate a Gibbs sampler to generate samples distributed according to the posterior of interest. These generated samples are finally used to compute the Bayesian estimators of the unknown parameters. The performance of the proposed Bayesian model is compared with the existing approaches via several experiments conducted on realistic synthetic data and in vivo US images. PMID:27187959
Lee, Xing Ju; Drovandi, Christopher C; Pettitt, Anthony N
2015-03-01
Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1-28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens. PMID:25303085
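For model choice, the simplest ABC variant treats the model index as one more unknown: draw a model from its prior, simulate from it, and estimate posterior model probabilities from the share of accepted draws per model. The sketch below illustrates only that baseline. It omits the regression-based summary statistics and the reversible jump step described in the abstract, and the two candidate models are hypothetical.

```python
import random

def abc_model_choice(observed, n_trials, n_draws, seed=0):
    """Rejection ABC over two models with equal prior probability.

    Model 0: success probability fixed at 0.5.
    Model 1: success probability drawn from Uniform(0, 1).
    """
    rng = random.Random(seed)
    accepted = [0, 0]
    for _ in range(n_draws):
        model = rng.randrange(2)                             # draw the model indicator
        p = 0.5 if model == 0 else rng.random()
        simulated = sum(rng.random() < p for _ in range(n_trials))
        if simulated == observed:                            # exact match on the summary
            accepted[model] += 1
    total = sum(accepted)
    return [count / total for count in accepted]

# Hypothetical data: 14 successes in 20 trials.
model_probs = abc_model_choice(observed=14, n_trials=20, n_draws=40000)
```

Exact matching is feasible here because the data reduce to a single count; with richer data the acceptance rule depends on summary statistics, which is where the choice of summaries discussed in the abstract matters.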
Ajami, N K; Duan, Q; Sorooshian, S
2006-05-05
This paper presents a new technique--Integrated Bayesian Uncertainty Estimator (IBUNE) to account for the major uncertainties of hydrologic rainfall-runoff predictions explicitly. The uncertainties from the input (forcing) data--mainly the precipitation observations and from the model parameters are reduced through a Monte Carlo Markov Chain (MCMC) scheme named Shuffled Complex Evolution Metropolis (SCEM) algorithm which has been extended to include a precipitation error model. Afterwards, the Bayesian Model Averaging (BMA) scheme is employed to further improve the prediction skill and uncertainty estimation using multiple model output. A series of case studies using three rainfall-runoff models to predict the streamflow in the Leaf River basin, Mississippi is used to examine the necessity and usefulness of this technique. The results suggest that ignoring either input forcing error or model structural uncertainty will lead to unrealistic model simulations and associated uncertainty bounds that do not consistently capture and represent the real-world behavior of the watershed.
A High Performance Bayesian Computing Framework for Spatiotemporal Uncertainty Modeling
NASA Astrophysics Data System (ADS)
Cao, G.
2015-12-01
All types of spatiotemporal measurements are subject to uncertainty. As spatiotemporal data become increasingly involved in scientific research and decision making, it is important to appropriately model the impact of uncertainty. Quantitatively modeling spatiotemporal uncertainty, however, is a challenging problem considering the complex dependence and data heterogeneities. State-space models provide a unifying and intuitive framework for dynamic systems modeling. In this paper, we aim to extend the conventional state-space models for uncertainty modeling in space-time contexts while accounting for spatiotemporal effects and data heterogeneities. Gaussian Markov Random Field (GMRF) models, also known as conditional autoregressive models, are arguably the most commonly used methods for modeling of spatially dependent data. GMRF models basically assume that a geo-referenced variable primarily depends on its neighborhood (Markov property), and the spatial dependence structure is described via a precision matrix. Recent studies have shown that GMRFs are an efficient approximation to the commonly used Gaussian fields (e.g., Kriging), and compared with Gaussian fields, GMRFs enjoy a series of appealing features, such as fast computation and easily accounting for heterogeneities in spatial data (e.g., point and areal). This paper represents each spatial dataset as a GMRF and integrates them into a state-space form to statistically model the temporal dynamics. Different types of spatial measurements (e.g., categorical, count or continuous) can be accounted for by appropriate link functions. A fast alternative to the MCMC framework, the so-called Integrated Nested Laplace Approximation (INLA), was adopted for model inference. Preliminary case studies will be conducted to showcase the advantages of the described framework. In the first case, we apply the proposed method for modeling the water table elevation of Ogallala aquifer over the past decades. In the second case, we analyze the
Jiang, Yu; Simon, Steve; Mayo, Matthew S; Gajewski, Byron J
2015-02-20
Slow recruitment in clinical trials leads to increased costs and resource utilization, which includes both the clinic staff and patient volunteers. Careful planning and monitoring of the accrual process can prevent the unnecessary loss of these resources. We propose two hierarchical extensions to the existing Bayesian constant accrual model: the accelerated prior and the hedging prior. The newly proposed priors are able to adaptively utilize the researcher's previous experience and current accrual data to estimate trial completion time. The performance of these models, including prediction precision, coverage probability, and correct decision-making ability, is evaluated using actual studies from our cancer center and simulation. The results showed that a constant accrual model with strongly informative priors is very accurate when accrual is on target or slightly off, producing smaller mean squared error, a high percentage of coverage, and a high number of correct decisions as to whether or not to continue the trial, but it is strongly biased when off target. Flat or weakly informative priors provide protection against an off-target prior but are less efficient when the accrual is on target. The accelerated prior performs similarly to a strong prior. The hedging prior performs much like the weak priors when the accrual is extremely off target but closer to the strong priors when the accrual is on target or only slightly off target. We suggest improvements in these models and propose new models for future research. PMID:25376910
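The trade-off between strong and weak priors described above can be illustrated with a small conjugate sketch. It assumes Poisson enrollment with a Gamma prior on the accrual rate, which is a simplification chosen for illustration and not the paper's parameterization; the accelerated and hedging priors are not implemented. The function names and all numbers are hypothetical.

```python
def posterior_rate(prior_n, prior_t, enrolled, elapsed):
    """Gamma(shape, rate) posterior for the accrual rate (patients per month).

    Assumption: the prior is Gamma(prior_n, prior_t), i.e. it is worth
    'prior_n patients enrolled in prior_t months', and enrollment is Poisson.
    """
    return prior_n + enrolled, prior_t + elapsed


def expected_remaining_time(target, prior_n, prior_t, enrolled, elapsed):
    """Posterior mean of the time needed to enroll the remaining patients."""
    shape, rate = posterior_rate(prior_n, prior_t, enrolled, elapsed)
    remaining = target - enrolled
    # Given the rate lam, the wait for `remaining` patients has mean remaining / lam.
    # For lam ~ Gamma(shape, rate) with shape > 1, E[1 / lam] = rate / (shape - 1).
    return remaining * rate / (shape - 1)


# Hypothetical trial: 300 patients planned over 36 months, but only 40 after 12 months.
strong = expected_remaining_time(300, prior_n=150, prior_t=18, enrolled=40, elapsed=12)
weak = expected_remaining_time(300, prior_n=3, prior_t=0.36, enrolled=40, elapsed=12)
print(round(strong, 1), round(weak, 1))
```

In this toy setting the strong prior pulls the forecast toward the planned rate even though observed accrual is far slower, while the weak prior stays close to the data, which mirrors the bias-versus-efficiency behavior the abstract reports.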
Combining Bayesian Networks and Agent Based Modeling to develop a decision-support model in Vietnam
NASA Astrophysics Data System (ADS)
Nong, Bao Anh; Ertsen, Maurits; Schoups, Gerrit
2016-04-01
Complexity and uncertainty in natural resources management have been focus themes in recent years. Within these debates, with the aim of defining an approach feasible for water management practice, we are developing an integrated conceptual modeling framework for simulating decision-making processes of citizens, in our case in the Day river area, Vietnam. The model combines Bayesian Networks (BNs) and Agent-Based Modeling (ABM). BNs are able to combine both qualitative data from consultants / experts / stakeholders, and quantitative data from observations on different phenomena or outcomes from other models. Further strengths of BNs are that the relationship between variables in the system is presented in a graphical interface, and that components of uncertainty are explicitly related to their probabilistic dependencies. A disadvantage is that BNs cannot easily identify the feedback of agents in the system once changes appear. Hence, ABM was adopted to represent the reaction among stakeholders under changes. The modeling framework is developed as an attempt to gain a better understanding of citizens' behavior and the factors influencing their decisions in order to reduce uncertainty in the implementation of water management policy.
Stoffenmanager exposure model: company-specific exposure assessments using a Bayesian methodology.
van de Ven, Peter; Fransman, Wouter; Schinkel, Jody; Rubingh, Carina; Warren, Nicholas; Tielemans, Erik
2010-04-01
The web-based tool "Stoffenmanager" was initially developed to assist small- and medium-sized enterprises in the Netherlands to make qualitative risk assessments and to provide advice on control at the workplace. The tool uses a mechanistic model to arrive at a "Stoffenmanager score" for exposure. In a recent study it was shown that variability in exposure measurements given a certain Stoffenmanager score is still substantial. This article discusses an extension to the tool that uses a Bayesian methodology for quantitative workplace/scenario-specific exposure assessment. This methodology allows for real exposure data observed in the company of interest to be combined with the prior estimate (based on the Stoffenmanager model). The output of the tool is a company-specific assessment of exposure levels for a scenario for which data is available. The Bayesian approach provides a transparent way of synthesizing different types of information and is especially preferred in situations where available data is sparse, as is often the case in small- and medium-sized enterprises. Real-world examples as well as simulation studies were used to assess how different parameters such as sample size, difference between prior and data, uncertainty in prior, and variance in the data affect the eventual posterior distribution of a Bayesian exposure assessment. PMID:20146134
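How sample size, prior uncertainty, and data variance shape the posterior can be seen in a minimal conjugate sketch. It assumes log-transformed exposures that are normally distributed with a known within-scenario standard deviation and a normal prior on the log-scale mean; this is an illustrative simplification, not the model implemented in the Stoffenmanager tool. The function name and the example values are hypothetical.

```python
import math


def update_log_mean(prior_mean, prior_sd, log_measurements, within_sd):
    """Normal-normal update of the mean log exposure.

    Assumptions: measurements are already log-transformed, and the
    within-scenario standard deviation `within_sd` is treated as known.
    Returns the posterior mean and posterior standard deviation.
    """
    n = len(log_measurements)
    if n == 0:
        return prior_mean, prior_sd  # no data: the posterior is the prior
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / within_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(log_measurements) / n
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, math.sqrt(post_var)


# Hypothetical scenario: prior from a model score, three company measurements (mg/m3).
logs = [math.log(x) for x in (0.8, 1.5, 1.1)]
print(update_log_mean(prior_mean=math.log(3.0), prior_sd=1.0,
                      log_measurements=logs, within_sd=0.7))
```

With few measurements the prior dominates; as the sample grows or the prior standard deviation widens, the posterior mean moves toward the sample mean.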
Paddock, Susan M.; Savitsky, Terrance D.
2013-01-01
There are several challenges to testing the effectiveness of group therapy-based interventions in alcohol and other drug use (AOD) treatment settings. Enrollment into AOD therapy groups typically occurs on an open (rolling) basis. Changes in therapy group membership induce a complex correlation structure among client outcomes, with relatively small numbers of clients attending each therapy group session. Primary outcomes are measured post-treatment, so each datum reflects the effect of all sessions attended by a client. The number of post-treatment outcomes assessments is typically very limited. The first feature of our modeling approach relaxes the assumption of independent random effects in the standard multiple membership model by employing conditional autoregression (CAR) to model correlation in random therapy group session effects associated with clients’ attendance of common group therapy sessions. A second feature specifies a longitudinal growth model under which the posterior distribution of client-specific random effects, or growth parameters, is modeled non-parametrically. The Dirichlet process prior helps to overcome limitations of standard parametric growth models given limited numbers of longitudinal assessments. We motivate and illustrate our approach with a data set from a study of group cognitive behavioral therapy to reduce depressive symptoms among residential AOD treatment clients. PMID:24353375
Paddock, Susan M; Savitsky, Terrance D
2013-06-01
There are several challenges to testing the effectiveness of group therapy-based interventions in alcohol and other drug use (AOD) treatment settings. Enrollment into AOD therapy groups typically occurs on an open (rolling) basis. Changes in therapy group membership induce a complex correlation structure among client outcomes, with relatively small numbers of clients attending each therapy group session. Primary outcomes are measured post-treatment, so each datum reflects the effect of all sessions attended by a client. The number of post-treatment outcomes assessments is typically very limited. The first feature of our modeling approach relaxes the assumption of independent random effects in the standard multiple membership model by employing conditional autoregression (CAR) to model correlation in random therapy group session effects associated with clients' attendance of common group therapy sessions. A second feature specifies a longitudinal growth model under which the posterior distribution of client-specific random effects, or growth parameters, is modeled non-parametrically. The Dirichlet process prior helps to overcome limitations of standard parametric growth models given limited numbers of longitudinal assessments. We motivate and illustrate our approach with a data set from a study of group cognitive behavioral therapy to reduce depressive symptoms among residential AOD treatment clients. PMID:24353375
Comparing models for perfluorooctanoic acid pharmacokinetics using Bayesian analysis
Selecting the appropriate pharmacokinetic (PK) model given the available data is investigated for perfluorooctanoic acid (PFOA), which has been widely analyzed with an empirical, one-compartment model. This research examined the results of experiments [Kemper R. A., DuPont Haskel...
A Bayesian Semiparametric Item Response Model with Dirichlet Process Priors
ERIC Educational Resources Information Center
Miyazaki, Kei; Hoshino, Takahiro
2009-01-01
In Item Response Theory (IRT), item characteristic curves (ICCs) are illustrated through logistic models or normal ogive models, and the probability that examinees give the correct answer is usually a monotonically increasing function of their ability parameters. However, since only limited patterns of shapes can be obtained from logistic models…
NASA Astrophysics Data System (ADS)
Han, Feng; Zheng, Yi
2016-02-01
While watershed water quality (WWQ) models have been widely used to support water quality management, their profound modeling uncertainty remains an unaddressed issue. Data assimilation via Bayesian calibration is a promising solution to the uncertainty, but has been rarely practiced for WWQ modeling. This study applied multiple-response Bayesian calibration (MRBC) to SWAT, a classic WWQ model, using the nitrate pollution in the Newport Bay Watershed (southern California, USA) as the study case. How typical input and model structure errors would impact modeling uncertainty, parameter identification and management decision-making was systematically investigated through both synthetic and real-situation modeling cases. The main study findings include: (1) with an efficient sampling scheme, MRBC is applicable to WWQ modeling in characterizing its parametric and predictive uncertainties; (2) incorporating hydrology responses, which are less susceptible to input and model structure errors than water quality responses, can improve the Bayesian calibration results and benefit potential modeling-based management decisions; and (3) the value of MRBC to modeling-based decision-making essentially depends on pollution severity, management objective and decision maker's risk tolerance.
A Bayesian state-space formulation of dynamic occupancy models.
Royle, J Andrew; Kéry, Marc
2007-07-01
Species occurrence and its dynamic components, extinction and colonization probabilities, are focal quantities in biogeography and metapopulation biology, and for species conservation assessments. It has been increasingly appreciated that these parameters must be estimated separately from detection probability to avoid the biases induced by non-detection error. Hence, there is now considerable theoretical and practical interest in dynamic occupancy models that contain explicit representations of metapopulation dynamics such as extinction, colonization, and turnover as well as growth rates. We describe a hierarchical parameterization of these models that is analogous to the state-space formulation of models in time series, where the model is represented by two components, one for the partially observable occupancy process and another for the observations conditional on that process. This parameterization naturally allows estimation of all parameters of the conventional approach to occupancy models, but in addition, yields great flexibility and extensibility, e.g., to modeling heterogeneity or latent structure in model parameters. We also highlight the important distinction between population and finite sample inference; the latter yields much more precise estimates for the particular sample at hand. Finite sample estimates can easily be obtained using the state-space representation of the model but are difficult to obtain under the conventional approach of likelihood-based estimation. We use R and WinBUGS to apply the model to two examples. In a standard analysis for the European Crossbill in a large Swiss monitoring program, we fit a model with year-specific parameters. Estimates of the dynamic parameters varied greatly among years, highlighting the irruptive population dynamics of that species. In the second example, we analyze route occupancy of Cerulean Warblers in the North American Breeding Bird Survey (BBS) using a model allowing for site
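The two-component structure described above (a latent occupancy process plus observations conditional on it) can be sketched as a forward simulation. This is only a data-generating sketch in Python with hypothetical parameter names and values; it does not reproduce the authors' R and WinBUGS analyses and performs no estimation.

```python
import random


def simulate_occupancy(n_sites, n_years, n_visits, psi1,
                       colonization, extinction, detection, seed=0):
    """Simulate latent occupancy z[i][t] and detections y[i][t][j].

    State process:       z[i][0] ~ Bernoulli(psi1)
                         z[i][t] ~ Bernoulli(1 - extinction) if occupied at t-1,
                                   Bernoulli(colonization)   otherwise
    Observation process: y[i][t][j] ~ Bernoulli(z[i][t] * detection)
    """
    rng = random.Random(seed)
    z = [[0] * n_years for _ in range(n_sites)]
    y = [[[0] * n_visits for _ in range(n_years)] for _ in range(n_sites)]
    for i in range(n_sites):
        for t in range(n_years):
            if t == 0:
                p_occupied = psi1
            elif z[i][t - 1] == 1:
                p_occupied = 1.0 - extinction
            else:
                p_occupied = colonization
            z[i][t] = 1 if rng.random() < p_occupied else 0
            for j in range(n_visits):
                # A species can only be detected at a site it occupies.
                detected = z[i][t] == 1 and rng.random() < detection
                y[i][t][j] = 1 if detected else 0
    return z, y


z, y = simulate_occupancy(n_sites=50, n_years=5, n_visits=3, psi1=0.6,
                          colonization=0.2, extinction=0.3, detection=0.5)
# Finite-sample occupancy: the share of the simulated sites occupied each year.
print([sum(z[i][t] for i in range(50)) / 50 for t in range(5)])
```

The last line computes the realized proportion of occupied sites in the sample, which is the kind of finite-sample quantity the abstract contrasts with population-level parameters.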
A Bayesian state-space formulation of dynamic occupancy models
Royle, J. Andrew; Kery, M.
2007-01-01
Species occurrence and its dynamic components, extinction and colonization probabilities, are focal quantities in biogeography and metapopulation biology, and for species conservation assessments. It has been increasingly appreciated that these parameters must be estimated separately from detection probability to avoid the biases induced by nondetection error. Hence, there is now considerable theoretical and practical interest in dynamic occupancy models that contain explicit representations of metapopulation dynamics such as extinction, colonization, and turnover as well as growth rates. We describe a hierarchical parameterization of these models that is analogous to the state-space formulation of models in time series, where the model is represented by two components, one for the partially observable occupancy process and another for the observations conditional on that process. This parameterization naturally allows estimation of all parameters of the conventional approach to occupancy models, but in addition, yields great flexibility and extensibility, e.g., to modeling heterogeneity or latent structure in model parameters. We also highlight the important distinction between population and finite sample inference; the latter yields much more precise estimates for the particular sample at hand. Finite sample estimates can easily be obtained using the state-space representation of the model but are difficult to obtain under the conventional approach of likelihood-based estimation. We use R and WinBUGS to apply the model to two examples. In a standard analysis for the European Crossbill in a large Swiss monitoring program, we fit a model with year-specific parameters. Estimates of the dynamic parameters varied greatly among years, highlighting the irruptive population dynamics of that species. In the second example, we analyze route occupancy of Cerulean Warblers in the North American Breeding Bird Survey (BBS) using a model allowing for site
Bayesian spatial transformation models with applications in neuroimaging data
Miranda, Michelle F.; Zhu, Hongtu; Ibrahim, Joseph G.
2013-01-01
The aim of this paper is to develop a class of spatial transformation models (STM) to spatially model the varying association between imaging measures in a three-dimensional (3D) volume (or 2D surface) and a set of covariates. Our STMs include a varying Box-Cox transformation model for dealing with the issue of non-Gaussian distributed imaging data and a Gaussian Markov Random Field model for incorporating spatial smoothness of the imaging data. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations and real data analysis demonstrate that the STM significantly outperforms the voxel-wise linear model with Gaussian noise in recovering meaningful geometric patterns. Our STM is able to reveal important brain regions with morphological changes in children with attention deficit hyperactivity disorder. PMID:24128143
NASA Astrophysics Data System (ADS)
Zeng, Xiankui; Wu, Jichun; Wang, Dong; Zhu, Xiaobin; Long, Yuqiao
2016-07-01
Because of groundwater conceptualization uncertainty, multi-model methods are usually used and the corresponding uncertainties are estimated by integrating Markov Chain Monte Carlo (MCMC) and Bayesian model averaging (BMA) methods. Generally, the variance method is used to measure the uncertainties of BMA prediction. The total variance of ensemble prediction is decomposed into within-model and between-model variances, which represent the uncertainties derived from parameter and conceptual model, respectively. However, the uncertainty of a probability distribution cannot be comprehensively quantified by variance alone. A new measuring method based on information entropy theory is proposed in this study. Because an actual BMA process can hardly meet the ideal mutually exclusive and collectively exhaustive condition, BMA predictive uncertainty can be decomposed into parameter, conceptual model, and overlapped uncertainties. Overlapped uncertainty is induced by the combination of predictions from correlated model structures. In this paper, five simple analytical functions are first used to illustrate the feasibility of the variance and information entropy methods. A discrete distribution example shows that information entropy could be more appropriate to describe between-model uncertainty than variance. Two continuous distribution examples show that the two methods are consistent in measuring normal distribution, and information entropy is more appropriate to describe bimodal distribution than variance. The two examples of BMA uncertainty decomposition demonstrate that the two methods are relatively consistent in assessing the uncertainty of unimodal BMA prediction. Information entropy is more informative in describing the uncertainty decomposition of bimodal BMA prediction. Then, based on a synthetic groundwater model, the variance and information entropy methods are used to assess the BMA uncertainty of groundwater modeling. The uncertainty assessments of
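The variance decomposition mentioned in this abstract follows the law of total variance and can be sketched in a few lines. The entropy function below is only the Shannon entropy of a discrete distribution, included to show the quantity involved; it is not the paper's full entropy-based decomposition into parameter, conceptual model, and overlapped terms. Function names and the example numbers are hypothetical.

```python
import math


def bma_variance_decomposition(weights, means, variances):
    """Split the BMA predictive variance into within- and between-model parts.

    weights:   posterior model probabilities (assumed to sum to 1)
    means:     predictive mean of each model
    variances: predictive variance of each model
    Returns (bma_mean, within_model_variance, between_model_variance).
    """
    bma_mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - bma_mean) ** 2 for w, m in zip(weights, means))
    return bma_mean, within, between


def shannon_entropy(probabilities):
    """Shannon entropy in nats of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical ensemble of three conceptual models predicting a hydraulic head (m).
mean, within, between = bma_variance_decomposition(
    weights=[0.5, 0.3, 0.2], means=[10.0, 12.0, 9.0], variances=[1.0, 0.5, 2.0])
print(mean, within, between, within + between)
print(shannon_entropy([0.5, 0.3, 0.2]))
```

The total predictive variance is the sum of the two returned variance terms; a single variance number of this kind cannot distinguish a unimodal from a bimodal mixture, which is the limitation the abstract raises.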
Bayesian sensitivity analysis of a nonlinear finite element model
NASA Astrophysics Data System (ADS)
Becker, W.; Oakley, J. E.; Surace, C.; Gili, P.; Rowson, J.; Worden, K.
2012-10-01
A major problem in uncertainty and sensitivity analysis is that the computational cost of propagating probabilistic uncertainty through large nonlinear models can be prohibitive when using conventional methods (such as Monte Carlo methods). A powerful solution to this problem is to use an emulator, which is a mathematical representation of the model built from a small set of model runs at specified points in input space. Such emulators are massively cheaper to run and can be used to mimic the "true" model, with the result that uncertainty analysis and sensitivity analysis can be performed for a greatly reduced computational cost. The work here investigates the use of an emulator known as a Gaussian process (GP), which is an advanced probabilistic form of regression. The GP is particularly suited to uncertainty analysis since it is able to emulate a wide class of models, and accounts for its own emulation uncertainty. Additionally, uncertainty and sensitivity measures can be estimated analytically, given certain assumptions. The GP approach is explained in detail here, and a case study of a finite element model of an airship is used to demonstrate the method. It is concluded that the GP is a very attractive way of performing uncertainty and sensitivity analysis on large models, provided that the dimensionality is not too high.
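The emulator idea can be shown with a deliberately small one-dimensional sketch. It assumes a zero-mean Gaussian process with a squared-exponential kernel and fixed hyperparameters, and it uses `math.sin` as a stand-in for an expensive simulator; none of these choices come from the paper, whose airship finite element model and sensitivity measures are not reproduced here. All function names are hypothetical.

```python
import math


def kernel(a, b, length=1.0, var=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, length=1.0, var=1.0, jitter=1e-8):
    """Posterior mean and variance of the emulator at a single new input."""
    n = len(x_train)
    K = [[kernel(x_train[i], x_train[j], length, var) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [kernel(x_new, xi, length, var) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, solve(K, y_train)))
    variance = var - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(variance, 0.0)  # clip tiny negative round-off


# Five "model runs" of the stand-in simulator are all the emulator sees.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]
print(gp_predict(x_train, y_train, 2.5))   # between design points
print(gp_predict(x_train, y_train, 10.0))  # far from the design: high variance
```

The predictive variance is what lets the emulator account for its own uncertainty: it shrinks to nearly zero at the design points and returns to the prior variance far from them.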
Bayesian structural equation modeling: a more flexible representation of substantive theory.
Muthén, Bengt; Asparouhov, Tihomir
2012-09-01
This article proposes a new approach to factor analysis and structural equation modeling using Bayesian analysis. The new approach replaces parameter specifications of exact zeros with approximate zeros based on informative, small-variance priors. It is argued that this produces an analysis that better reflects substantive theories. The proposed Bayesian approach is particularly beneficial in applications where parameters are added to a conventional model such that a nonidentified model is obtained if maximum-likelihood estimation is applied. This approach is useful for measurement aspects of latent variable modeling, such as with confirmatory factor analysis, and the measurement part of structural equation modeling. Two application areas are studied, cross-loadings and residual correlations in confirmatory factor analysis. An example using a full structural equation model is also presented, showing an efficient way to find model misspecification. The approach encompasses 3 elements: model testing using posterior predictive checking, model estimation, and model modification. Monte Carlo simulations and real data are analyzed using Mplus. The real-data analyses use data from Holzinger and Swineford's (1939) classic mental abilities study, Big Five personality factor data from a British survey, and science achievement data from the National Educational Longitudinal Study of 1988. PMID:22962886
Application of Bayesian model averaging to measurements of the primordial power spectrum
Parkinson, David; Liddle, Andrew R.
2010-11-15
Cosmological parameter uncertainties are often stated assuming a particular model, neglecting the model uncertainty, even when Bayesian model selection is unable to identify a conclusive best model. Bayesian model averaging is a method for assessing parameter uncertainties in situations where there is also uncertainty in the underlying model. We apply model averaging to the estimation of the parameters associated with the primordial power spectra of curvature and tensor perturbations. We use CosmoNest and MultiNest to compute the model evidences and posteriors, using cosmic microwave data from WMAP, ACBAR, BOOMERanG, and CBI, plus large-scale structure data from the SDSS DR7. We find that the model-averaged 95% credible interval for the spectral index using all of the data is 0.940
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with growing number of data points. Hamiltonian Monte Carlo algorithms allow us to translate this inference problem to the problem of simulating the dynamics of a statistical mechanics system and give us access to most sophisticated methods
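The toy model in this abstract, a linear reservoir driven by noise whose standard deviation scales with the stored volume, can be simulated with a basic Euler-Maruyama scheme. The sketch assumes the form dV = (r - V/k) dt + sigma V dW with constant inflow r and residence time k; this parameterization, the clipping at zero, and all names and values are assumptions for illustration. It covers forward simulation only, not the Hamiltonian Monte Carlo inference the authors describe.

```python
import math
import random


def simulate_reservoir(v0, inflow, k, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dV = (inflow - V / k) dt + sigma * V dW.

    v0: initial volume, inflow: constant input rate, k: residence time,
    sigma: noise intensity (the noise scales linearly with the state V).
    """
    rng = random.Random(seed)
    v = v0
    path = [v]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        v = v + (inflow - v / k) * dt + sigma * v * dw
        v = max(v, 0.0)  # numerical guard: a volume cannot be negative
        path.append(v)
    return path


# With constant inflow the noise-free reservoir settles at inflow * k;
# state-dependent noise produces occasional large excursions around it.
quiet = simulate_reservoir(v0=0.0, inflow=2.0, k=5.0, sigma=0.0, dt=0.01, n_steps=10000)
noisy = simulate_reservoir(v0=10.0, inflow=2.0, k=5.0, sigma=0.3, dt=0.01, n_steps=10000)
print(quiet[-1], max(noisy), min(noisy))
```

Runoff in a linear reservoir is V / k, so the simulated volumes translate directly into an outflow series.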
Lawson, Daniel J; Holtrop, Grietje; Flint, Harry
2011-07-01
Process models specified by non-linear dynamic differential equations contain many parameters, which often must be inferred from a limited amount of data. We discuss a hierarchical Bayesian approach combining data from multiple related experiments in a meaningful way, which permits more powerful inference than treating each experiment as independent. The approach is illustrated with a simulation study and example data from experiments replicating the aspects of the human gut microbial ecosystem. A predictive model is obtained that contains prediction uncertainty caused by uncertainty in the parameters, and we extend the model to capture situations of interest that cannot easily be studied experimentally. PMID:21681780
A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.