Bayesian nonparametric centered random effects models with variable selection.
Yang, Mingan
2013-03-01
In a linear mixed effects model, it is common practice to assume that the random effects follow a parametric distribution such as a normal distribution with mean zero. However, in the case of variable selection, substantial violation of the normality assumption can potentially impact the subset selection and result in poor interpretation and even incorrect results. In nonparametric random effects models, the random effects generally have a nonzero mean, which causes an identifiability problem for the fixed effects that are paired with the random effects. In this article, we focus on a Bayesian method for variable selection. We characterize the subject-specific random effects nonparametrically with a Dirichlet process and resolve the bias simultaneously. In particular, we propose flexible modeling of the conditional distribution of the random effects with changes across the predictor space. The approach is implemented using a stochastic search Gibbs sampler to identify subsets of fixed effects and random effects to be included in the model. Simulations are provided to evaluate and compare the performance of our approach to the existing ones. We then apply the new approach to a real data example, cross-country and interlaboratory rodent uterotrophic bioassay.
A Robust Bayesian Random Effects Model for Nonlinear Calibration Problems
Fong, Y.; Wakefield, J.; De Rosa, S.; Frahm, N.
2013-01-01
Summary In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this paper, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a re-parameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods. PMID:22551415
Thomas, D.L.; Johnson, D.; Griffith, B.
2006-01-01
Modeling the probability of use of land units characterized by discrete and continuous measures, we present a Bayesian random-effects model to assess resource selection. This model provides simultaneous estimation of both individual- and population-level selection. Deviance information criterion (DIC), a Bayesian alternative to AIC that is sample-size specific, is used for model selection. Aerial radiolocation data from 76 adult female caribou (Rangifer tarandus) and calf pairs during 1 year on an Arctic coastal plain calving ground were used to illustrate models and assess population-level selection of landscape attributes, as well as individual heterogeneity of selection. Landscape attributes included elevation, NDVI (a measure of forage greenness), and land cover-type classification. Results from the first of a 2-stage model-selection procedure indicated that there is substantial heterogeneity among cow-calf pairs with respect to selection of the landscape attributes. In the second stage, selection of models with heterogeneity included indicated that at the population-level, NDVI and land cover class were significant attributes for selection of different landscapes by pairs on the calving ground. Population-level selection coefficients indicate that the pairs generally select landscapes with higher levels of NDVI, but the relationship is quadratic. The highest rate of selection occurs at values of NDVI less than the maximum observed. Results for land cover-class selections coefficients indicate that wet sedge, moist sedge, herbaceous tussock tundra, and shrub tussock tundra are selected at approximately the same rate, while alpine and sparsely vegetated landscapes are selected at a lower rate. Furthermore, the variability in selection by individual caribou for moist sedge and sparsely vegetated landscapes is large relative to the variability in selection of other land cover types. The example analysis illustrates that, while sometimes computationally intense, a
A Bayesian nonlinear random effects model for identification of defective batteries from lot samples
NASA Astrophysics Data System (ADS)
Cripps, Edward; Pecht, Michael
2017-02-01
Numerous materials and processes go into the manufacture of lithium-ion batteries, resulting in variations across batteries' capacity fade measurements. Accounting for this variability is essential when determining whether batteries are performing satisfactorily. Motivated by a real manufacturing problem, this article presents an approach to assess whether lithium-ion batteries from a production lot are not representative of a healthy population of batteries from earlier production lots, and to determine, based on capacity fade data, the earliest stage (in terms of cycles) that battery anomalies can be identified. The approach involves the use of a double exponential function to describe nonlinear capacity fade data. To capture the variability of repeated measurements on a number of individual batteries, the double exponential function is then embedded as the individual batteries' trajectories in a Bayesian random effects model. The model allows for probabilistic predictions of capacity fading not only at the underlying mean process level but also at the individual battery level. The results show good predictive coverage for individual batteries and demonstrate that, for our data, non-healthy lithium-ion batteries can be identified in as few as 50 cycles.
Gracia, Enrique; López-Quílez, Antonio; Marco, Miriam; Lladosa, Silvia; Lila, Marisol
2014-01-01
This paper uses spatial data of cases of intimate partner violence against women (IPVAW) to examine neighborhood-level influences on small-area variations in IPVAW risk in a police district of the city of Valencia (Spain). To analyze area variations in IPVAW risk and its association with neighborhood-level explanatory variables we use a Bayesian spatial random-effects modeling approach, as well as disease mapping methods to represent risk probabilities in each area. Analyses show that IPVAW cases are more likely in areas of high immigrant concentration, high public disorder and crime, and high physical disorder. Results also show a spatial component indicating remaining variability attributable to spatially structured random effects. Bayesian spatial modeling offers a new perspective to identify IPVAW high and low risk areas, and provides a new avenue for the design of better-informed prevention and intervention strategies. PMID:24413701
Chan, Jennifer S K
2016-05-01
Dropouts are common in longitudinal study. If the dropout probability depends on the missing observations at or after dropout, this type of dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such dropout mechanism into the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes as well as the drop-out indicator in each occasion are logit linear in some covariates and outcomes. This model adopting a marginal model for outcomes and a conditional model for dropouts is called a selection model. To allow for the heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user-friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Result shows that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally the effect of drop-out on parameter estimates is evaluated through simulation studies.
Application of Poisson random effect models for highway network screening.
Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer
2014-02-01
In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data became popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. Potential for Safety Improvement (PSI) were adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests were conducted to evaluate the performance of different approaches. These tests include the previously developed site consistence test, method consistence test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model that with only temporal random effect, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification.
Ryu, Duchwan; Li, Erning; Mallick, Bani K
2011-06-01
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves.
ERIC Educational Resources Information Center
Huang, Hung-Yu; Wang, Wen-Chung
2014-01-01
The DINA (deterministic input, noisy, and gate) model has been widely used in cognitive diagnosis tests and in the process of test development. The outcomes known as slip and guess are included in the DINA model function representing the responses to the items. This study aimed to extend the DINA model by using the random-effect approach to allow…
Modelling the random effects covariance matrix in longitudinal data.
Daniels, Michael J; Zhao, Yan D
2003-05-30
A common class of models for longitudinal data are random effects (mixed) models. In these models, the random effects covariance matrix is typically assumed constant across subject. However, in many situations this matrix may differ by measured covariates. In this paper, we propose an approach to model the random effects covariance matrix by using a special Cholesky decomposition of the matrix. In particular, we will allow the parameters that result from this decomposition to depend on subject-specific covariates and also explore ways to parsimoniously model these parameters. An advantage of this parameterization is that there is no concern about the positive definiteness of the resulting estimator of the covariance matrix. In addition, the parameters resulting from this decomposition have a sensible interpretation. We propose fully Bayesian modelling for which a simple Gibbs sampler can be implemented to sample from the posterior distribution of the parameters. We illustrate these models on data from depression studies and examine the impact of heterogeneity in the covariance matrix on estimation of both fixed and random effects.
Random-effects models for longitudinal data
Laird, N.M.; Ware, J.H.
1982-12-01
Models for the analysis of longitudinal data must recognize the relationship between serial observations on the same unit. Multivariate models with general covariance structure are often difficult to apply to highly unbalanced data, whereas two-stage random-effects models can be used easily. In two-stage models, the probability distributions for the response vectors of different individuals belong to a single family, but some random-effects parameters vary across individuals, with a distribution specified at the second stage. A general family of models is discussed, which includes both growth models and repeated-measures models as special cases. A unified approach to fitting these models, based on a combination of empirical Bayes and maximum likelihood estimation of model parameters and using the EM algorithm, is discussed. Two examples are taken from a current epidemiological study of the health effects of air pollution.
De la Cruz, Rolando; Meza, Cristian; Arribas-Gil, Ana; Carroll, Raymond J.
2016-01-01
Joint models for a wide class of response variables and longitudinal measurements consist on a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals and may help understanding, for instance, the mechanisms of recovery of a certain disease or the efficacy of a given therapy. When a nonlinear mixed-effects model is used to fit the longitudinal trajectories, the existing estimation strategies based on likelihood approximations have been shown to exhibit some computational efficiency problems (De la Cruz et al., 2011). In this article we consider a Bayesian estimation procedure for the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. The proposed prior structure allows for the implementation of an MCMC sampler. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification. PMID:27274601
ERIC Educational Resources Information Center
Beretvas, S. Natasha; Murphy, Daniel L.
2013-01-01
The authors assessed correct model identification rates of Akaike's information criterion (AIC), corrected criterion (AICC), consistent AIC (CAIC), Hannon and Quinn's information criterion (HQIC), and Bayesian information criterion (BIC) for selecting among cross-classified random effects models. Performance of default values for the 5…
A random effects epidemic-type aftershock sequence model.
Lin, Feng-Chang
2011-04-01
We consider an extension of the temporal epidemic-type aftershock sequence (ETAS) model with random effects as a special case of a well-known doubly stochastic self-exciting point process. The new model arises from a deterministic function that is randomly scaled by a nonnegative random variable, which is unobservable but assumed to follow either positive stable or one-parameter gamma distribution with unit mean. Both random effects models are of interest although the one-parameter gamma random effects model is more popular when modeling associated survival times. Our estimation is based on the maximum likelihood approach with marginalized intensity. The methods are shown to perform well in simulation experiments. When applied to an earthquake sequence on the east coast of Taiwan, the extended model with positive stable random effects provides a better model fit, compared to the original ETAS model and the extended model with one-parameter gamma random effects.
Model Diagnostics for Bayesian Networks
ERIC Educational Resources Information Center
Sinharay, Sandip
2006-01-01
Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of works on assessing fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A…
A Gompertzian model with random effects to cervical cancer growth
Mazlan, Mazma Syahidatul Ayuni; Rosli, Norhayati
2015-05-15
In this paper, a Gompertzian model with random effects is introduced to describe the cervical cancer growth. The parameters values of the mathematical model are estimated via maximum likehood estimation. We apply 4-stage Runge-Kutta (SRK4) for solving the stochastic model numerically. The efficiency of mathematical model is measured by comparing the simulated result and the clinical data of the cervical cancer growth. Low values of root mean-square error (RMSE) of Gompertzian model with random effect indicate good fits.
A Gompertzian model with random effects to cervical cancer growth
NASA Astrophysics Data System (ADS)
Mazlan, Mazma Syahidatul Ayuni; Rosli, Norhayati
2015-05-01
In this paper, a Gompertzian model with random effects is introduced to describe the cervical cancer growth. The parameters values of the mathematical model are estimated via maximum likehood estimation. We apply 4-stage Runge-Kutta (SRK4) for solving the stochastic model numerically. The efficiency of mathematical model is measured by comparing the simulated result and the clinical data of the cervical cancer growth. Low values of root mean-square error (RMSE) of Gompertzian model with random effect indicate good fits.
Cure fraction model with random effects for regional variation in cancer survival.
Seppä, Karri; Hakulinen, Timo; Kim, Hyon-Jung; Läärä, Esa
2010-11-30
Assessing regional differences in the survival of cancer patients is important but difficult when separate regions are small or sparsely populated. In this paper, we apply a mixture cure fraction model with random effects to cause-specific survival data of female breast cancer patients collected by the population-based Finnish Cancer Registry. Two sets of random effects were used to capture the regional variation in the cure fraction and in the survival of the non-cured patients, respectively. This hierarchical model was implemented in a Bayesian framework using a Metropolis-within-Gibbs algorithm. To avoid poor mixing of the Markov chain, when the variance of either set of random effects was close to zero, posterior simulations were based on a parameter-expanded model with tailor-made proposal distributions in Metropolis steps. The random effects allowed the fitting of the cure fraction model to the sparse regional data and the estimation of the regional variation in 10-year cause-specific breast cancer survival with a parsimonious number of parameters. Before 1986, the capital of Finland clearly stood out from the rest, but since then all the 21 hospital districts have achieved approximately the same level of survival.
Performance of Random Effects Model Estimators under Complex Sampling Designs
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…
The Random-Effect Generalized Rating Scale Model
ERIC Educational Resources Information Center
Wang, Wen-Chung; Wu, Shiu-Lien
2011-01-01
Rating scale items have been widely used in educational and psychological tests. These items require people to make subjective judgments, and these subjective judgments usually involve randomness. To account for this randomness, Wang, Wilson, and Shih proposed the random-effect rating scale model in which the threshold parameters are treated as…
Bayesian Model Selection for Group Studies
Stephan, Klaas Enno; Penny, Will D.; Daunizeau, Jean; Moran, Rosalyn J.; Friston, Karl J.
2009-01-01
Bayesian model selection (BMS) is a powerful method for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. BMS has recently found widespread application in neuroimaging, particularly in the context of dynamic causal modelling (DCM). However, so far, combining BMS results from several subjects has relied on simple (fixed effects) metrics, e.g. the group Bayes factor (GBF), that do not account for group heterogeneity or outliers. In this paper, we compare the GBF with two random effects methods for BMS at the between-subject or group level. These methods provide inference on model-space using a classical and Bayesian perspective respectively. First, a classical (frequentist) approach uses the log model evidence as a subject-specific summary statistic. This enables one to use analysis of variance to test for differences in log-evidences over models, relative to inter-subject differences. We then consider the same problem in Bayesian terms and describe a novel hierarchical model, which is optimised to furnish a probability density on the models themselves. This new variational Bayes method rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution which describes the probabilities for all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject as well as the exceedance probability of one model being more likely than any other model. Using empirical and synthetic data, we show that optimising a conditional density of the model probabilities, given the log-evidences for each model over subjects, is more informative and appropriate than both the GBF and frequentist tests of the log-evidences. In particular, we found that the hierarchical Bayesian approach is considerably more robust than either of the other
Cheung, Mike W-L; Cheung, Shu Fai
2016-06-01
Meta-analytic structural equation modeling (MASEM) combines the techniques of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Both fixed-effects and random-effects models can be defined in MASEM. Random-effects models are well known in conventional meta-analysis but are less studied in MASEM. The primary objective of this paper was to address issues related to random-effects models in MASEM. Specifically, we compared two different random-effects models in MASEM-correlation-based MASEM and parameter-based MASEM-and explored their strengths and limitations. Two examples were used to illustrate the similarities and differences between these models. We offered some practical guidelines for choosing between these two models. Future directions for research on random-effects models in MASEM were also discussed. Copyright © 2016 John Wiley & Sons, Ltd.
Random-effects models for serial observations with binary response
Stiratelli, R.; Laird, N.; Ware, J.H.
1984-12-01
This paper presents a general mixed model for the analysis of serial dichotomous responses provided by a panel of study participants. Each subject's serial responses are assumed to arise from a logistic model, but with regression coefficients that vary between subjects. The logistic regression parameters are assumed to be normally distributed in the population. Inference is based upon maximum likelihood estimation of fixed effects and variance components, and empirical Bayes estimation of random effects. Exact solutions are analytically and computationally infeasible, but an approximation based on the mode of the posterior distribution of the random parameters is proposed, and is implemented by means of the EM algorithm. This approximate method is compared with a simpler two-step method proposed by Korn and Whittemore, using data from a panel study of asthmatics originally described in that paper. One advantage of the estimation strategy described here is the ability to use all of the data, including that from subjects with insufficient data to permit fitting of a separate logistic regression model, as required by the Korn and Whittemore method. However, the new method is computationally intensive.
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies.
Li, Gang; Elashoff, Robert M.; Pan, Jianxin
2011-01-01
This article studies a general joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the competing risks survival data, and a regression sub-model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. The model provides a useful approach to adjust for non-ignorable missing data due to dropout for the longitudinal outcome, enables analysis of the survival outcome with informative censoring and intermittently measured time-dependent covariates, as well as joint analysis of the longitudinal and survival outcomes. Unlike previously studied joint models, our model allows for heterogeneous random covariance matrices. It also offers a framework to assess the homogeneous covariance assumption of existing joint models. A Bayesian MCMC procedure is developed for parameter estimation and inference. Its performances and frequentist properties are investigated using simulations. A real data example is used to illustrate the usefulness of the approach. PMID:20549344
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Mixture Random-Effect IRT Models for Controlling Extreme Response Style on Rating Scales
Huang, Hung-Yu
2016-01-01
Respondents are often requested to provide a response to Likert-type or rating-scale items during the assessment of attitude, interest, and personality to measure a variety of latent traits. Extreme response style (ERS), which is defined as a consistent and systematic tendency of a person to locate on a limited number of available rating-scale options, may distort the test validity. Several latent trait models have been proposed to address ERS, but all these models have limitations. Mixture random-effect item response theory (IRT) models for ERS are developed in this study to simultaneously identify the mixtures of latent classes from different ERS levels and detect the possible differential functioning items that result from different latent mixtures. The model parameters can be recovered fairly well in a series of simulations that use Bayesian estimation with the WinBUGS program. In addition, the model parameters in the developed models can be used to identify items that are likely to elicit ERS. The results show that a long test and large sample can improve the parameter estimation process; the precision of the parameter estimates increases with the number of response options, and the model parameter estimation outperforms the person parameter estimation. Ignoring the mixtures and ERS results in substantial rank-order changes in the target latent trait and a reduced classification accuracy of the response styles. An empirical survey of emotional intelligence in college students is presented to demonstrate the applications and implications of the new models. PMID:27853444
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
Modeling multivariate survival data by a semiparametric random effects proportional odds model.
Lam, K F; Lee, Y W; Leung, T L
2002-06-01
In this article, the focus is on the analysis of multivariate survival time data with various types of dependence structures. Examples of multivariate survival data include clustered data and repeated measurements from the same subject, such as the interrecurrence times of cancer tumors. A random effect semiparametric proportional odds model is proposed as an alternative to the proportional hazards model. The distribution of the random effects is assumed to be multivariate normal and the random effect is assumed to act additively to the baseline log-odds function. This class of models, which includes the usual shared random effects model, the additive variance components model, and the dynamic random effects model as special cases, is highly flexible and is capable of modeling a wide range of multivariate survival data. A unified estimation procedure is proposed to estimate the regression and dependence parameters simultaneously by means of a marginal-likelihood approach. Unlike the fully parametric case, the regression parameter estimate is not sensitive to the choice of correlation structure of the random effects. The marginal likelihood is approximated by the Monte Carlo method. Simulation studies are carried out to investigate the performance of the proposed method. The proposed method is applied to two well-known data sets, including clustered data and recurrent event times data.
Drikvandi, Reza
2017-02-13
Nonlinear mixed-effects models are frequently used for pharmacokinetic data analysis, and they account for inter-subject variability in pharmacokinetic parameters by incorporating subject-specific random effects into the model. The random effects are often assumed to follow a (multivariate) normal distribution. However, many articles have shown that misspecifying the random-effects distribution can introduce bias in the estimates of parameters and affect inferences about the random effects themselves, such as estimation of the inter-subject variability. Because random effects are unobservable latent variables, it is difficult to assess their distribution. In a recent paper we developed a diagnostic tool based on the so-called gradient function to assess the random-effects distribution in mixed models. There we evaluated the gradient function for generalized liner mixed models and in the presence of a single random effect. However, assessing the random-effects distribution in nonlinear mixed-effects models is more challenging, especially when multiple random effects are present, and therefore the results from linear and generalized linear mixed models may not be valid for such nonlinear models. In this paper, we further investigate the gradient function and evaluate its performance for such nonlinear mixed-effects models which are common in pharmacokinetics and pharmacodynamics. We use simulations as well as real data from an intensive pharmacokinetic study to illustrate the proposed diagnostic tool.
Bayesian Model Averaging for Propensity Score Analysis.
Kaplan, David; Chen, Jianshen
2014-01-01
This article considers Bayesian model averaging as a means of addressing uncertainty in the selection of variables in the propensity score equation. We investigate an approximate Bayesian model averaging approach based on the model-averaged propensity score estimates produced by the R package BMA but that ignores uncertainty in the propensity score. We also provide a fully Bayesian model averaging approach via Markov chain Monte Carlo sampling (MCMC) to account for uncertainty in both parameters and models. A detailed study of our approach examines the differences in the causal estimate when incorporating noninformative versus informative priors in the model averaging stage. We examine these approaches under common methods of propensity score implementation. In addition, we evaluate the impact of changing the size of Occam's window used to narrow down the range of possible models. We also assess the predictive performance of both Bayesian model averaging propensity score approaches and compare it with the case without Bayesian model averaging. Overall, results show that both Bayesian model averaging propensity score approaches recover the treatment effect estimates well and generally provide larger uncertainty estimates, as expected. Both Bayesian model averaging approaches offer slightly better prediction of the propensity score compared with the Bayesian approach with a single propensity score equation. Covariate balance checks for the case study show that both Bayesian model averaging approaches offer good balance. The fully Bayesian model averaging approach also provides posterior probability intervals of the balance indices.
ERIC Educational Resources Information Center
Cheung, Mike W.-L.; Cheung, Shu Fai
2016-01-01
Meta-analytic structural equation modeling (MASEM) combines the techniques of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Both fixed-effects and random-effects models can be defined in MASEM.…
Bayesian Networks for Social Modeling
Whitney, Paul D.; White, Amanda M.; Walsh, Stephen J.; Dalton, Angela C.; Brothers, Alan J.
2011-03-28
This paper describes a body of work developed over the past five years. The work addresses the use of Bayesian network (BN) models for representing and predicting social/organizational behaviors. The topics covered include model construction, validation, and use. These topics show the bulk of the lifetime of such model, beginning with construction, moving to validation and other aspects of model ‘critiquing’, and finally demonstrating how the modeling approach might be used to inform policy analysis. To conclude, we discuss limitations of using BN for this activity and suggest remedies to address those limitations. The primary benefits of using a well-developed computational, mathematical, and statistical modeling structure, such as BN, are 1) there are significant computational, theoretical and capability bases on which to build 2) ability to empirically critique the model, and potentially evaluate competing models for a social/behavioral phenomena.
Bayesian non parametric modelling of Higgs pair production
NASA Astrophysics Data System (ADS)
Scarpa, Bruno; Dorigo, Tommaso
2017-03-01
Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include the insertion in the simple model of P-splines to relate explanatory variables with the response and the use of Bayesian trees (BART) to describe the atoms in the Dirichlet process.
Modeling Diagnostic Assessments with Bayesian Networks
ERIC Educational Resources Information Center
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
Bayesian Models of Individual Differences
Powell, Georgie; Meredith, Zoe; McMillin, Rebecca; Freeman, Tom C. A.
2016-01-01
According to Bayesian models, perception and cognition depend on the optimal combination of noisy incoming evidence with prior knowledge of the world. Individual differences in perception should therefore be jointly determined by a person’s sensitivity to incoming evidence and his or her prior expectations. It has been proposed that individuals with autism have flatter prior distributions than do nonautistic individuals, which suggests that prior variance is linked to the degree of autistic traits in the general population. We tested this idea by studying how perceived speed changes during pursuit eye movement and at low contrast. We found that individual differences in these two motion phenomena were predicted by differences in thresholds and autistic traits when combined in a quantitative Bayesian model. Our findings therefore support the flatter-prior hypothesis and suggest that individual differences in prior expectations are more systematic than previously thought. In order to be revealed, however, individual differences in sensitivity must also be taken into account. PMID:27770059
Modeling zero-inflated count data using a covariate-dependent random effect model.
Wong, Kin-Yau; Lam, K F
2013-04-15
In various medical related researches, excessive zeros, which make the standard Poisson regression model inadequate, often exist in count data. We proposed a covariate-dependent random effect model to accommodate the excess zeros and the heterogeneity in the population simultaneously. This work is motivated by a data set from a survey on the dental health status of Hong Kong preschool children where the response variable is the number of decayed, missing, or filled teeth. The random effect has a sound biological interpretation as the overall oral health status or other personal qualities of an individual child that is unobserved and unable to be quantified easily. The overall measure of oral health status, responsible for accommodating the excessive zeros and also the heterogeneity among the children, is covariate dependent. This covariate-dependent random effect model allows one to distinguish whether a potential covariate has an effect on the conceived overall oral health condition of the children, that is, the random effect, or has a direct effect on the magnitude of the counts, or both. We proposed a multiple imputation approach for estimation of the parameters. We discussed the choice of the imputation size. We evaluated the performance of the proposed estimation method through simulation studies, and we applied the model and method to the dental data.
Properties of the Bayesian Knowledge Tracing Model
ERIC Educational Resources Information Center
van de Sande, Brett
2013-01-01
Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…
Bayesian inference for OPC modeling
NASA Astrophysics Data System (ADS)
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semiindependently explore the space. The convergence of these walkers to global maxima of the likelihood volume determine the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
Cross-Classified Random Effects Models in Institutional Research
ERIC Educational Resources Information Center
Meyers, Laura E.
2012-01-01
Multilevel modeling offers researchers a rich array of tools that can be used for a variety of purposes, such as analyzing specific institutional issues, looking for macro-level trends, and helping to shape and inform educational policy. One of the more complex multilevel modeling tools available to institutional researchers is cross-classified…
Bayesian Calibration of Microsimulation Models.
Rutter, Carolyn M; Miglioretti, Diana L; Savarino, James E
2009-12-01
Microsimulation models that describe disease processes synthesize information from multiple sources and can be used to estimate the effects of screening and treatment on cancer incidence and mortality at a population level. These models are characterized by simulation of individual event histories for an idealized population of interest. Microsimulation models are complex and invariably include parameters that are not well informed by existing data. Therefore, a key component of model development is the choice of parameter values. Microsimulation model parameter values are selected to reproduce expected or known results though the process of model calibration. Calibration may be done by perturbing model parameters one at a time or by using a search algorithm. As an alternative, we propose a Bayesian method to calibrate microsimulation models that uses Markov chain Monte Carlo. We show that this approach converges to the target distribution and use a simulation study to demonstrate its finite-sample performance. Although computationally intensive, this approach has several advantages over previously proposed methods, including the use of statistical criteria to select parameter values, simultaneous calibration of multiple parameters to multiple data sources, incorporation of information via prior distributions, description of parameter identifiability, and the ability to obtain interval estimates of model parameters. We develop a microsimulation model for colorectal cancer and use our proposed method to calibrate model parameters. The microsimulation model provides a good fit to the calibration data. We find evidence that some parameters are identified primarily through prior distributions. Our results underscore the need to incorporate multiple sources of variability (i.e., due to calibration data, unknown parameters, and estimated parameters and predicted values) when calibrating and applying microsimulation models.
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo
2016-01-01
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
Bayesian model selection and isocurvature perturbations
NASA Astrophysics Data System (ADS)
Beltrán, María; García-Bellido, Juan; Lesgourgues, Julien; Liddle, Andrew R.; Slosar, Anže
2005-03-01
Present cosmological data are well explained assuming purely adiabatic perturbations, but an admixture of isocurvature perturbations is also permitted. We use a Bayesian framework to compare the performance of cosmological models including isocurvature modes with the purely adiabatic case; this framework automatically and consistently penalizes models which use more parameters to fit the data. We compute the Bayesian evidence for fits to a data set comprised of WMAP and other microwave anisotropy data, the galaxy power spectrum from 2dFGRS and SDSS, and Type Ia supernovae luminosity distances. We find that Bayesian model selection favors the purely adiabatic models, but so far only at low significance.
A Mixture Proportional Hazards Model with Random Effects for Response Times in Tests
ERIC Educational Resources Information Center
Ranger, Jochen; Kuhn, Jörg-Tobias
2016-01-01
In this article, a new model for test response times is proposed that combines latent class analysis and the proportional hazards model with random effects in a similar vein as the mixture factor model. The model assumes the existence of different latent classes. In each latent class, the response times are distributed according to a…
SEMIPARAMETRIC TRANSFORMATION MODELS WITH RANDOM EFFECTS FOR CLUSTERED FAILURE TIME DATA
Zeng, Donglin; Lin, D. Y.; Lin, Xihong
2009-01-01
We propose a general class of semiparametric transformation models with random effects to formulate the effects of possibly time-dependent covariates on clustered or correlated failure times. This class encompasses all commonly used transformation models, including proportional hazards and proportional odds models, and it accommodates a variety of random-effects distributions, particularly Gaussian distributions. We show that the nonparametric maximum likelihood estimators of the model parameters are consistent, asymptotically normal and asymptotically efficient. We develop the corresponding likelihood-based inference procedures. Simulation studies demonstrate that the proposed methods perform well in practical situations. An illustration with a well-known diabetic retinopathy study is provided. PMID:19809573
ERIC Educational Resources Information Center
Clarke, Paul; Crawford, Claire; Steele, Fiona; Vignoles, Anna
2015-01-01
The use of fixed (FE) and random effects (RE) in two-level hierarchical linear regression is discussed in the context of education research. We compare the robustness of FE models with the modelling flexibility and potential efficiency of those from RE models. We argue that the two should be seen as complementary approaches. We then compare both…
Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items
ERIC Educational Resources Information Center
Baayen, R. H.; Davidson, D. J.; Bates, D. M.
2008-01-01
This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to…
Petersen, Jørgen Holm
2016-01-15
This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study.
Estimation of the Nonlinear Random Coefficient Model when Some Random Effects Are Separable
ERIC Educational Resources Information Center
du Toit, Stephen H. C.; Cudeck, Robert
2009-01-01
A method is presented for marginal maximum likelihood estimation of the nonlinear random coefficient model when the response function has some linear parameters. This is done by writing the marginal distribution of the repeated measures as a conditional distribution of the response given the nonlinear random effects. The resulting distribution…
The Impact of Five Missing Data Treatments on a Cross-Classified Random Effects Model
ERIC Educational Resources Information Center
Hoelzle, Braden R.
2012-01-01
The present study compared the performance of five missing data treatment methods within a Cross-Classified Random Effects Model environment under various levels and patterns of missing data given a specified sample size. Prior research has shown the varying effect of missing data treatment options within the context of numerous statistical…
Fixed- and random-effects meta-analytic structural equation modeling: examples and analyses in R.
Cheung, Mike W-L
2014-03-01
Meta-analytic structural equation modeling (MASEM) combines the ideas of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Cheung and Chan (Psychological Methods 10:40-64, 2005b, Structural Equation Modeling 16:28-53, 2009) proposed a two-stage structural equation modeling (TSSEM) approach to conducting MASEM that was based on a fixed-effects model by assuming that all studies have the same population correlation or covariance matrices. The main objective of this article is to extend the TSSEM approach to a random-effects model by the inclusion of study-specific random effects. Another objective is to demonstrate the procedures with two examples using the metaSEM package implemented in the R statistical environment. Issues related to and future directions for MASEM are discussed.
Bayesian Data-Model Fit Assessment for Structural Equation Modeling
ERIC Educational Resources Information Center
Levy, Roy
2011-01-01
Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…
Aguero-Valverde, Jonathan
2013-01-01
In recent years, complex statistical modeling approaches have being proposed to handle the unobserved heterogeneity and the excess of zeros frequently found in crash data, including random effects and zero inflated models. This research compares random effects, zero inflated, and zero inflated random effects models using a full Bayes hierarchical approach. The models are compared not just in terms of goodness-of-fit measures but also in terms of precision of posterior crash frequency estimates since the precision of these estimates is vital for ranking of sites for engineering improvement. Fixed-over-time random effects models are also compared to independent-over-time random effects models. For the crash dataset being analyzed, it was found that once the random effects are included in the zero inflated models, the probability of being in the zero state is drastically reduced, and the zero inflated models degenerate to their non zero inflated counterparts. Also by fixing the random effects over time the fit of the models and the precision of the crash frequency estimates are significantly increased. It was found that the rankings of the fixed-over-time random effects models are very consistent among them. In addition, the results show that by fixing the random effects over time, the standard errors of the crash frequency estimates are significantly reduced for the majority of the segments on the top of the ranking.
Estimating anatomical trajectories with Bayesian mixed-effects modeling.
Ziegler, G; Penny, W D; Ridgway, G R; Ourselin, S; Friston, K J
2015-11-01
We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD).
Estimating anatomical trajectories with Bayesian mixed-effects modeling
Ziegler, G.; Penny, W.D.; Ridgway, G.R.; Ourselin, S.; Friston, K.J.
2015-01-01
We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). PMID:26190405
Bayesian modeling of unknown diseases for biosurveillance.
Shen, Yanna; Cooper, Gregory F
2009-11-14
This paper investigates Bayesian modeling of unknown causes of events in the context of disease-outbreak detection. We introduce a Bayesian approach that models and detects both (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A key contribution of this paper is that it introduces a Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has broad applicability in medical informatics, where the space of known causes of outcomes of interest is seldom complete.
Current Challenges in Bayesian Model Choice
NASA Astrophysics Data System (ADS)
Clyde, M. A.; Berger, J. O.; Bullard, F.; Ford, E. B.; Jefferys, W. H.; Luo, R.; Paulo, R.; Loredo, T.
2007-11-01
Model selection (and the related issue of model uncertainty) arises in many astronomical problems, and, in particular, has been one of the focal areas of the Exoplanet working group under the SAMSI (Statistics and Applied Mathematical Sciences Institute) Astrostatistcs Exoplanet program. We provide an overview of the Bayesian approach to model selection and highlight the challenges involved in implementing Bayesian model choice in four stylized problems. We review some of the current methods used by statisticians and astronomers and present recent developments in the area. We discuss the applicability, computational challenges, and performance of suggested methods and conclude with recommendations and open questions.
Gosho, Masahiko; Maruo, Kazushi; Ishii, Ryota; Hirakawa, Akihiro
2016-11-16
The total score, which is calculated as the sum of scores in multiple items or questions, is repeatedly measured in longitudinal clinical studies. A mixed effects model for repeated measures method is often used to analyze these data; however, if one or more individual items are not measured, the method cannot be directly applied to the total score. We develop two simple and interpretable procedures that infer fixed effects for a longitudinal continuous composite variable. These procedures consider that the items that compose the total score are multivariate longitudinal continuous data and, simultaneously, handle subject-level and item-level missing data. One procedure is based on a multivariate marginalized random effects model with a multiple of Kronecker product covariance matrices for serial time dependence and correlation among items. The other procedure is based on a multiple imputation approach with a multivariate normal model. In terms of the type-1 error rate and the bias of treatment effect in total score, the marginalized random effects model and multiple imputation procedures performed better than the standard mixed effects model for repeated measures analysis with listwise deletion and single imputations for handling item-level missing data. In particular, the mixed effects model for repeated measures with listwise deletion resulted in substantial inflation of the type-1 error rate. The marginalized random effects model and multiple imputation methods provide for a more efficient analysis by fully utilizing the partially available data, compared to the mixed effects model for repeated measures method with listwise deletion.
A Bayesian Nonparametric Meta-Analysis Model
ERIC Educational Resources Information Center
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Posterior Predictive Model Checking in Bayesian Networks
ERIC Educational Resources Information Center
Crawford, Aaron
2014-01-01
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
An Integrated Bayesian Model for DIF Analysis
ERIC Educational Resources Information Center
Soares, Tufi M.; Goncalves, Flavio B.; Gamerman, Dani
2009-01-01
In this article, an integrated Bayesian model for differential item functioning (DIF) analysis is proposed. The model is integrated in the sense of modeling the responses along with the DIF analysis. This approach allows DIF detection and explanation in a simultaneous setup. Previous empirical studies and/or subjective beliefs about the item…
Bayesian modeling of flexible cognitive control
Jiang, Jiefeng; Heller, Katherine; Egner, Tobias
2014-01-01
“Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218
Crash risk analysis for Shanghai urban expressways: A Bayesian semi-parametric modeling approach.
Yu, Rongjie; Wang, Xuesong; Yang, Kui; Abdel-Aty, Mohamed
2016-10-01
Urban expressway systems have been developed rapidly in recent years in China; it has become one key part of the city roadway networks as carrying large traffic volume and providing high traveling speed. Along with the increase of traffic volume, traffic safety has become a major issue for Chinese urban expressways due to the frequent crash occurrence and the non-recurrent congestions caused by them. For the purpose of unveiling crash occurrence mechanisms and further developing Active Traffic Management (ATM) control strategies to improve traffic safety, this study developed disaggregate crash risk analysis models with loop detector traffic data and historical crash data. Bayesian random effects logistic regression models were utilized as it can account for the unobserved heterogeneity among crashes. However, previous crash risk analysis studies formulated random effects distributions in a parametric approach, which assigned them to follow normal distributions. Due to the limited information known about random effects distributions, subjective parametric setting may be incorrect. In order to construct more flexible and robust random effects to capture the unobserved heterogeneity, Bayesian semi-parametric inference technique was introduced to crash risk analysis in this study. Models with both inference techniques were developed for total crashes; semi-parametric models were proved to provide substantial better model goodness-of-fit, while the two models shared consistent coefficient estimations. Later on, Bayesian semi-parametric random effects logistic regression models were developed for weekday peak hour crashes, weekday non-peak hour crashes, and weekend non-peak hour crashes to investigate different crash occurrence scenarios. Significant factors that affect crash risk have been revealed and crash mechanisms have been concluded.
An efficient technique for Bayesian modeling of family data using the BUGS software.
Bae, Harold T; Perls, Thomas T; Sebastiani, Paola
2014-01-01
Linear mixed models have become a popular tool to analyze continuous data from family-based designs by using random effects that model the correlation of subjects from the same family. However, mixed models for family data are challenging to implement with the BUGS (Bayesian inference Using Gibbs Sampling) software because of the high-dimensional covariance matrix of the random effects. This paper describes an efficient parameterization that utilizes the singular value decomposition of the covariance matrix of random effects, includes the BUGS code for such implementation, and extends the parameterization to generalized linear mixed models. The implementation is evaluated using simulated data and an example from a large family-based study is presented with a comparison to other existing methods.
Survey of Bayesian Models for Modelling of Stochastic Temporal Processes
Ng, B
2006-10-12
This survey gives an overview of popular generative models used in the modeling of stochastic temporal systems. In particular, this survey is organized into two parts. The first part discusses the discrete-time representations of dynamic Bayesian networks and dynamic relational probabilistic models, while the second part discusses the continuous-time representation of continuous-time Bayesian networks.
Hierarchical Bayesian Models of Subtask Learning
ERIC Educational Resources Information Center
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses, including the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced a more conservative estimate for the effects of traffic variables compared with the fixed effect meta-analysis. Then, the meta-analyses results were used as informative priors for developing crash risk models with limited data. Three different meta-analyses significantly affect model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, the Bayesian predictive densities analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.
Crash Frequency Analysis Using Hurdle Models with Random Effects Considering Short-Term Panel Data.
Chen, Feng; Ma, Xiaoxiang; Chen, Suren; Yang, Lin
2016-10-26
Random effect panel data hurdle models are established to research the daily crash frequency on a mountainous section of highway I-70 in Colorado. Road Weather Information System (RWIS) real-time traffic and weather and road surface conditions are merged into the models incorporating road characteristics. The random effect hurdle negative binomial (REHNB) model is developed to study the daily crash frequency along with three other competing models. The proposed model considers the serial correlation of observations, the unbalanced panel-data structure, and dominating zeroes. Based on several statistical tests, the REHNB model is identified as the most appropriate one among four candidate models for a typical mountainous highway. The results show that: (1) the presence of over-dispersion in the short-term crash frequency data is due to both excess zeros and unobserved heterogeneity in the crash data; and (2) the REHNB model is suitable for this type of data. Moreover, time-varying variables including weather conditions, road surface conditions and traffic conditions are found to play importation roles in crash frequency. Besides the methodological advancements, the proposed technology bears great potential for engineering applications to develop short-term crash frequency models by utilizing detailed data from field monitoring data such as RWIS, which is becoming more accessible around the world.
Crash Frequency Analysis Using Hurdle Models with Random Effects Considering Short-Term Panel Data
Chen, Feng; Ma, Xiaoxiang; Chen, Suren; Yang, Lin
2016-01-01
Random effect panel data hurdle models are established to research the daily crash frequency on a mountainous section of highway I-70 in Colorado. Road Weather Information System (RWIS) real-time traffic and weather and road surface conditions are merged into the models incorporating road characteristics. The random effect hurdle negative binomial (REHNB) model is developed to study the daily crash frequency along with three other competing models. The proposed model considers the serial correlation of observations, the unbalanced panel-data structure, and dominating zeroes. Based on several statistical tests, the REHNB model is identified as the most appropriate one among four candidate models for a typical mountainous highway. The results show that: (1) the presence of over-dispersion in the short-term crash frequency data is due to both excess zeros and unobserved heterogeneity in the crash data; and (2) the REHNB model is suitable for this type of data. Moreover, time-varying variables including weather conditions, road surface conditions and traffic conditions are found to play importation roles in crash frequency. Besides the methodological advancements, the proposed technology bears great potential for engineering applications to develop short-term crash frequency models by utilizing detailed data from field monitoring data such as RWIS, which is becoming more accessible around the world. PMID:27792209
Objective Bayesian model selection for Cox regression.
Held, Leonhard; Gravestock, Isaac; Sabanés Bové, Daniel
2016-12-20
There is now a large literature on objective Bayesian model selection in the linear model based on the g-prior. The methodology has been recently extended to generalized linear models using test-based Bayes factors. In this paper, we show that test-based Bayes factors can also be applied to the Cox proportional hazards model. If the goal is to select a single model, then both the maximum a posteriori and the median probability model can be calculated. For clinical prediction of survival, we shrink the model-specific log hazard ratio estimates with subsequent calculation of the Breslow estimate of the cumulative baseline hazard function. A Bayesian model average can also be employed. We illustrate the proposed methodology with the analysis of survival data on primary biliary cirrhosis patients and the development of a clinical prediction model for future cardiovascular events based on data from the Second Manifestations of ARTerial disease (SMART) cohort study. Cross-validation is applied to compare the predictive performance with alternative model selection approaches based on Harrell's c-Index, the calibration slope and the integrated Brier score. Finally, a novel application of Bayesian variable selection to optimal conditional prediction via landmarking is described. Copyright © 2016 John Wiley & Sons, Ltd.
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well.
Hierarchical Bayesian model updating for structural identification
NASA Astrophysics Data System (ADS)
Behmanesh, Iman; Moaveni, Babak; Lombaert, Geert; Papadimitriou, Costas
2015-12-01
A new probabilistic finite element (FE) model updating technique based on Hierarchical Bayesian modeling is proposed for identification of civil structural systems under changing ambient/environmental conditions. The performance of the proposed technique is investigated for (1) uncertainty quantification of model updating parameters, and (2) probabilistic damage identification of the structural systems. Accurate estimation of the uncertainty in modeling parameters such as mass or stiffness is a challenging task. Several Bayesian model updating frameworks have been proposed in the literature that can successfully provide the "parameter estimation uncertainty" of model parameters with the assumption that there is no underlying inherent variability in the updating parameters. However, this assumption may not be valid for civil structures where structural mass and stiffness have inherent variability due to different sources of uncertainty such as changing ambient temperature, temperature gradient, wind speed, and traffic loads. Hierarchical Bayesian model updating is capable of predicting the overall uncertainty/variability of updating parameters by assuming time-variability of the underlying linear system. A general solution based on Gibbs Sampler is proposed to estimate the joint probability distributions of the updating parameters. The performance of the proposed Hierarchical approach is evaluated numerically for uncertainty quantification and damage identification of a 3-story shear building model. Effects of modeling errors and incomplete modal data are considered in the numerical study.
Normativity, interpretation, and Bayesian models
Oaksford, Mike
2014-01-01
It has been suggested that evaluative normativity should be expunged from the psychology of reasoning. A broadly Davidsonian response to these arguments is presented. It is suggested that two distinctions, between different types of rationality, are more permeable than this argument requires and that the fundamental objection is to selecting theories that make the most rational sense of the data. It is argued that this is inevitable consequence of radical interpretation where understanding others requires assuming they share our own norms of reasoning. This requires evaluative normativity and it is shown that when asked to evaluate others’ arguments participants conform to rational Bayesian norms. It is suggested that logic and probability are not in competition and that the variety of norms is more limited than the arguments against evaluative normativity suppose. Moreover, the universality of belief ascription suggests that many of our norms are universal and hence evaluative. It is concluded that the union of evaluative normativity and descriptive psychology implicit in Davidson and apparent in the psychology of reasoning is a good thing. PMID:24860519
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
events (and subsequently, their likelihood of occurrence) based on historical evidence of the counts of previous event occurrences. The novel Bayesian...Aug-2014 22-May-2015 Approved for Public Release; Distribution Unlimited Final Report: Sparse Event Modeling with Hierarchical Bayesian Kernel Methods...Sparse Event Modeling with Hierarchical Bayesian Kernel Methods Report Title The research objective of this proposal was to develop a predictive Bayesian
Bayesian network modelling of upper gastrointestinal bleeding
NASA Astrophysics Data System (ADS)
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
Karr, Justin E; Areshenkoff, Corson N; Duggan, Emily C; Garcia-Barrera, Mauricio A
2014-12-01
Throughout their careers, many soldiers experience repeated blasts exposures from improvised explosive devices, which often involve head injury. Consequentially, blast-related mild Traumatic Brain Injury (mTBI) has become prevalent in modern conflicts, often occuring co-morbidly with psychiatric illness (e.g., post-traumatic stress disorder [PTSD]). In turn, a growing body of research has begun to explore the cognitive and psychiatric sequelae of blast-related mTBI. The current meta-analysis aimed to evaluate the chronic effects of blast-related mTBI on cognitive performance. A systematic review identified 9 studies reporting 12 samples meeting eligibility criteria. A Bayesian random-effects meta-analysis was conducted with cognitive construct and PTSD symptoms explored as moderators. The overall posterior mean effect size and Highest Density Interval (HDI) came to d = -0.12 [-0.21, -0.04], with executive function (-0.16 [-0.31, 0.00]), verbal delayed memory (-0.19 [-0.44, 0.06]) and processing speed (-0.11 [-0.26, 0.01]) presenting as the most sensitive cognitive domains to blast-related mTBI. When dividing executive function into diverse sub-constructs (i.e., working memory, inhibition, set-shifting), set-shifting presented the largest effect size (-0.33 [-0.55, -0.05]). PTSD symptoms did not predict cognitive effects sizes, β PTSD = -0.02 [-0.23, 0.20]. The results indicate a subtle, but chronic cognitive impairment following mTBI, especially in set-shifting, a relevant aspect of executive attention. These findings are consistent with past meta-analyses on multiple mTBI and correspond with past neuroimaging research on the cognitive correlates of white matter damage common in mTBI. However, all studies had cross-sectional designs, which resulted in universally low quality ratings and limited the conclusions inferable from this meta-analysis.
Saunders, Jessica M
2010-01-01
The group-based trajectory modeling approach is a systematic way of categorizing subjects into different groups based on their developmental trajectories using formal and objective statistical criteria. With the recent advancement in methods and statistical software, modeling possibilities are almost limitless; however, parallel advances in theory development have not kept pace. This paper examines some of the modeling options that are becoming more widespread and how they impact both empirical and theoretical findings. The key issue that is explored is the impact of adding random effects to the latent growth factors and how this alters the meaning of a group. The paper argues that technical specification should be guided by theory, and Moffitt's developmental taxonomy is used as an illustration of how modeling decisions can be matched to theory.
Technology diffusion in hospitals: a log odds random effects regression model.
Blank, Jos L T; Valdmanis, Vivian G
2015-01-01
This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the diffusion of technologies. We distinguish a number of determinants, such as service, physician, and environmental, financial and organizational characteristics of the 60 Dutch hospitals in our sample. On the basis of this data set on Dutch general hospitals over the period 1995-2002, we conclude that there is a relation between a number of determinants and the diffusion of innovations underlining conclusions from earlier research. Positive effects were found on the basis of the size of the hospitals, competition and a hospital's commitment to innovation. It appears that if a policy is developed to further diffuse innovations, the external effects of demand and market competition need to be examined, which would de facto lead to an efficient use of technology. For the individual hospital, instituting an innovations office appears to be the most prudent course of action.
Bayesian model selection analysis of WMAP3
Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.
2006-06-15
We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n{sub S} and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n{sub S}{ne}1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n{sub S}{ne}1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n{sub S}=1, r=0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Bayesian Nonparametric Models for Multiway Data Analysis.
Xu, Zenglin; Yan, Feng; Qi, Yuan
2015-02-01
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches-such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)-amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-art tensor decomposition methods and blockmodels.
Vock, David M.; Davidian, Marie; Tsiatis, Anastasios A.
2014-01-01
Generalized linear and nonlinear mixed models (GMMMs and NLMMs) are commonly used to represent non-Gaussian or nonlinear longitudinal or clustered data. A common assumption is that the random effects are Gaussian. However, this assumption may be unrealistic in some applications, and misspecification of the random effects density may lead to maximum likelihood parameter estimators that are inconsistent, biased, and inefficient. Because testing if the random effects are Gaussian is difficult, previous research has recommended using a flexible random effects density. However, computational limitations have precluded widespread use of flexible random effects densities for GLMMs and NLMMs. We develop a SAS macro, SNP_NLMM, that overcomes the computational challenges to fit GLMMs and NLMMs where the random effects are assumed to follow a smooth density that can be represented by the seminonparametric formulation proposed by Gallant and Nychka (1987). The macro is flexible enough to allow for any density of the response conditional on the random effects and any nonlinear mean trajectory. We demonstrate the SNP_NLMM macro on a GLMM of the disease progression of toenail infection and on a NLMM of intravenous drug concentration over time. PMID:24688453
Cafri, Guy; Hedeker, Donald; Aarons, Gregory A
2015-12-01
In longitudinal studies, time-varying group membership and group effects are important issues that need to be addressed. In this article we describe use of cross-classified and multiple membership random-effects models to address time-varying group membership, and dynamic group random-effects models to address time-varying group effects. We propose new models that integrate features of existing models, evaluate these models through simulation, provide guidance on how to fit these models, and apply the models in 2 real data examples. The discussion focuses on challenges in the application of these models.
Bayesian Model Selection with Network Based Diffusion Analysis
Whalen, Andrew; Hoppitt, William J. E.
2016-01-01
A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe Akaike Information Criteria (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed. PMID:27092089
Bayesian variable selection for latent class models.
Ghosh, Joyee; Herring, Amy H; Siega-Riz, Anna Maria
2011-09-01
In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.
A Bayesian Shrinkage Approach for AMMI Models
de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves
2015-01-01
Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, and resulted in models being selected that were similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and cross validation based on leave-one-out. This characteristic allowed more parsimonious models to be chosen and more GEI pattern retained on the first two components. The resulting model chosen by posterior distribution of singular value was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible interval for AMMI biplot plus the choice of AMMI model based on direct posterior
A Bayesian Shrinkage Approach for AMMI Models.
da Silva, Carlos Pereira; de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves; Balestre, Marcio
2015-01-01
Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, and resulted in models being selected that were similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and cross validation based on leave-one-out. This characteristic allowed more parsimonious models to be chosen and more GEI pattern retained on the first two components. The resulting model chosen by posterior distribution of singular value was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible interval for AMMI biplot plus the choice of AMMI model based on direct posterior
Ma, Zhuanglin; Zhang, Honglu; Chien, Steven I-Jy; Wang, Jin; Dong, Chunjiao
2017-01-01
To investigate the relationship between crash frequency and potential influence factors, the accident data for events occurring on a 50km long expressway in China, including 567 crash records (2006-2008), were collected and analyzed. Both the fixed-length and the homogeneous longitudinal grade methods were applied to divide the study expressway section into segments. A negative binomial (NB) model and a random effect negative binomial (RENB) model were developed to predict crash frequency. The parameters of both models were determined using the maximum likelihood (ML) method, and the mixed stepwise procedure was applied to examine the significance of explanatory variables. Three explanatory variables, including longitudinal grade, road width, and ratio of longitudinal grade and curve radius (RGR), were found as significantly affecting crash frequency. The marginal effects of significant explanatory variables to the crash frequency were analyzed. The model performance was determined by the relative prediction error and the cumulative standardized residual. The results show that the RENB model outperforms the NB model. It was also found that the model performance with the fixed-length segment method is superior to that with the homogeneous longitudinal grade segment method.
Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum
2011-01-01
Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…
Refined-scale panel data crash rate analysis using random-effects tobit model.
Chen, Feng; Ma, XiaoXiang; Chen, Suren
2014-12-01
Random effects tobit models are developed in predicting hourly crash rates with refined-scale panel data structure in both temporal and spatial domains. The proposed models address left-censoring effects of crash rates data while accounting for unobserved heterogeneity across groups and serial correlations within group in the meantime. The utilization of panel data in both refined temporal and spatial scales (hourly record and 1-mile roadway segments on average) exhibits strong potential on capturing the nature of time-varying and spatially varying contributing variables that is usually ignored in traditional aggregated traffic accident modeling. 1-year accident data and detailed traffic, environment, road geometry and surface condition data from a segment of I-25 in Colorado are adopted to demonstrate the proposed methodology. To better understand significantly different characteristics of crashes, two separate models, one for daytime and another for nighttime, have been developed. The results show major difference in contributing factors towards crash rate between daytime and nighttime models, implying considerable needs to investigate daytime and nighttime crashes separately using refined-scale data. After the models are developed, a comprehensive review of various contributing factors is made, followed by discussions on some interesting findings.
Model feedback in Bayesian propensity score estimation.
Zigler, Corwin M; Watts, Krista; Yeh, Robert W; Wang, Yun; Coull, Brent A; Dominici, Francesca
2013-03-01
Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: (1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and (2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is "feedback" between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting versus carotid endarterectomy among 123,286 Medicare beneficiaries hospitlized for stroke in 2006 and 2007.
Experience With Bayesian Image Based Surface Modeling
NASA Technical Reports Server (NTRS)
Stutz, John C.
2005-01-01
Bayesian surface modeling from images requires modeling both the surface and the image generation process, in order to optimize the models by comparing actual and generated images. Thus it differs greatly, both conceptually and in computational difficulty, from conventional stereo surface recovery techniques. But it offers the possibility of using any number of images, taken under quite different conditions, and by different instruments that provide independent and often complementary information, to generate a single surface model that fuses all available information. I describe an implemented system, with a brief introduction to the underlying mathematical models and the compromises made for computational efficiency. I describe successes and failures achieved on actual imagery, where we went wrong and what we did right, and how our approach could be improved. Lastly I discuss how the same approach can be extended to distinct types of instruments, to achieve true sensor fusion.
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn in unsupervised manner the behavior of individuals and of the crowd as a single entity, and explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals are described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance with existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
Hopes and Cautions in Implementing Bayesian Structural Equation Modeling
ERIC Educational Resources Information Center
MacCallum, Robert C.; Edwards, Michael C.; Cai, Li
2012-01-01
Muthen and Asparouhov (2012) have proposed and demonstrated an approach to model specification and estimation in structural equation modeling (SEM) using Bayesian methods. Their contribution builds on previous work in this area by (a) focusing on the translation of conventional SEM models into a Bayesian framework wherein parameters fixed at zero…
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Bayesian Inference for Nonnegative Matrix Factorisation Models
Cemgil, Ali Taylan
2009-01-01
We describe nonnegative matrix factorisation (NMF) with a Kullback-Leibler (KL) error measure in a statistical framework, with a hierarchical generative model consisting of an observation and a prior component. Omitting the prior leads to the standard KL-NMF algorithms as special cases, where maximum likelihood parameter estimation is carried out via the Expectation-Maximisation (EM) algorithm. Starting from this view, we develop full Bayesian inference via variational Bayes or Monte Carlo. Our construction retains conjugacy and enables us to develop more powerful models while retaining attractive features of standard NMF such as monotonic convergence and easy implementation. We illustrate our approach on model order selection and image reconstruction. PMID:19536273
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Merging Digital Surface Models Implementing Bayesian Approaches
NASA Astrophysics Data System (ADS)
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
A Bayesian Analysis of Finite Mixtures in the LISREL Model.
ERIC Educational Resources Information Center
Zhu, Hong-Tu; Lee, Sik-Yum
2001-01-01
Proposes a Bayesian framework for estimating finite mixtures of the LISREL model. The model augments the observed data of the manifest variables with the latent variables and allocation variables and uses the Gibbs sampler to obtain the Bayesian solution. Discusses other associated statistical inferences. (SLD)
Bayesian model of Snellen visual acuity.
Nestares, Oscar; Navarro, Rafael; Antona, Beatriz
2003-07-01
A Bayesian model of Snellen visual acuity (VA) has been developed that, as far as we know, is the first one that includes the three main stages of VA: (1) optical degradations, (2) neural image representation and contrast thresholding, and (3) character recognition. The retinal image of a Snellen test chart is obtained from experimental wave-aberration data. Then a subband image decomposition with a set of visual channels tuned to different spatial frequencies and orientations is applied to the retinal image, as in standard computational models of early cortical image representation. A neural threshold is applied to the contrast responses to include the effect of the neural contrast sensitivity. The resulting image representation is the base of a Bayesian pattern-recognition method robust to the presence of optical aberrations. The model is applied to images containing sets of letter optotypes at different scales, and the number of correct answers is obtained at each scale; the final output is the decimal Snellen VA. The model has no free parameters to adjust. The main input data are the eye's optical aberrations, and standard values are used for all other parameters, including the Stiles-Crawford effect, visual channels, and neural contrast threshold, when no subject specific values are available. When aberrations are large, Snellen VA involving pattern recognition differs from grating acuity, which is based on a simpler detection (or orientation-discrimination) task and hence is basically unaffected by phase distortions introduced by the optical transfer function. A preliminary test of the model in one subject produced close agreement between actual measurements and predicted VA values. Two examples are also included: (1) application of the method to the prediction of the VAin refractive-surgery patients and (2) simulation of the VA attainable by correcting ocular aberrations.
Snyder, James
2009-01-01
In social interaction studies, one commonly encounters repeated displays of behaviors along with their duration data. Statistical methods for the analysis of such data use either parametric (e.g., Weibull) or semi-nonparametric (e.g., Cox) proportional hazard models, modified to include random effects (frailty) which account for the correlation of repeated occurrences of behaviors within a unit (dyad). However, dyad-specific random effects by themselves are not able to account for the ordering of event occurrences within dyads. The occurrence of an event (behavior) can make further occurrences of the same behavior to be more or less likely during an interaction. This paper develops event-dependent random effects models for analyzing repeated behaviors data using a Bayesian approach. The models are illustrated by a dataset relating to emotion regulation in families with children who have behavioral or emotional problems. PMID:20161593
Krøigård, Thomas; Gaist, David; Otto, Marit; Højlund, Dorthe; Selmar, Peter E; Sindrup, Søren H
2014-08-01
The reproducibility of variables commonly included in studies of peripheral nerve conduction in healthy individuals has not previously been analyzed using a random effects regression model. We examined the temporal changes and variability of standard nerve conduction measures in the leg. Peroneal nerve distal motor latency, motor conduction velocity, and compound motor action potential amplitude; sural nerve sensory action potential amplitude and sensory conduction velocity; and tibial nerve minimal F-wave latency were examined in 51 healthy subjects, aged 40 to 67 years. They were reexamined after 2 and 26 weeks. There was no change in the variables except for a minor decrease in sural nerve sensory action potential amplitude and a minor increase in tibial nerve minimal F-wave latency. Reproducibility was best for peroneal nerve distal motor latency and motor conduction velocity, sural nerve sensory conduction velocity, and tibial nerve minimal F-wave latency. Between-subject variability was greater than within-subject variability. Sample sizes ranging from 21 to 128 would be required to show changes twice the magnitude of the spontaneous changes observed in this study. Nerve conduction studies have a high reproducibility, and variables are mainly unaltered during 6 months. This study provides a solid basis for the planning of future clinical trials assessing changes in nerve conduction.
Bayesian model selection for LISA pathfinder
NASA Astrophysics Data System (ADS)
Karnesis, Nikolaos; Nofrarias, Miquel; Sopuerta, Carlos F.; Gibert, Ferran; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Ferraioli, Luigi; Hewitson, Martin; Hueller, Mauro; Korsakova, Natalia; McNamara, Paul W.; Plagnol, Eric; Vitale, Stefano
2014-03-01
The main goal of the LISA Pathfinder (LPF) mission is to fully characterize the acceleration noise models and to test key technologies for future space-based gravitational-wave observatories similar to the eLISA concept. The data analysis team has developed complex three-dimensional models of the LISA Technology Package (LTP) experiment onboard the LPF. These models are used for simulations, but, more importantly, they will be used for parameter estimation purposes during flight operations. One of the tasks of the data analysis team is to identify the physical effects that contribute significantly to the properties of the instrument noise. A way of approaching this problem is to recover the essential parameters of a LTP model fitting the data. Thus, we want to define the simplest model that efficiently explains the observations. To do so, adopting a Bayesian framework, one has to estimate the so-called Bayes factor between two competing models. In our analysis, we use three main different methods to estimate it: the reversible jump Markov chain Monte Carlo method, the Schwarz criterion, and the Laplace approximation. They are applied to simulated LPF experiments in which the most probable LTP model that explains the observations is recovered. The same type of analysis presented in this paper is expected to be followed during flight operations. Moreover, the correlation of the output of the aforementioned methods with the design of the experiment is explored.
A multivariate Bayesian model for embryonic growth.
Willemsen, Sten P; Eilers, Paul H C; Steegers-Theunissen, Régine P M; Lesaffre, Emmanuel
2015-04-15
Most longitudinal growth curve models evaluate the evolution of each of the anthropometric measurements separately. When applied to a 'reference population', this exercise leads to univariate reference curves against which new individuals can be evaluated. However, growth should be evaluated in totality, that is, by evaluating all body characteristics jointly. Recently, Cole et al. suggested the Superimposition by Translation and Rotation (SITAR) model, which expresses individual growth curves by three subject-specific parameters indicating their deviation from a flexible overall growth curve. This model allows the characterization of normal growth in a flexible though compact manner. In this paper, we generalize the SITAR model in a Bayesian way to multiple dimensions. The multivariate SITAR model allows us to create multivariate reference regions, which is advantageous for prediction. The usefulness of the model is illustrated on longitudinal measurements of embryonic growth obtained in the first semester of pregnancy, collected in the ongoing Rotterdam Predict study. Further, we demonstrate how the model can be used to find determinants of embryonic growth.
Shabankhani, B; Kazemnezhad, A; Zaeri, F
2015-01-01
Background and objectives:Out of 10 apparently healthy humans, one was somewhat suffering from one of the types of renal disease. Hemodialysis is known as the most applicable method of taking care of this group of patients. In addition, serum creatinine is an important mark in the performance of kidneys. The aim of the present study was to investigate the effective factors in creatinine and its effect on the performance of kidneys. Materials and methods: The present study is a longitudinal experiment in which 500 participants were randomly selected from the hemodialysis patients in Mazandaran Province. Creatinine variable was considered as the longitudinal responding variable, which was measured 3 times per year over a period of 6 years. The random effects model was also considered the most appropriate model for the collected data. Results:The total mean value of creatinine was 1.62 ± 0.49, among men 1.69 ± 0.46 and among women 35.1 ± 0.49. Variables of weight (p<0.001), age of disease diagnosis (p<0.001), time (p<0.001), gender (p<0.005), and cardiovascular diseases were significant and had effects on the trend of creatinine changes among the hemodialysis patients. Creatinine mean value had an increasing trend. Conclusion:Blood creatinine had a significant effect on the performance of kidneys, and the identification of variables that affected the creatinine level was highly helpful in controlling the performance of the kidneys. The results of most studies conducted on hemodialysis patients indicated that by measuring and controlling variables like weight, tobacco consumption, and control of related diseases like blood pressure could predict and control creatinine changes precisely. PMID:28255403
A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION
We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
Implementing Relevance Feedback in the Bayesian Network Retrieval Model.
ERIC Educational Resources Information Center
de Campos, Luis M.; Fernandez-Luna, Juan M.; Huete, Juan F.
2003-01-01
Discussion of relevance feedback in information retrieval focuses on a proposal for the Bayesian Network Retrieval Model. Bases the proposal on the propagation of partial evidences in the Bayesian network, representing new information obtained from the user's relevance judgments to compute the posterior relevance probabilities of the documents…
Bayesian Student Modeling and the Problem of Parameter Specification.
ERIC Educational Resources Information Center
Millan, Eva; Agosta, John Mark; Perez de la Cruz, Jose Luis
2001-01-01
Discusses intelligent tutoring systems and the application of Bayesian networks to student modeling. Considers reasons for not using Bayesian networks, including the computational complexity of the algorithms and the difficulty of knowledge acquisition, and proposes an approach to simplify knowledge acquisition that applies causal independence to…
ERIC Educational Resources Information Center
Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan
2011-01-01
Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have been proposed various weighting methods (Korn, & Graubard, 2003; Pfeffermann, Skinner,…
ERIC Educational Resources Information Center
Hedeker, Donald; And Others
1996-01-01
Methods are proposed and described for estimating the degree to which relations among variables vary at the individual level. As an example, M. Fishbein and I. Ajzen's theory of reasoned action is examined. This article illustrates the use of empirical Bayes methods based on a random-effects regression model to estimate individual influences…
Bayesian analysis. II. Signal detection and model selection
NASA Astrophysics Data System (ADS)
Bretthorst, G. Larry
In the preceding. paper, Bayesian analysis was applied to the parameter estimation problem, given quadrature NMR data. Here Bayesian analysis is extended to the problem of selecting the model which is most probable in view of the data and all the prior information. In addition to the analytic calculation, two examples are given. The first example demonstrates how to use Bayesian probability theory to detect small signals in noise. The second example uses Bayesian probability theory to compute the probability of the number of decaying exponentials in simulated T1 data. The Bayesian answer to this question is essentially a microcosm of the scientific method and a quantitative statement of Ockham's razor: theorize about possible models, compare these to experiment, and select the simplest model that "best" fits the data.
Advances in Bayesian Modeling in Educational Research
ERIC Educational Resources Information Center
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Bayesian analysis of the backreaction models
Kurek, Aleksandra; Bolejko, Krzysztof; Szydlowski, Marek
2010-03-15
We present a Bayesian analysis of four different types of backreaction models, which are based on the Buchert equations. In this approach, one considers a solution to the Einstein equations for a general matter distribution and then an average of various observable quantities is taken. Such an approach became of considerable interest when it was shown that it could lead to agreement with observations without resorting to dark energy. In this paper we compare the {Lambda}CDM model and the backreaction models with type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, and find that the former is favored. However, the tested models were based on some particular assumptions about the relation between the average spatial curvature and the backreaction, as well as the relation between the curvature and curvature index. In this paper we modified the latter assumption, leaving the former unchanged. We find that, by varying the relation between the curvature and curvature index, we can obtain a better fit. Therefore, some further work is still needed--in particular, the relation between the backreaction and the curvature should be revisited in order to fully determine the feasibility of the backreaction models to mimic dark energy.
Hierarchical Bayesian models of subtask learning.
Anglim, Jeromy; Wynton, Sarah K A
2015-07-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking task, which logged participant actions, enabling measurement of strategy use and subtask performance. Model comparison was performed using deviance information criterion (DIC), posterior predictive checks, plots of model fits, and model recovery simulations. Results showed that although learning tended to be monotonically decreasing and decelerating, and approaching an asymptote for all subtasks, there was substantial inconsistency in learning curves both at the group- and individual-levels. This inconsistency was most apparent when constraining both the rate and the ratio of learning to asymptote to be equal across subtasks, thereby giving learning curves only 1 parameter for scaling. The inclusion of 6 strategy covariates provided improved prediction of subtask performance capturing different subtask learning processes and subtask trade-offs. In addition, strategy use partially explained the inconsistency in subtask learning. Overall, the model provided a more nuanced representation of how complex tasks can be decomposed in terms of simpler learning mechanisms.
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
Stochastic model updating utilizing Bayesian approach and Gaussian process model
NASA Astrophysics Data System (ADS)
Wan, Hua-Ping; Ren, Wei-Xin
2016-03-01
Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from responses variability. SMU for parameter uncertainty quantification refers to the problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Inverse problem solved with optimization usually brings about the issues of gradient computation, ill-conditionedness, and non-uniqueness. Moreover, the uncertainty present in response makes the inverse problem more complicated. In this study, Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of Bayesian approach for IUQ problem is that it solves IUQ problem in a straightforward manner, which enables it to avoid the previous issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), Bayesian approach is still computationally expensive since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee the convergence. Herein we reduce computational cost in two aspects. On the one hand, the fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. On the other hand, the advanced MCMC method using delayed rejection adaptive Metropolis (DRAM) algorithm that incorporates local adaptive strategy with global adaptive strategy is employed for Bayesian inference. In addition, we propose the use of the powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Bayesian Case-deletion Model Complexity and Information Criterion
Zhu, Hongtu; Ibrahim, Joseph G.; Chen, Qingxia
2015-01-01
We establish a connection between Bayesian case influence measures for assessing the influence of individual observations and Bayesian predictive methods for evaluating the predictive performance of a model and comparing different models fitted to the same dataset. Based on such a connection, we formally propose a new set of Bayesian case-deletion model complexity (BCMC) measures for quantifying the effective number of parameters in a given statistical model. Its properties in linear models are explored. Adding some functions of BCMC to a conditional deviance function leads to a Bayesian case-deletion information criterion (BCIC) for comparing models. We systematically investigate some properties of BCIC and its connection with other information criteria, such as the Deviance Information Criterion (DIC). We illustrate the proposed methodology on linear mixed models with simulations and a real data example. PMID:26180578
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
NASA Astrophysics Data System (ADS)
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particularly understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi: 10.1016/j.jhydrol.2014.05.027.
Bayesian analysis of a disability model for lung cancer survival.
Armero, C; Cabras, S; Castellanos, M E; Perra, S; Quirós, A; Oruezábal, M J; Sánchez-Rubio, J
2016-02-01
Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimum informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions.
Nonparametric Bayesian Modeling for Automated Database Schema Matching
Ferragut, Erik M; Laska, Jason A
2015-01-01
The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
Calibrating Bayesian Network Representations of Social-Behavioral Models
Whitney, Paul D.; Walsh, Stephen J.
2010-04-08
While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research – and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empirical comparison with data taken from the Minorities at Risk Organizational Behaviors database.
Bayesian approach to decompression sickness model parameter estimation.
Howle, L E; Weber, P W; Nichols, J M
2017-03-01
We examine both maximum likelihood and Bayesian approaches for estimating probabilistic decompression sickness model parameters. Maximum likelihood estimation treats parameters as fixed values and determines the best estimate through repeated trials, whereas the Bayesian approach treats parameters as random variables and determines the parameter probability distributions. We would ultimately like to know the probability that a parameter lies in a certain range rather than simply make statements about the repeatability of our estimator. Although both represent powerful methods of inference, for models with complex or multi-peaked likelihoods, maximum likelihood parameter estimates can prove more difficult to interpret than the estimates of the parameter distributions provided by the Bayesian approach. For models of decompression sickness, we show that while these two estimation methods are complementary, the credible intervals generated by the Bayesian approach are more naturally suited to quantifying uncertainty in the model parameters.
BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)
We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...
Otis, D.L.; White, Gary C.
2004-01-01
Increased population survival rate after an episode of seasonal exploitation is considered a type of compensatory population response. Lack of an increase is interpreted as evidence that exploitation results in added annual mortality in the population. Despite its importance to management of exploited species, there are limited statistical techniques for comparing relative support for these two alternative models. For exploited bird species, the most common technique is to use a fixed effect, deterministic ultrastructure model incorporated into band recovery models to estimate the relationship between harvest and survival rate. We present a new likelihood-based technique within a framework that assumes that survival and harvest are random effects that covary through time. We conducted a Monte Carlo simulation study under this framework to evaluate the performance of these two techniques. The ultrastructure models performed poorly in all simulated scenarios, due mainly to pathological distributional properties. The random effects estimators and their associated estimators of precision had relatively small negative bias under most scenarios, and profile likelihood intervals achieved nominal coverage. We suggest that the random effects estimation method approach has many advantages compared to the ultrastructure models, and that evaluation of robustness and generalization to more complex population structures are topics for additional research. ?? 2004 Museu de Cie??ncies Naturals.
Sattar, Abdus; Weissfeld, Lisa A; Molenberghs, Geert
2011-11-30
In a longitudinal study with response data collected during a hospital stay, observations may be missing because of the subject's discharge from the hospital prior to completion of the study or the death of the subject, resulting in non-ignorable missing data. In addition to non-ignorable missingness, there is left-censoring in the response measurements because of the inherent limit of detection. For analyzing non-ignorable missing and left-censored longitudinal data, we have proposed to extend the theory of random effects tobit regression model to weighted random effects tobit regression model. The weights are computed on the basis of inverse probability weighted augmented methodology. An extensive simulation study was performed to compare the performance of the proposed model with a number of competitive models. The simulation study shows that the estimates are consistent and that the root mean square errors of the estimates are minimal for the use of augmented inverse probability weights in the random effects tobit model. The proposed method is also applied to the non-ignorable missing and left-censored interleukin-6 biomarker data obtained from the Genetic and Inflammatory Markers of Sepsis study.
Modeling Grade IV Gas Emboli using a Limited Failure Population Model with Random Effects
NASA Technical Reports Server (NTRS)
Thompson, Laura A.; Conkin, Johnny; Chhikara, Raj S.; Powell, Michael R.
2002-01-01
Venous gas emboli (VGE) (gas bubbles in venous blood) are associated with an increased risk of decompression sickness (DCS) in hypobaric environments. A high grade of VGE can be a precursor to serious DCS. In this paper, we model time to Grade IV VGE considering a subset of individuals assumed to be immune from experiencing VGE. Our data contain monitoring test results from subjects undergoing up to 13 denitrogenation test procedures prior to exposure to a hypobaric environment. The onset time of Grade IV VGE is recorded as contained within certain time intervals. We fit a parametric (lognormal) mixture survival model to the interval-and right-censored data to account for the possibility of a subset of "cured" individuals who are immune to the event. Our model contains random subject effects to account for correlations between repeated measurements on a single individual. Model assessments and cross-validation indicate that this limited failure population mixture model is an improvement over a model that does not account for the potential of a fraction of cured individuals. We also evaluated some alternative mixture models. Predictions from the best fitted mixture model indicate that the actual process is reasonably approximated by a limited failure population model.
A Generalizable Hierarchical Bayesian Model for Persistent SAR Change Detection
2012-04-01
6] K. Ranney and M. Soumekh, “Signal subspace change detection in averaged multilook sar imagery,” Geoscience and Remote Sensing, IEEE Transactions on...A Generalizable Hierarchical Bayesian Model for Persistent SAR Change Detection Gregory E. Newstadta, Edmund G. Zelniob, and Alfred O. Hero IIIa...Base, OH, 45433, USA ABSTRACT This paper proposes a hierarchical Bayesian model for multiple-pass, multiple antenna synthetic aperture radar ( SAR
A novel random effect model for GWAS meta-analysis and its application to trans-ethnic meta-analysis
Shi, Jingchunzi; Lee, Seunggeun
2016-01-01
Summary Meta-analysis of trans-ethnic genome-wide association studies (GWAS) has proven to be a practical and profitable approach for identifying loci that contribute to the risk of complex diseases. However, the expected genetic effect heterogeneity cannot easily be accommodated through existing fixed-effects and random-effects methods. In response, we propose a novel random effect model for trans-ethnic meta-analysis with flexible modeling of the expected genetic effect heterogeneity across diverse populations. Specifically, we adopt a modified random effect model from the kernel regression framework, in which genetic effect coefficients are random variables whose correlation structure reflects the genetic distances across ancestry groups. In addition, we use the adaptive variance component test to achieve robust power regardless of the degree of genetic effect heterogeneity. Simulation studies show that our proposed method has well-calibrated type I error rates at very stringent significance levels and can improve power over the traditional meta-analysis methods. We re-analyzed the published type 2 diabetes GWAS meta-analysis (Consortium et al., 2014) and successfully identified one additional SNP that clearly exhibits genetic effect heterogeneity across different ancestry groups. Furthermore, our proposed method provides scalable computing time for genome-wide datasets, in which an analysis of one million SNPs would require less than 3 hours. PMID:26916671
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Bayesian graphical models for genomewide association studies.
Verzilli, Claudio J; Stallard, Nigel; Whittaker, John C
2006-07-01
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which reduces considerably the number of possible graphical structures. A Markov chain-Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
Assessing Fit of Unidimensional Graded Response Models Using Bayesian Methods
ERIC Educational Resources Information Center
Zhu, Xiaowen; Stone, Clement A.
2011-01-01
The posterior predictive model checking method is a flexible Bayesian model-checking tool and has recently been used to assess fit of dichotomous IRT models. This paper extended previous research to polytomous IRT models. A simulation study was conducted to explore the performance of posterior predictive model checking in evaluating different…
Joint spatial Bayesian modeling for studies combining longitudinal and cross-sectional data
Lawson, Andrew B; Carroll, Rachel; Castro, Marcia
2017-01-01
Design for intervention studies may combine longitudinal data collected from sampled locations over several survey rounds and cross-sectional data from other locations in the study area. In this case, modeling the impact of the intervention requires an approach that can accommodate both types of data, accounting for the dependence between individuals followed up over time. Inadequate modeling can mask intervention effects, with serious implications for policy making. In this paper we use data from a large-scale larviciding intervention for malaria control implemented in Dar es Salaam, United Republic of Tanzania, collected over a period of almost 5 years. We apply a longitudinal Bayesian spatial model to the Dar es Salaam data, combining follow-up and cross-sectional data, treating the correlation in longitudinal observations separately, and controlling for potential confounders. An innovative feature of this modeling is the use of Ornstein–Uhlenbeck process to model random time effects. We contrast the results with other Bayesian modeling formulations, including cross-sectional approaches that consider individual-level random effects to account for subjects followed up in two or more surveys. The longitudinal modeling approach indicates that the intervention significantly reduced the prevalence of malaria infection in Dar es Salaam by 20% whereas the joint model did not suggest significance within the results. Our results suggest that the longitudinal model is to be preferred when longitudinal information is available at the individual level. PMID:24713159
Hwang, Kyu-Baek; Zhang, Byoung-Tak
2005-12-01
Bayesian model averaging (BMA) can resolve the overfitting problem by explicitly incorporating the model uncertainty into the analysis procedure. Hence, it can be used to improve the generalization performance of Bayesian network classifiers. Until now, BMA of Bayesian network classifiers has only been performed in some restricted forms, e.g., the model is averaged given a single node-order, because of its heavy computational burden. However, it can be hard to obtain a good node-order when the available training dataset is sparse. To alleviate this problem, we propose BMA of Bayesian network classifiers over several distinct node-orders obtained using the Markov chain Monte Carlo sampling technique. The proposed method was examined using two synthetic problems and four real-life datasets. First, we show that the proposed method is especially effective when the given dataset is very sparse. The classification accuracy of averaging over multiple node-orders was higher in most cases than that achieved using a single node-order in our experiments. We also present experimental results for test datasets with unobserved variables, where the quality of the averaged node-order is more important. Through these experiments, we show that the difference in classification performance between the cases of multiple node-orders and single node-order is related to the level of noise, confirming the relative benefit of averaging over multiple node-orders for incomplete data. We conclude that BMA of Bayesian network classifiers over multiple node-orders has an apparent advantage when the given dataset is sparse and noisy, despite the method's heavy computational cost.
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling.
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Hu, Liangdong; Wang, Limin
2013-01-01
Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Bayesian analysis for nonlinear mixed-effects models under heavy-tailed distributions.
De la Cruz, Rolando
2014-01-01
A common assumption in nonlinear mixed-effects models is the normality of both random effects and within-subject errors. However, such assumptions make inferences vulnerable to the presence of outliers. More flexible distributions are therefore necessary for modeling both sources of variability in this class of models. In the present paper, I consider an extension of the nonlinear mixed-effects models in which random effects and within-subject errors are assumed to be distributed according to a rich class of parametric models that are often used for robust inference. The class of distributions I consider is the scale mixture of multivariate normal distributions that consist of a wide range of symmetric and continuous distributions. This class includes heavy-tailed multivariate distributions, such as the Student's t and slash and contaminated normal. With the scale mixture of multivariate normal distributions, robustification is achieved from the tail behavior of the different distributions. A Bayesian framework is adopted, and MCMC is used to carry out posterior analysis. Model comparison using different criteria was considered. The procedures are illustrated using a real dataset from a pharmacokinetic study. I contrast results from the normal and robust models and show how the implementation can be used to detect outliers.
Bayesian Hierarchical Duration Model for Repeated Events : An Application to Behavioral Observations
Dagne, Getachew A.; Snyder, James
2009-01-01
This paper presents a continuous-time Bayesian model for analyzing durations of behavior displays in social interactions. Duration data of social interactions are often complex because of repeated behaviors (events) at individual or group (e.g., dyad) level, multiple behaviors (multistates), and several choices of exit from a current event (competing risks). A multilevel, multistate model is proposed to adequately characterize the behavioral processes. The model incorporates dyad-specific and transition-specific random effects to account for heterogeneity among dyads and interdependence among competing risks. The proposed method is applied to child-parent observational data derived from the School Transitions Project to assess the relation of emotional expression in child-parent interaction to risk for early and persisting child conduct problems. PMID:20209032
Bayesian Estimation of the Logistic Positive Exponent IRT Model
ERIC Educational Resources Information Center
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S; Zhang, Zhiwu
2016-02-01
False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts: Fixed Effect Model (FEM) and a Random Effect Model (REM) and use them iteratively. FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid model over-fitting problem in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of testing markers and the associated markers are unified at each iteration. We named the new method as Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear to both number of individuals and number of markers. Now, a dataset with half million individuals and half million markers can be analyzed within three days.
Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu
2016-01-01
False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts: Fixed Effect Model (FEM) and a Random Effect Model (REM) and use them iteratively. FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid model over-fitting problem in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of testing markers and the associated markers are unified at each iteration. We named the new method as Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear to both number of individuals and number of markers. Now, a dataset with half million individuals and half million markers can be analyzed within three days. PMID:26828793
Potgieter, Cornelis J.; Wei, Rubin; Kipnis, Victor; Freedman, Laurence S.; Carroll, Raymond J.
2016-01-01
Summary For the classical, homoscedastic measurement error model, moment reconstruction (Freedman et al., 2004, 2008) and moment-adjusted imputation (Thomas et al., 2011) are appealing, computationally simple imputation-like methods for general model fitting. Like classical regression calibration, the idea is to replace the unobserved variable subject to measurement error with a proxy that can be used in a variety of analyses. Moment reconstruction and moment-adjusted imputation differ from regression calibration in that they attempt to match multiple features of the latent variable, and also to match some of the latent variable’s relationships with the response and additional covariates. In this note, we consider a problem where true exposure is generated by a complex, nonlinear random effects modeling process, and develop analogues of moment reconstruction and moment-adjusted imputation for this case. This general model includes classical measurement errors, Berkson measurement errors, mixtures of Berkson and classical errors and problems that are not measurement error problems, but also cases where the data generating process for true exposure is a complex, nonlinear random effects modeling process. The methods are illustrated using the National Institutes of Health-AARP Diet and Health Study where the latent variable is a dietary pattern score called the Healthy Eating Index - 2005. We also show how our general model includes methods used in radiation epidemiology as a special case. Simulations are used to illustrate the methods. PMID:27061196
Bayesian Networks for Modeling Dredging Decisions
2011-10-01
position unless so designated by other authorized documents. DESTROY THIS REPORT WHEN NO LONGER NEEDED. DO NOT RETURN IT TO THE ORIGINATOR. ERDC/EL TR...links within a network often do indicate causality and it is usually best to work from information about... work in this area. ERDC/EL TR-11-14 16 Table 1. Bayesian network applications reviewed in the literature. Author(s) Year Substantive issue
On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)
ERIC Educational Resources Information Center
Wills, Andy J.; Pothos, Emmanuel M.
2012-01-01
Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…
Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.
Verde, Pablo E
2010-12-30
In the last decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and they can be regarded as so far removed from the classical domain of meta-analysis, that they can provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach allows, also, to perform substantial model checking, model diagnostic and model selection. Statistical computations are implemented in the public domain statistical software (WinBUGS and R) and illustrated with real data examples.
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Bayesian failure probability model sensitivity study. Final report
Not Available
1986-05-30
The Office of the Manager, National Communications System (OMNCS) has developed a system-level approach for estimating the effects of High-Altitude Electromagnetic Pulse (HEMP) on the connectivity of telecommunications networks. This approach incorporates a Bayesian statistical model which estimates the HEMP-induced failure probabilities of telecommunications switches and transmission facilities. The purpose of this analysis is to address the sensitivity of the Bayesian model. This is done by systematically varying two model input parameters--the number of observations, and the equipment failure rates. Throughout the study, a non-informative prior distribution is used. The sensitivity of the Bayesian model to the noninformative prior distribution is investigated from a theoretical mathematical perspective.
Back to basics for Bayesian model building in genomic selection.
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2012-07-01
Numerous Bayesian methods of phenotype prediction and genomic breeding value estimation based on multilocus association models have been proposed. Computationally the methods have been based either on Markov chain Monte Carlo or on faster maximum a posteriori estimation. The demand for more accurate and more efficient estimation has led to the rapid emergence of workable methods, unfortunately at the expense of well-defined principles for Bayesian model building. In this article we go back to the basics and build a Bayesian multilocus association model for quantitative and binary traits with carefully defined hierarchical parameterization of Student's t and Laplace priors. In this treatment we consider alternative model structures, using indicator variables and polygenic terms. We make the most of the conjugate analysis, enabled by the hierarchical formulation of the prior densities, by deriving the fully conditional posterior densities of the parameters and using the acquired known distributions in building fast generalized expectation-maximization estimation algorithms.
Bayesian Case Influence Measures for Statistical Models with Missing Data
Zhu, Hongtu; Ibrahim, Joseph G.; Cho, Hyunsoon; Tang, Niansheng
2011-01-01
We examine three Bayesian case influence measures including the φ-divergence, Cook's posterior mode distance and Cook's posterior mean distance for identifying a set of influential observations for a variety of statistical models with missing data including models for longitudinal data and latent variable models in the absence/presence of missing data. Since it can be computationally prohibitive to compute these Bayesian case influence measures in models with missing data, we derive simple first-order approximations to the three Bayesian case influence measures by using the Laplace approximation formula and examine the applications of these approximations to the identification of influential sets. All of the computations for the first-order approximations can be easily done using Markov chain Monte Carlo samples from the posterior distribution based on the full data. Simulated data and an AIDS dataset are analyzed to illustrate the methodology. PMID:23399928
On the Bayesian Nonparametric Generalization of IRT-Type Models
ERIC Educational Resources Information Center
San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel
2011-01-01
We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…
Bayesian non-parametrics and the probabilistic approach to modelling
Ghahramani, Zoubin
2013-01-01
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609
A General Bayesian Model for Testlets: Theory and Applications.
ERIC Educational Resources Information Center
Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard
2002-01-01
Proposes a modified version of commonly employed item response models in a fully Bayesian framework and obtains inferences under the model using Markov chain Monte Carlo techniques. Demonstrates use of the model in a series of simulations and with operational data from the North Carolina Test of Computer Skills and the Test of Spoken English…
Bayesian Network Models for Local Dependence among Observable Outcome Variables
ERIC Educational Resources Information Center
Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli
2009-01-01
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors
Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world’s deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014. PMID:28257437
Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.
Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa
2017-01-01
TB is rated as one of the world's deadliest diseases and South Africa ranks 9th out of the 22 countries with hardest hit of TB. Although many pieces of research have been carried out on this subject, this paper steps further by inculcating past knowledge into the model, using Bayesian approach with informative prior. Bayesian statistics approach is getting popular in data analyses. But, most applications of Bayesian inference technique are limited to situations of non-informative prior, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted for classical and Bayesian approach both with non-informative and informative prior, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, South Africa General Household Survey dataset for the year 2011 to 2013 are used to set up priors for the model 2014.
Bonate, Peter L; Sung, Crystal; Welch, Karen; Richards, Susan
2009-10-01
Patients that are exposed to biotechnology-derived therapeutics often develop antibodies to the therapeutic, the magnitude of which is assessed by measuring antibody titers. A statistical approach for analyzing antibody titer data conditional on seroconversion is presented. The proposed method is to first transform the antibody titer data based on a geometric series using a common ratio of 2 and a scale factor of 50 and then analyze the exponent using a zero-inflated or hurdle model assuming a Poisson or negative binomial distribution with random effects to account for patient heterogeneity. Patient specific covariates can be used to model the probability of developing an antibody response, i.e., seroconversion, as well as the magnitude of the antibody titer itself. The method was illustrated using antibody titer data from 87 male seroconverted Fabry patients receiving Fabrazyme. Titers from five clinical trials were collected over 276 weeks of therapy with anti-Fabrazyme IgG titers ranging from 100 to 409,600 after exclusion of seronegative patients. The best model to explain seroconversion was a zero-inflated Poisson (ZIP) model where cumulative dose (under a constant dose regimen of dosing every 2 weeks) influenced the probability of seroconversion. There was an 80% chance of seroconversion when the cumulative dose reached 210 mg (90% confidence interval: 194-226 mg). No difference in antibody titers was noted between Japanese or Western patients. Once seroconverted, antibody titers did not remain constant but decreased in an exponential manner from an initial magnitude to a new lower steady-state value. The expected titer after the new steady-state titer had been achieved was 870 (90% CI: 630-1109). The half-life to the new steady-state value after seroconversion was 44 weeks (90% CI: 17-70 weeks). Time to seroconversion did not appear to be correlated with titer at the time of seroconversion. The method can be adequately used to model antibody titer data.
Bayesian Estimation in the One-Parameter Latent Trait Model.
1980-03-01
3 MASSACHUSETTS LNIV AMHERST LAB OF PSYCHOMETRIC AND -- ETC F/G 12/1 BAYESIAN ESTIMATION IN THE ONE-PARA1ETER LATENT TRAIT MODEL. (U) MAR 80 H...TEST CHART VVNN lfl’ ,. [’ COD BAYESIAN ESTIMATION IN THE ONE-PARAMETER LATENT TRAIT MODEL 0 wtHAR IHARAN SWA I NATHAN AND JANICE A. GIFFORD Research...block numbef) latent trait theory Bayesain estimation 20. ABSTRACT (Continue on reveso aide If neceaar and identlfy by Nock mambe) ,-When several
Bayesian Analysis of Order-Statistics Models for Ranking Data.
ERIC Educational Resources Information Center
Yu, Philip L. H.
2000-01-01
Studied the order-statistics models, extending the usual normal order-statistics model into one in which the underlying random variables followed a multivariate normal distribution. Used a Bayesian approach and the Gibbs sampling technique. Applied the proposed method to analyze presidential election data from the American Psychological…
Bayesian Finite Mixtures for Nonlinear Modeling of Educational Data.
ERIC Educational Resources Information Center
Tirri, Henry; And Others
A Bayesian approach for finding latent classes in data is discussed. The approach uses finite mixture models to describe the underlying structure in the data and demonstrate that the possibility of using full joint probability models raises interesting new prospects for exploratory data analysis. The concepts and methods discussed are illustrated…
A Bayesian Approach for Analyzing Longitudinal Structural Equation Models
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum
2011-01-01
This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…
Bayesian Semiparametric Structural Equation Models with Latent Variables
ERIC Educational Resources Information Center
Yang, Mingan; Dunson, David B.
2010-01-01
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In…
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Bayesian Estimation of the DINA Model with Gibbs Sampling
ERIC Educational Resources Information Center
Culpepper, Steven Andrew
2015-01-01
A Bayesian model formulation of the deterministic inputs, noisy "and" gate (DINA) model is presented. Gibbs sampling is employed to simulate from the joint posterior distribution of item guessing and slipping parameters, subject attribute parameters, and latent class probabilities. The procedure extends concepts in Béguin and Glas,…
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.
Bayesian log-periodic model for financial crashes
NASA Astrophysics Data System (ADS)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-10-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student's t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical part of the study, we analyze a well-known example of financial bubble - the S&P 500 1987 crash - to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian models provide 95% credible intervals for the estimated crash time.
Molenberghs, Geert; Verbeke, Geert; Efendi, Achmad; Braekers, Roel; Demétrio, Clarice G B
2015-08-01
This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring. Building upon work by Molenberghs, Verbeke, and Demétrio (2007) and Molenberghs et al. (2010), gamma and normal random effects are included in a Weibull model, to account for overdispersion and between-subject effects, respectively. Unlike these authors, censoring is allowed for, and two estimation methods are presented. The partial marginalization approach to full maximum likelihood of Molenberghs et al. (2010) is contrasted with pseudo-likelihood estimation. A limited simulation study is conducted to examine the relative merits of these estimation methods. The modeling framework is employed to analyze data on recurrent asthma attacks in children on the one hand and on survival in cancer patients on the other.
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Measuring Learning Progressions Using Bayesian Modeling in Complex Assessments
ERIC Educational Resources Information Center
Rutstein, Daisy Wise
2012-01-01
This research examines issues regarding model estimation and robustness in the use of Bayesian Inference Networks (BINs) for measuring Learning Progressions (LPs). It provides background information on LPs and how they might be used in practice. Two simulation studies are performed, along with real data examples. The first study examines the case…
Shortlist B: A Bayesian Model of Continuous Speech Recognition
ERIC Educational Resources Information Center
Norris, Dennis; McQueen, James M.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward…
Modeling Unreliable Observations in Bayesian Networks by Credal Networks
NASA Astrophysics Data System (ADS)
Antonucci, Alessandro; Piatti, Alberto
Bayesian networks are probabilistic graphical models widely employed in AI for the implementation of knowledge-based systems. Standard inference algorithms can update the beliefs about a variable of interest in the network after the observation of some other variables. This is usually achieved under the assumption that the observations could reveal the actual states of the variables in a fully reliable way. We propose a procedure for a more general modeling of the observations, which allows for updating beliefs in different situations, including various cases of unreliable, incomplete, uncertain and also missing observations. This is achieved by augmenting the original Bayesian network with a number of auxiliary variables corresponding to the observations. For a flexible modeling of the observational process, the quantification of the relations between these auxiliary variables and those of the original Bayesian network is done by credal sets, i.e., convex sets of probability mass functions. Without any lack of generality, we show how this can be done by simply estimating the bounds of likelihoods of the observations for the different values of the observed variables. Overall, the Bayesian network is transformed into a credal network, for which a standard updating problem has to be solved. Finally, a number of transformations that might simplify the updating of the resulting credal network is provided.
Empirical evaluation of scoring functions for Bayesian network model selection.
Liu, Zhifa; Malone, Brandon; Yuan, Changhe
2012-01-01
In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets. We also
Empirical evaluation of scoring functions for Bayesian network model selection
2012-01-01
In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets. We also
Examples of Mixed-Effects Modeling with Crossed Random Effects and with Binomial Data
ERIC Educational Resources Information Center
Quene, Hugo; van den Bergh, Huub
2008-01-01
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not nested. Traditional ANOVAs are compared against…
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Standardized Mean Differences in Two-Level Cross-Classified Random Effects Models
ERIC Educational Resources Information Center
Lai, Mark H. C.; Kwok, Oi-Man
2014-01-01
Multilevel modeling techniques are becoming more popular in handling data with multilevel structure in educational and behavioral research. Recently, researchers have paid more attention to cross-classified data structure that naturally arises in educational settings. However, unlike traditional single-level research, methodological studies about…
Longitudinal Analysis with Regressions among Random Effects: A Latent Variable Modeling Approach
ERIC Educational Resources Information Center
Raykov, Tenko
2007-01-01
A didactic discussion of a latent variable modeling approach is presented that addresses frequent empirical concerns of social, behavioral, and educational researchers involved in longitudinal studies. The method is suitable when the purpose is to analyze repeated measure data along several interrelated dimensions and to explain some of the…
Bayesian neural network for rainfall-runoff modeling
NASA Astrophysics Data System (ADS)
Khan, Mohammad Sajjad; Coulibaly, Paulin
2006-07-01
In this paper, a Bayesian learning approach is introduced to train a multilayer feed-forward network for daily river flow and reservoir inflow simulation in a cold region river basin in Canada. In Bayesian approach, uncertainty about the relationship between inputs and outputs is initially taken care of by an assumed prior distribution of parameters (weights and biases). This prior distribution is updated to posterior distribution using a likelihood function following Bayes' theorem while data are observed. This posterior distribution is called the objective function of a network in the Bayesian learning approach. The objective function is maximized using a suitable optimization technique. Once the network is trained, the predictive distribution of the network outputs is obtained by integrating over the posterior distribution of weights. In this study, Gaussian prior distribution and a Gaussian noise model are used in defining posterior distribution. The network has been optimized using a scaled conjugate gradient technique. Posterior distribution of weights is approximated to Gaussian during prediction. Prediction performance of the Bayesian neural network (BNN) is compared with the results obtained from a standard artificial neural network (ANN) model and a widely used conceptual rainfall-runoff model, namely, HBV-96. The BNN model outperformed the conceptual model and slightly outperformed the standard ANN model in simulating mean, peak, and low river flows and reservoir inflows. The significant contribution of the Bayesian method over the conventional ANN approach, among others, is the uncertainty estimation of the outputs in the form of confidence intervals which are particularly needed in practical water resources applications. Prediction confidence limits (or intervals) indicate the extent to which one can rely on predictions for decision making. It is shown that the BNN can provide reliable streamflow and reservoir inflow forecasts without a loss in model
Bayesian analysis of structural equation models with dichotomous variables.
Lee, Sik-Yum; Song, Xin-Yuan
2003-10-15
Structural equation modelling has been used extensively in the behavioural and social sciences for studying interrelationships among manifest and latent variables. Recently, its uses have been well recognized in medical research. This paper introduces a Bayesian approach to analysing general structural equation models with dichotomous variables. In the posterior analysis, the observed dichotomous data are augmented with the hypothetical missing values, which involve the latent variables in the model and the unobserved continuous measurements underlying the dichotomous data. An algorithm based on the Gibbs sampler is developed for drawing the parameters values and the hypothetical missing values from the joint posterior distributions. Useful statistics, such as the Bayesian estimates and their standard error estimates, and the highest posterior density intervals, can be obtained from the simulated observations. A posterior predictive p-value is used to test the goodness-of-fit of the posited model. The methodology is applied to a study of hypertensive patient non-adherence to medication.
Spatial Bayesian hierarchical modelling of extreme sea states
NASA Astrophysics Data System (ADS)
Clancy, Colm; O'Sullivan, John; Sweeney, Conor; Dias, Frédéric; Parnell, Andrew C.
2016-11-01
A Bayesian hierarchical framework is used to model extreme sea states, incorporating a latent spatial process to more effectively capture the spatial variation of the extremes. The model is applied to a 34-year hindcast of significant wave height off the west coast of Ireland. The generalised Pareto distribution is fitted to declustered peaks over a threshold given by the 99.8th percentile of the data. Return levels of significant wave height are computed and compared against those from a model based on the commonly-used maximum likelihood inference method. The Bayesian spatial model produces smoother maps of return levels. Furthermore, this approach greatly reduces the uncertainty in the estimates, thus providing information on extremes which is more useful for practical applications.
Lu, Xiaosun; Huang, Yangxin
2014-07-20
It is a common practice to analyze complex longitudinal data using nonlinear mixed-effects (NLME) models with normality assumption. The NLME models with normal distributions provide the most popular framework for modeling continuous longitudinal outcomes, assuming individuals are from a homogeneous population and relying on random-effects to accommodate inter-individual variation. However, the following two issues may standout: (i) normality assumption for model errors may cause lack of robustness and subsequently lead to invalid inference and unreasonable estimates, particularly, if the data exhibit skewness and (ii) a homogeneous population assumption may be unrealistically obscuring important features of between-subject and within-subject variations, which may result in unreliable modeling results. There has been relatively few studies concerning longitudinal data with both heterogeneity and skewness features. In the last two decades, the skew distributions have shown beneficial in dealing with asymmetric data in various applications. In this article, our objective is to address the simultaneous impact of both features arisen from longitudinal data by developing a flexible finite mixture of NLME models with skew distributions under Bayesian framework that allows estimates of both model parameters and class membership probabilities for longitudinal data. Simulation studies are conducted to assess the performance of the proposed models and methods, and a real example from an AIDS clinical trial illustrates the methodology by modeling the viral dynamics to compare potential models with different distribution specifications; the analysis results are reported.
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models
Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba
2009-01-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
Bayesian and maximum likelihood estimation of hierarchical response time models
Farrell, Simon; Ludwig, Casimir
2008-01-01
Hierarchical (or multilevel) statistical models have become increasingly popular in psychology in the last few years. We consider the application of multilevel modeling to the ex-Gaussian, a popular model of response times. Single-level estimation is compared with hierarchical estimation of parameters of the ex-Gaussian distribution. Additionally, for each approach maximum likelihood (ML) estimation is compared with Bayesian estimation. A set of simulations and analyses of parameter recovery show that although all methods perform adequately well, hierarchical methods are better able to recover the parameters of the ex-Gaussian by reducing the variability in recovered parameters. At each level, little overall difference was observed between the ML and Bayesian methods. PMID:19001592
Bayesian model updating using incomplete modal data without mode matching
NASA Astrophysics Data System (ADS)
Sun, Hao; Büyüköztürk, Oral
2016-04-01
This study investigates a new probabilistic strategy for model updating using incomplete modal data. A hierarchical Bayesian inference is employed to model the updating problem. A Markov chain Monte Carlo technique with adaptive random-work steps is used to draw parameter samples for uncertainty quantification. Mode matching between measured and predicted modal quantities is not required through model reduction. We employ an iterated improved reduced system technique for model reduction. The reduced model retains the dynamic features as close as possible to those of the model before reduction. The proposed algorithm is finally validated by an experimental example.
Bayesian Estimation of Categorical Dynamic Factor Models
ERIC Educational Resources Information Center
Zhang, Zhiyong; Nesselroade, John R.
2007-01-01
Dynamic factor models have been used to analyze continuous time series behavioral data. We extend 2 main dynamic factor model variations--the direct autoregressive factor score (DAFS) model and the white noise factor score (WNFS) model--to categorical DAFS and WNFS models in the framework of the underlying variable method and illustrate them with…
Application of a predictive Bayesian model to environmental accounting.
Anex, R P; Englehardt, J D
2001-03-30
Environmental accounting techniques are intended to capture important environmental costs and benefits that are often overlooked in standard accounting practices. Environmental accounting methods themselves often ignore or inadequately represent large but highly uncertain environmental costs and costs conditioned by specific prior events. Use of a predictive Bayesian model is demonstrated for the assessment of such highly uncertain environmental and contingent costs. The predictive Bayesian approach presented generates probability distributions for the quantity of interest (rather than parameters thereof). A spreadsheet implementation of a previously proposed predictive Bayesian model, extended to represent contingent costs, is described and used to evaluate whether a firm should undertake an accelerated phase-out of its PCB containing transformers. Variability and uncertainty (due to lack of information) in transformer accident frequency and severity are assessed simultaneously using a combination of historical accident data, engineering model-based cost estimates, and subjective judgement. Model results are compared using several different risk measures. Use of the model for incorporation of environmental risk management into a company's overall risk management strategy is discussed.
Application of the Bayesian dynamic survival model in medicine.
He, Jianghua; McGee, Daniel L; Niu, Xufeng
2010-02-10
The Bayesian dynamic survival model (BDSM), a time-varying coefficient survival model from the Bayesian prospective, was proposed in early 1990s but has not been widely used or discussed. In this paper, we describe the model structure of the BDSM and introduce two estimation approaches for BDSMs: the Markov Chain Monte Carlo (MCMC) approach and the linear Bayesian (LB) method. The MCMC approach estimates model parameters through sampling and is computationally intensive. With the newly developed geoadditive survival models and software BayesX, the BDSM is available for general applications. The LB approach is easier in terms of computations but it requires the prespecification of some unknown smoothing parameters. In a simulation study, we use the LB approach to show the effects of smoothing parameters on the performance of the BDSM and propose an ad hoc method for identifying appropriate values for those parameters. We also demonstrate the performance of the MCMC approach compared with the LB approach and a penalized partial likelihood method available in software R packages. A gastric cancer trial is utilized to illustrate the application of the BDSM.
Application of hierarchical Bayesian unmixing models in river sediment source apportionment
NASA Astrophysics Data System (ADS)
Blake, Will; Smith, Hugh; Navas, Ana; Bodé, Samuel; Goddard, Rupert; Zou Kuzyk, Zou; Lennard, Amy; Lobb, David; Owens, Phil; Palazon, Leticia; Petticrew, Ellen; Gaspar, Leticia; Stock, Brian; Boeckx, Pacsal; Semmens, Brice
2016-04-01
Fingerprinting and unmixing concepts are used widely across environmental disciplines for forensic evaluation of pollutant sources. In aquatic and marine systems, this includes tracking the source of organic and inorganic pollutants in water and linking problem sediment to soil erosion and land use sources. It is, however, the particular complexity of ecological systems that has driven creation of the most sophisticated mixing models, primarily to (i) evaluate diet composition in complex ecological food webs, (ii) inform population structure and (iii) explore animal movement. In the context of the new hierarchical Bayesian unmixing model, MIXSIAR, developed to characterise intra-population niche variation in ecological systems, we evaluate the linkage between ecological 'prey' and 'consumer' concepts and river basin sediment 'source' and sediment 'mixtures' to exemplify the value of ecological modelling tools to river basin science. Recent studies have outlined advantages presented by Bayesian unmixing approaches in handling complex source and mixture datasets while dealing appropriately with uncertainty in parameter probability distributions. MixSIAR is unique in that it allows individual fixed and random effects associated with mixture hierarchy, i.e. factors that might exert an influence on model outcome for mixture groups, to be explored within the source-receptor framework. This offers new and powerful ways of interpreting river basin apportionment data. In this contribution, key components of the model are evaluated in the context of common experimental designs for sediment fingerprinting studies namely simple, nested and distributed catchment sampling programmes. Illustrative examples using geochemical and compound specific stable isotope datasets are presented and used to discuss best practice with specific attention to (1) the tracer selection process, (2) incorporation of fixed effects relating to sample timeframe and sediment type in the modelling
Bayesian Inference of High-Dimensional Dynamical Ocean Models
NASA Astrophysics Data System (ADS)
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Lu, Tao; Wang, Min; Liu, Guangying; Dong, Guang-Hui; Qian, Feng
2016-01-01
It is well known that there is strong relationship between HIV viral load and CD4 cell counts in AIDS studies. However, the relationship between them changes during the course of treatment and may vary among individuals. During treatments, some individuals may experience terminal events such as death. Because the terminal event may be related to the individual's viral load measurements, the terminal mechanism is non-ignorable. Furthermore, there exists competing risks from multiple types of events, such as AIDS-related death and other death. Most joint models for the analysis of longitudinal-survival data developed in literatures have focused on constant coefficients and assume symmetric distribution for the endpoints, which does not meet the needs for investigating the nature of varying relationship between HIV viral load and CD4 cell counts in practice. We develop a mixed-effects varying-coefficient model with skewed distribution coupled with cause-specific varying-coefficient hazard model with random-effects to deal with varying relationship between the two endpoints for longitudinal-competing risks survival data. A fully Bayesian inference procedure is established to estimate parameters in the joint model. The proposed method is applied to a multicenter AIDS cohort study. Various scenarios-based potential models that account for partial data features are compared. Some interesting findings are presented.
Bayesian restoration of ion channel records using hidden Markov models.
Rosales, R; Stark, J A; Fitzgerald, W J; Hladky, S B
2001-03-01
Hidden Markov models have been used to restore recorded signals of single ion channels buried in background noise. Parameter estimation and signal restoration are usually carried out through likelihood maximization by using variants of the Baum-Welch forward-backward procedures. This paper presents an alternative approach for dealing with this inferential task. The inferences are made by using a combination of the framework provided by Bayesian statistics and numerical methods based on Markov chain Monte Carlo stochastic simulation. The reliability of this approach is tested by using synthetic signals of known characteristics. The expectations of the model parameters estimated here are close to those calculated using the Baum-Welch algorithm, but the present methods also yield estimates of their errors. Comparisons of the results of the Bayesian Markov Chain Monte Carlo approach with those obtained by filtering and thresholding demonstrate clearly the superiority of the new methods.
ERIC Educational Resources Information Center
Jackson, Dan
2013-01-01
Statistical inference is problematic in the common situation in meta-analysis where the random effects model is fitted to just a handful of studies. In particular, the asymptotic theory of maximum likelihood provides a poor approximation, and Bayesian methods are sensitive to the prior specification. Hence, less efficient, but easily computed and…
Theory-Based Bayesian Models of Inductive Inference
2010-06-30
Oxford University Press . 28. Griffiths, T. L. and Tenenbaum, J.B. (2007). Two proposals for causal grammar. In A. Gopnik and L. Schulz (eds.). ( ausal Learning. Oxford University Press . 29. Tenenbaum. J. B.. Kemp, C, Shafto. P. (2007). Theory-based Bayesian models for inductive reasoning. In A. Feeney and E. Heit (eds.). Induction. Cambridge University Press. 30. Goodman, N. D., Tenenbaum, J. B., Griffiths. T. L.. & Feldman, J. (2008). Compositionality in rational analysis: Grammar-based induction for concept
Bayesian hierarchical model for large-scale covariance matrix estimation.
Zhu, Dongxiao; Hero, Alfred O
2007-12-01
Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
Slice sampling technique in Bayesian extreme of gold price modelling
NASA Astrophysics Data System (ADS)
Rostami, Mohammad; Adam, Mohd Bakri; Ibrahim, Noor Akma; Yahya, Mohamed Hisham
2013-09-01
In this paper, a simulation study of Bayesian extreme values by using Markov Chain Monte Carlo via slice sampling algorithm is implemented. We compared the accuracy of slice sampling with other methods for a Gumbel model. This study revealed that slice sampling algorithm offers more accurate and closer estimates with less RMSE than other methods . Finally we successfully employed this procedure to estimate the parameters of Malaysia extreme gold price from 2000 to 2011.
How to Address Measurement Noise in Bayesian Model Averaging
NASA Astrophysics Data System (ADS)
Schöniger, A.; Wöhling, T.; Nowak, W.
2014-12-01
When confronted with the challenge of selecting one out of several competing conceptual models for a specific modeling task, Bayesian model averaging is a rigorous choice. It ranks the plausibility of models based on Bayes' theorem, which yields an optimal trade-off between performance and complexity. With the resulting posterior model probabilities, their individual predictions are combined into a robust weighted average and the overall predictive uncertainty (including conceptual uncertainty) can be quantified. This rigorous framework does, however, not yet explicitly consider statistical significance of measurement noise in the calibration data set. This is a major drawback, because model weights might be instable due to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new extension to the Bayesian model averaging framework that explicitly accounts for measurement noise as a source of uncertainty for the weights. This enables modelers to assess the reliability of model ranking for a specific application and a given calibration data set. Also, the impact of measurement noise on the overall prediction uncertainty can be determined. Technically, our extension is built within a Monte Carlo framework. We repeatedly perturb the observed data with random realizations of measurement error. Then, we determine the robustness of the resulting model weights against measurement noise. We quantify the variability of posterior model weights as weighting variance. We add this new variance term to the overall prediction uncertainty analysis within the Bayesian model averaging framework to make uncertainty quantification more realistic and "complete". We illustrate the importance of our suggested extension with an application to soil-plant model selection, based on studies by Wöhling et al. (2013, 2014). Results confirm that noise in leaf area index or evaporation rate observations produces a significant amount of weighting
Bayesian Isotonic Regression Dose-response (BIRD) Model.
Li, Wen; Fu, Haoda
2016-12-21
Understanding dose-response relationship is a crucial step in drug development. There are a few parametric methods to estimate dose-response curves, such as the Emax model and the logistic model. These parametric models are easy to interpret and, hence, widely used. However, these models often require the inclusion of patients on high-dose levels; otherwise, the model parameters cannot be reliably estimated. To have robust estimation, nonparametric models are used. However, these models are not able to estimate certain important clinical parameters, such as ED50 and Emax. Furthermore, in many therapeutic areas, dose-response curves can be assumed as non-decreasing functions. This creates an additional challenge for nonparametric methods. In this paper, we propose a new Bayesian isotonic regression dose-response model which features advantages from both parametric and nonparametric models. The ED50 and Emax can be derived from this model. Simulations are provided to evaluate the Bayesian isotonic regression dose-response model performance against two parametric models. We apply this model to a data set from a diabetes dose-finding study.
Bayesian prediction of placebo analgesia in an instrumental learning model
Jung, Won-Mo; Lee, Ye-Seul; Wallraven, Christian; Chae, Younbyoung
2017-01-01
Placebo analgesia can be primarily explained by the Pavlovian conditioning paradigm in which a passively applied cue becomes associated with less pain. In contrast, instrumental conditioning employs an active paradigm that might be more similar to clinical settings. In the present study, an instrumental conditioning paradigm involving a modified trust game in a simulated clinical situation was used to induce placebo analgesia. Additionally, Bayesian modeling was applied to predict the placebo responses of individuals based on their choices. Twenty-four participants engaged in a medical trust game in which decisions to receive treatment from either a doctor (more effective with high cost) or a pharmacy (less effective with low cost) were made after receiving a reference pain stimulus. In the conditioning session, the participants received lower levels of pain following both choices, while high pain stimuli were administered in the test session even after making the decision. The choice-dependent pain in the conditioning session was modulated in terms of both intensity and uncertainty. Participants reported significantly less pain when they chose the doctor or the pharmacy for treatment compared to the control trials. The predicted pain ratings based on Bayesian modeling showed significant correlations with the actual reports from participants for both of the choice categories. The instrumental conditioning paradigm allowed for the active choice of optional cues and was able to induce the placebo analgesia effect. Additionally, Bayesian modeling successfully predicted pain ratings in a simulated clinical situation that fits well with placebo analgesia induced by instrumental conditioning. PMID:28225816
DISSECTING MAGNETAR VARIABILITY WITH BAYESIAN HIERARCHICAL MODELS
Huppenkothen, Daniela; Elenbaas, Chris; Watts, Anna L.; Horst, Alexander J. van der; Brewer, Brendon J.; Hogg, David W.; Murray, Iain; Frean, Marcus; Levin, Yuri; Kouveliotou, Chryssa
2015-09-01
Neutron stars are a prime laboratory for testing physical processes under conditions of strong gravity, high density, and extreme magnetic fields. Among the zoo of neutron star phenomena, magnetars stand out for their bursting behavior, ranging from extremely bright, rare giant flares to numerous, less energetic recurrent bursts. The exact trigger and emission mechanisms for these bursts are not known; favored models involve either a crust fracture and subsequent energy release into the magnetosphere, or explosive reconnection of magnetic field lines. In the absence of a predictive model, understanding the physical processes responsible for magnetar burst variability is difficult. Here, we develop an empirical model that decomposes magnetar bursts into a superposition of small spike-like features with a simple functional form, where the number of model components is itself part of the inference problem. The cascades of spikes that we model might be formed by avalanches of reconnection, or crust rupture aftershocks. Using Markov Chain Monte Carlo sampling augmented with reversible jumps between models with different numbers of parameters, we characterize the posterior distributions of the model parameters and the number of components per burst. We relate these model parameters to physical quantities in the system, and show for the first time that the variability within a burst does not conform to predictions from ideas of self-organized criticality. We also examine how well the properties of the spikes fit the predictions of simplified cascade models for the different trigger mechanisms.
A Semiparametric Bayesian Model for Repeatedly Repeated Binary Outcomes
Quintana, Fernando A.; Müller, Peter; Rosner, Gary L.; Relling, Mary V.
2009-01-01
Summary We discuss the analysis of data from single nucleotide polymorphism (SNP) arrays comparing tumor and normal tissues. The data consist of sequences of indicators for loss of heterozygosity (LOH) and involve three nested levels of repetition: chromosomes for a given patient, regions within chromosomes, and SNPs nested within regions. We propose to analyze these data using a semiparametric model for multi-level repeated binary data. At the top level of the hierarchy we assume a sampling model for the observed binary LOH sequences that arises from a partial exchangeability argument. This implies a mixture of Markov chains model. The mixture is defined with respect to the Markov transition probabilities. We assume a nonparametric prior for the random mixing measure. The resulting model takes the form of a semiparametric random effects model with the matrix of transition probabilities being the random effects. The model includes appropriate dependence assumptions for the two remaining levels of the hierarchy, i.e., for regions within chromosomes and for chromosomes within patient. We use the model to identify regions of increased LOH in a dataset coming from a study of treatment-related leukemia in children with an initial cancer diagnostic. The model successfully identifies the desired regions and performs well compared to other available alternatives. PMID:19746193
Harrison, Xavier A
2015-01-01
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data
3-D model-based Bayesian classification
Soenneland, L.; Tenneboe, P.; Gehrmann, T.; Yrke, O.
1994-12-31
The challenging task of the interpreter is to integrate different pieces of information and combine them into an earth model. The sophistication level of this earth model might vary from the simplest geometrical description to the most complex set of reservoir parameters related to the geometrical description. Obviously the sophistication level also depend on the completeness of the available information. The authors describe the interpreter`s task as a mapping between the observation space and the model space. The information available to the interpreter exists in observation space and the task is to infer a model in model-space. It is well-known that this inversion problem is non-unique. Therefore any attempt to find a solution depend son constraints being added in some manner. The solution will obviously depend on which constraints are introduced and it would be desirable to allow the interpreter to modify the constraints in a problem-dependent manner. They will present a probabilistic framework that gives the interpreter the tools to integrate the different types of information and produce constrained solutions. The constraints can be adapted to the problem at hand.
Bayesian Thurstonian models for ranking data using JAGS.
Johnson, Timothy R; Kuhn, Kristine M
2013-09-01
A Thurstonian model for ranking data assumes that observed rankings are consistent with those of a set of underlying continuous variables. This model is appealing since it renders ranking data amenable to familiar models for continuous response variables-namely, linear regression models. To date, however, the use of Thurstonian models for ranking data has been very rare in practice. One reason for this may be that inferences based on these models require specialized technical methods. These methods have been developed to address computational challenges involved in these models but are not easy to implement without considerable technical expertise and are not widely available in software packages. To address this limitation, we show that Bayesian Thurstonian models for ranking data can be very easily implemented with the JAGS software package. We provide JAGS model files for Thurstonian ranking models for general use, discuss their implementation, and illustrate their use in analyses.
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, C.; Plant, N.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70-90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale. ?? 2010.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Jara, Alejandro; Hanson, Timothy E.; Quintana, Fernando A.; Müller, Peter; Rosner, Gary L.
2011-01-01
Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage. Currently DPpackage includes models for marginal and conditional density estimation, ROC curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior. To maximize computational efficiency, the actual sampling for each model is carried out using compiled FORTRAN. PMID:21796263
Bayesian calibration of hyperelastic constitutive models of soft tissue.
Madireddy, Sandeep; Sista, Bhargava; Vemaganti, Kumar
2016-06-01
There is inherent variability in the experimental response used to characterize the hyperelastic mechanical response of soft tissues. This has to be accounted for while estimating the parameters in the constitutive models to obtain reliable estimates of the quantities of interest. The traditional least squares method of parameter estimation does not give due importance to this variability. We use a Bayesian calibration framework based on nested Monte Carlo sampling to account for the variability in the experimental data and its effect on the estimated parameters through a systematic probability-based treatment. We consider three different constitutive models to represent the hyperelastic nature of soft tissue: Mooney-Rivlin model, exponential model, and Ogden model. Three stress-strain data sets corresponding to the deformation of agarose gel, bovine liver tissue, and porcine brain tissue are considered. Bayesian fits and parameter estimates are compared with the corresponding least squares values. Finally, we propagate the uncertainty in the parameters to a quantity of interest (QoI), namely the force-indentation response, to study the effect of model form on the values of the QoI. Our results show that the quality of the fit alone is insufficient to determine the adequacy of the model, and due importance has to be given to the maximum likelihood value, the landscape of the likelihood distribution, and model complexity.
Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.
Hack, C Eric
2006-04-17
Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach.
Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.
Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J
2016-03-01
A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty.
Baele, Guy; Lemey, Philippe; Suchard, Marc A
2016-03-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of "working distributions" to facilitate--or shorten--the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a "working" distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different "working" distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses.
Study of TEC fluctuation via stochastic models and Bayesian inversion
NASA Astrophysics Data System (ADS)
Bires, A.; Roininen, L.; Damtie, B.; Nigussie, M.; Vanhamäki, H.
2016-11-01
We propose stochastic processes to be used to model the total electron content (TEC) observation. Based on this, we model the rate of change of TEC (ROT) variation during ionospheric quiet conditions with stationary processes. During ionospheric disturbed conditions, for example, when irregularity in ionospheric electron density distribution occurs, stationarity assumption over long time periods is no longer valid. In these cases, we make the parameter estimation for short time scales, during which we can assume stationarity. We show the relationship between the new method and commonly used TEC characterization parameters ROT and the ROT Index (ROTI). We construct our parametric model within the framework of Bayesian statistical inverse problems and hence give the solution as an a posteriori probability distribution. Bayesian framework allows us to model measurement errors systematically. Similarly, we mitigate variation of TEC due to factors which are not of ionospheric origin, like due to the motion of satellites relative to the receiver, by incorporating a priori knowledge in the Bayesian model. In practical computations, we draw the so-called maximum a posteriori estimates, which are our ROT and ROTI estimates, from the posterior distribution. Because the algorithm allows to estimate ROTI at each observation time, the estimator does not depend on the period of time for ROTI computation. We verify the method by analyzing TEC data recorded by GPS receiver located in Ethiopia (11.6°N, 37.4°E). The results indicate that the TEC fluctuations caused by the ionospheric irregularity can be effectively detected and quantified from the estimated ROT and ROTI values.
A study of finite mixture model: Bayesian approach on financial time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.
A Bayesian subgroup analysis using collections of ANOVA models.
Liu, Jinzhong; Sivaganesan, Siva; Laud, Purushottam W; Müller, Peter
2017-03-20
We develop a Bayesian approach to subgroup analysis using ANOVA models with multiple covariates, extending an earlier work. We assume a two-arm clinical trial with normally distributed response variable. We also assume that the covariates for subgroup finding are categorical and are a priori specified, and parsimonious easy-to-interpret subgroups are preferable. We represent the subgroups of interest by a collection of models and use a model selection approach to finding subgroups with heterogeneous effects. We develop suitable priors for the model space and use an objective Bayesian approach that yields multiplicity adjusted posterior probabilities for the models. We use a structured algorithm based on the posterior probabilities of the models to determine which subgroup effects to report. Frequentist operating characteristics of the approach are evaluated using simulation. While our approach is applicable in more general cases, we mainly focus on the 2 × 2 case of two covariates each at two levels for ease of presentation. The approach is illustrated using a real data example.
Bayesian joint modeling of longitudinal and spatial survival AIDS data.
Martins, Rui; Silva, Giovani L; Andreozzi, Valeska
2016-08-30
Joint analysis of longitudinal and survival data has received increasing attention in the recent years, especially for analyzing cancer and AIDS data. As both repeated measurements (longitudinal) and time-to-event (survival) outcomes are observed in an individual, a joint modeling is more appropriate because it takes into account the dependence between the two types of responses, which are often analyzed separately. We propose a Bayesian hierarchical model for jointly modeling longitudinal and survival data considering functional time and spatial frailty effects, respectively. That is, the proposed model deals with non-linear longitudinal effects and spatial survival effects accounting for the unobserved heterogeneity among individuals living in the same region. This joint approach is applied to a cohort study of patients with HIV/AIDS in Brazil during the years 2002-2006. Our Bayesian joint model presents considerable improvements in the estimation of survival times of the Brazilian HIV/AIDS patients when compared with those obtained through a separate survival model and shows that the spatial risk of death is the same across the different Brazilian states. Copyright © 2016 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Raudenbush, Stephen W.
2009-01-01
Fixed effects models are often useful in longitudinal studies when the goal is to assess the impact of teacher or school characteristics on student learning. In this article, I introduce an alternative procedure: adaptive centering with random effects. I show that this procedure can replicate the fixed effects analysis while offering several…
Two Bayesian tests of the GLOMOsys Model.
Field, Sarahanne M; Wagenmakers, Eric-Jan; Newell, Ben R; Zeelenberg, René; van Ravenzwaaij, Don
2016-12-01
Priming is arguably one of the key phenomena in contemporary social psychology. Recent retractions and failed replication attempts have led to a division in the field between proponents and skeptics and have reinforced the importance of confirming certain priming effects through replication. In this study, we describe the results of 2 preregistered replication attempts of 1 experiment by Förster and Denzler (2012). In both experiments, participants first processed letters either globally or locally, then were tested using a typicality rating task. Bayes factor hypothesis tests were conducted for both experiments: Experiment 1 (N = 100) yielded an indecisive Bayes factor of 1.38, indicating that the in-lab data are 1.38 times more likely to have occurred under the null hypothesis than under the alternative. Experiment 2 (N = 908) yielded a Bayes factor of 10.84, indicating strong support for the null hypothesis that global priming does not affect participants' mean typicality ratings. The failure to replicate this priming effect challenges existing support for the GLOMO(sys) model. (PsycINFO Database Record
Objective Bayesian Comparison of Constrained Analysis of Variance Models.
Consonni, Guido; Paroli, Roberta
2016-10-04
In the social sciences we are often interested in comparing models specified by parametric equality or inequality constraints. For instance, when examining three group means [Formula: see text] through an analysis of variance (ANOVA), a model may specify that [Formula: see text], while another one may state that [Formula: see text], and finally a third model may instead suggest that all means are unrestricted. This is a challenging problem, because it involves a combination of nonnested models, as well as nested models having the same dimension. We adopt an objective Bayesian approach, requiring no prior specification from the user, and derive the posterior probability of each model under consideration. Our method is based on the intrinsic prior methodology, suitably modified to accommodate equality and inequality constraints. Focussing on normal ANOVA models, a comparative assessment is carried out through simulation studies. We also present an application to real data collected in a psychological experiment.
Bayesian model comparison in cosmology with Population Monte Carlo
NASA Astrophysics Data System (ADS)
Kilbinger, Martin; Wraith, Darren; Robert, Christian P.; Benabed, Karim; Cappé, Olivier; Cardoso, Jean-François; Fort, Gersende; Prunet, Simon; Bouchet, François R.
2010-07-01
We use Bayesian model selection techniques to test extensions of the standard flat Λ cold dark matter (ΛCDM) paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. In contrast to the case of other sampling-based inference techniques such as Markov chain Monte Carlo (MCMC), the Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Also, it provides an unbiased estimator of the evidence after any fixed number of iterations and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent cosmic microwave background, type Ia supernovae and baryonic acoustic oscillation data, we find inconclusive evidence between flat ΛCDM and simple dark-energy models. A curved universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r = 0.
Predictive RANS simulations via Bayesian Model-Scenario Averaging
NASA Astrophysics Data System (ADS)
Edeling, W. N.; Cinnella, P.; Dwight, R. P.
2014-10-01
The turbulence closure model is the dominant source of error in most Reynolds-Averaged Navier-Stokes simulations, yet no reliable estimators for this error component currently exist. Here we develop a stochastic, a posteriori error estimate, calibrated to specific classes of flow. It is based on variability in model closure coefficients across multiple flow scenarios, for multiple closure models. The variability is estimated using Bayesian calibration against experimental data for each scenario, and Bayesian Model-Scenario Averaging (BMSA) is used to collate the resulting posteriors, to obtain a stochastic estimate of a Quantity of Interest (QoI) in an unmeasured (prediction) scenario. The scenario probabilities in BMSA are chosen using a sensor which automatically weights those scenarios in the calibration set which are similar to the prediction scenario. The methodology is applied to the class of turbulent boundary-layers subject to various pressure gradients. For all considered prediction scenarios the standard-deviation of the stochastic estimate is consistent with the measurement ground truth. Furthermore, the mean of the estimate is more consistently accurate than the individual model predictions.
Quantum-Like Bayesian Networks for Modeling Decision Making.
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.
Predictive RANS simulations via Bayesian Model-Scenario Averaging
Edeling, W.N.; Cinnella, P.; Dwight, R.P.
2014-10-15
The turbulence closure model is the dominant source of error in most Reynolds-Averaged Navier–Stokes simulations, yet no reliable estimators for this error component currently exist. Here we develop a stochastic, a posteriori error estimate, calibrated to specific classes of flow. It is based on variability in model closure coefficients across multiple flow scenarios, for multiple closure models. The variability is estimated using Bayesian calibration against experimental data for each scenario, and Bayesian Model-Scenario Averaging (BMSA) is used to collate the resulting posteriors, to obtain a stochastic estimate of a Quantity of Interest (QoI) in an unmeasured (prediction) scenario. The scenario probabilities in BMSA are chosen using a sensor which automatically weights those scenarios in the calibration set which are similar to the prediction scenario. The methodology is applied to the class of turbulent boundary-layers subject to various pressure gradients. For all considered prediction scenarios the standard-deviation of the stochastic estimate is consistent with the measurement ground truth. Furthermore, the mean of the estimate is more consistently accurate than the individual model predictions.
Bayesian Modeling of Biomolecular Assemblies with Cryo-EM Maps
Habeck, Michael
2017-01-01
A growing array of experimental techniques allows us to characterize the three-dimensional structure of large biological assemblies at increasingly higher resolution. In addition to X-ray crystallography and nuclear magnetic resonance in solution, new structure determination methods such cryo-electron microscopy (cryo-EM), crosslinking/mass spectrometry and solid-state NMR have emerged. Often it is not sufficient to use a single experimental method, but complementary data need to be collected by using multiple techniques. The integration of all datasets can only be achieved by computational means. This article describes Inferential structure determination, a Bayesian approach to integrative modeling of biomolecular complexes with hybrid structural data. I will introduce probabilistic models for cryo-EM maps and outline Markov chain Monte Carlo algorithms for sampling model structures from the posterior distribution. I will focus on rigid and flexible modeling with cryo-EM data and discuss some of the computational challenges of Bayesian inference in the context of biomolecular modeling. PMID:28382301
Quantum-Like Bayesian Networks for Modeling Decision Making
Moreira, Catarina; Wichert, Andreas
2016-01-01
In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios. PMID:26858669
An intuitive Bayesian spatial model for disease mapping that accounts for scaling.
Riebler, Andrea; Sørbye, Sigrunn H; Simpson, Daniel; Rue, Håvard
2016-08-01
In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed but all come with inherent issues. In the classical BYM (Besag, York and Mollié) model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue on how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and make these transferable between spatial applications with different graph structures. The hyperparameters themselves are used to define flexible extensions of simple base models. Consequently, penalised complexity priors for these parameters can be derived based on the information-theoretic distance from the flexible model to the base model, giving priors with clear interpretation. We provide implementation details for the new model formulation which preserve sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, both showing good learning abilities and good shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least equally well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning.
Bayesian Variable Selection on Model Spaces Constrained by Heredity Conditions.
Taylor-Rodriguez, Daniel; Womack, Andrew; Bliznyuk, Nikolay
2016-01-01
This paper investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending upon the inclusion of lower-order parent terms. We develop classes of priors on the model space, investigate their theoretical and finite sample properties, and provide a Metropolis-Hastings algorithm for searching the space of models. The tools proposed allow fast and thorough exploration of model spaces that account for hierarchical polynomial structure in the predictors and provide control of the inclusion of false positives in high posterior probability models.
Assessing uncertainty in a stand growth model by Bayesian synthesis
Green, E.J.; MacFarlane, D.W.; Valentine, H.T.; Strawderman, W.E.
1999-11-01
The Bayesian synthesis method (BSYN) was used to bound the uncertainty in projections calculated with PIPESTEM, a mechanistic model of forest growth. The application furnished posterior distributions of (a) the values of the model's parameters, and (b) the values of three of the model's output variables--basal area per unit land area, average tree height, and tree density--at different points in time. Confidence or credible intervals for the output variables were obtained directly from the posterior distributions. The application also provides estimates of correlation among the parameters and output variables. BSYN, which originally was applied to a population dynamics model for bowhead whales, is generally applicable to deterministic models. Extension to two or more linked models is discussed. A simple worked example is included in an appendix.
ERIC Educational Resources Information Center
Wu, Haiyan
2013-01-01
General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
Assessing global vegetation activity using spatio-temporal Bayesian modelling
NASA Astrophysics Data System (ADS)
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneous in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be contributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered being relatively homogeneous in terms of large-scale climate and biomes, still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompasses the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics, at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support
Theory-based Bayesian models of inductive learning and reasoning.
Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles
2006-07-01
Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.
Predicting brain activity using a Bayesian spatial model.
Derado, Gordana; Bowman, F Dubois; Zhang, Lijun
2013-08-01
Increasing the clinical applicability of functional neuroimaging technology is an emerging objective, e.g. for diagnostic and treatment purposes. We propose a novel Bayesian spatial hierarchical framework for predicting follow-up neural activity based on an individual's baseline functional neuroimaging data. Our approach attempts to overcome some shortcomings of the modeling methods used in other neuroimaging settings, by borrowing strength from the spatial correlations present in the data. Our proposed methodology is applicable to data from various imaging modalities including functional magnetic resonance imaging and positron emission tomography, and we provide an illustration here using positron emission tomography data from a study of Alzheimer's disease to predict disease progression.
Approximate Bayesian computation for forward modeling in cosmology
Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar E-mail: alexandre.refregier@phys.ethz.ch E-mail: sebastian.seehars@phys.ethz.ch
2015-08-01
Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
Multiple testing on standardized mortality ratios: a Bayesian hierarchical model for FDR estimation.
Ventrucci, Massimo; Scott, E Marian; Cocchi, Daniela
2011-01-01
The analysis of large data sets of standardized mortality ratios (SMRs), obtained by collecting observed and expected disease counts in a map of contiguous regions, is a first step in descriptive epidemiology to detect potential environmental risk factors. A common situation arises when counts are collected in small areas, that is, where the expected count is very low, and disease risks underlying the map are spatially correlated. Traditional p-value-based methods, which control the false discovery rate (FDR) by means of Poisson p-values, might achieve small sensitivity in identifying risk in small areas. This problem is the focus of the present work, where a Bayesian approach which performs a test to evaluate the null hypothesis of no risk over each SMR and controls the posterior FDR is proposed. A Bayesian hierarchical model including spatial random effects to allow for extra-Poisson variability is implemented providing estimates of the posterior probabilities that the null hypothesis of absence of risk is true. By means of such posterior probabilities, an estimate of the posterior FDR conditional on the data can be computed. A conservative estimation is needed to achieve the control which is checked by simulation. The availability of this estimate allows the practitioner to determine nonarbitrary FDR-based selection rules to identify high-risk areas according to a preset FDR level. Sensitivity and specificity of FDR-based rules are studied via simulation and a comparison with p-value-based rules is also shown. A real data set is analyzed using rules based on several FDR levels.
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
Kazembe, Lawrence N.
2013-01-01
Despite remarkable gains in life expectancy and declining mortality in the 21st century, in many places mostly in developing countries, adult mortality has increased in part due to HIV/AIDS or continued abject poverty levels. Moreover many factors including behavioural, socio-economic and demographic variables work simultaneously to impact on risk of mortality. Understanding risk factors of adult mortality is crucial towards designing appropriate public health interventions. In this paper we proposed a structured additive two-part random effects regression model for adult mortality data. Our proposal assumed two processes: (i) whether death occurred in the household (prevalence part), and (ii) number of reported deaths, if death did occur (severity part). The proposed model specification therefore consisted of two generalized linear mixed models (GLMM) with correlated random effects that permitted structured and unstructured spatial components at regional level. Specifically, the first part assumed a GLMM with a logistic link and the second part explored a count model following either a Poisson or negative binomial distribution. The model was used to analyse adult mortality data of 25,793 individuals from the 2006/2007 Namibian DHS data. Inference is based on the Bayesian framework with appropriate priors discussed. PMID:24066052
Kazembe, Lawrence N
2013-01-01
Despite remarkable gains in life expectancy and declining mortality in the 21st century, in many places mostly in developing countries, adult mortality has increased in part due to HIV/AIDS or continued abject poverty levels. Moreover many factors including behavioural, socio-economic and demographic variables work simultaneously to impact on risk of mortality. Understanding risk factors of adult mortality is crucial towards designing appropriate public health interventions. In this paper we proposed a structured additive two-part random effects regression model for adult mortality data. Our proposal assumed two processes: (i) whether death occurred in the household (prevalence part), and (ii) number of reported deaths, if death did occur (severity part). The proposed model specification therefore consisted of two generalized linear mixed models (GLMM) with correlated random effects that permitted structured and unstructured spatial components at regional level. Specifically, the first part assumed a GLMM with a logistic link and the second part explored a count model following either a Poisson or negative binomial distribution. The model was used to analyse adult mortality data of 25,793 individuals from the 2006/2007 Namibian DHS data. Inference is based on the Bayesian framework with appropriate priors discussed.
Efficient multilevel brain tumor segmentation with integrated bayesian model classification.
Corso, J J; Sharon, E; Dube, S; El-Saden, S; Sinha, U; Yuille, A
2008-05-01
We present a new method for automatic segmentation of heterogeneous image data that takes a step toward bridging the gap between bottom-up affinity-based segmentation methods and top-down generative model based approaches. The main contribution of the paper is a Bayesian formulation for incorporating soft model assignments into the calculation of affinities, which are conventionally model free. We integrate the resulting model-aware affinities into the multilevel segmentation by weighted aggregation algorithm, and apply the technique to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes. The computationally efficient method runs orders of magnitude faster than current state-of-the-art techniques giving comparable or improved results. Our quantitative results indicate the benefit of incorporating model-aware affinities into the segmentation process for the difficult case of glioblastoma multiforme brain tumor.
Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.
Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale
2016-08-01
Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty.
Bayesian inferences for beta semiparametric-mixed models to analyze longitudinal neuroimaging data.
Wang, Xiao-Feng; Li, Yingxing
2014-07-01
Diffusion tensor imaging (DTI) is a quantitative magnetic resonance imaging technique that measures the three-dimensional diffusion of water molecules within tissue through the application of multiple diffusion gradients. This technique is rapidly increasing in popularity for studying white matter properties and structural connectivity in the living human brain. One of the major outcomes derived from the DTI process is known as fractional anisotropy, a continuous measure restricted on the interval (0,1). Motivated from a longitudinal DTI study of multiple sclerosis, we use a beta semiparametric-mixed regression model for the neuroimaging data. This work extends the generalized additive model methodology with beta distribution family and random effects. We describe two estimation methods with penalized splines, which are formalized under a Bayesian inferential perspective. The first one is carried out by Markov chain Monte Carlo (MCMC) simulations while the second one uses a relatively new technique called integrated nested Laplace approximation (INLA). Simulations and the neuroimaging data analysis show that the estimates obtained from both approaches are stable and similar, while the INLA method provides an efficient alternative to the computationally expensive MCMC method.
Bayesian Learning of a Language Model from Continuous Speech
NASA Astrophysics Data System (ADS)
Neubig, Graham; Mimura, Masato; Mori, Shinsuke; Kawahara, Tatsuya
We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexity (such as the size of the learned vocabulary) and expressive power, and provide a principled learning algorithm through the use of Gibbs sampling. Implementation is performed using weighted finite state transducers (WFSTs), which allow for the simple handling of lattice input. Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates. The proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
Enhancing debris flow modeling parameters integrating Bayesian networks
NASA Astrophysics Data System (ADS)
Graf, C.; Stoffel, M.; Grêt-Regamey, A.
2009-04-01
Applied debris-flow modeling requires suitably constraint input parameter sets. Depending on the used model, there is a series of parameters to define before running the model. Normally, the data base describing the event, the initiation conditions, the flow behavior, the deposition process and mainly the potential range of possible debris flow events in a certain torrent is limited. There are only some scarce places in the world, where we fortunately can find valuable data sets describing event history of debris flow channels delivering information on spatial and temporal distribution of former flow paths and deposition zones. Tree-ring records in combination with detailed geomorphic mapping for instance provide such data sets over a long time span. Considering the significant loss potential associated with debris-flow disasters, it is crucial that decisions made in regard to hazard mitigation are based on a consistent assessment of the risks. This in turn necessitates a proper assessment of the uncertainties involved in the modeling of the debris-flow frequencies and intensities, the possible run out extent, as well as the estimations of the damage potential. In this study, we link a Bayesian network to a Geographic Information System in order to assess debris-flow risk. We identify the major sources of uncertainty and show the potential of Bayesian inference techniques to improve the debris-flow model. We model the flow paths and deposition zones of a highly active debris-flow channel in the Swiss Alps using the numerical 2-D model RAMMS. Because uncertainties in run-out areas cause large changes in risk estimations, we use the data of flow path and deposition zone information of reconstructed debris-flow events derived from dendrogeomorphological analysis covering more than 400 years to update the input parameters of the RAMMS model. The probabilistic model, which consistently incorporates this available information, can serve as a basis for spatial risk
Markov chain Monte Carlo simulation for Bayesian Hidden Markov Models
NASA Astrophysics Data System (ADS)
Chan, Lay Guat; Ibrahim, Adriana Irawati Nur Binti
2016-10-01
A hidden Markov model (HMM) is a mixture model which has a Markov chain with finite states as its mixing distribution. HMMs have been applied to a variety of fields, such as speech and face recognitions. The main purpose of this study is to investigate the Bayesian approach to HMMs. Using this approach, we can simulate from the parameters' posterior distribution using some Markov chain Monte Carlo (MCMC) sampling methods. HMMs seem to be useful, but there are some limitations. Therefore, by using the Mixture of Dirichlet processes Hidden Markov Model (MDPHMM) based on Yau et. al (2011), we hope to overcome these limitations. We shall conduct a simulation study using MCMC methods to investigate the performance of this model.
NASA Astrophysics Data System (ADS)
Mendes, B. S.; Draper, D.
2008-12-01
The issue of model uncertainty and model choice is central in any groundwater modeling effort [Neuman and Wierenga, 2003]; among the several approaches to the problem we favour using Bayesian statistics because it is a method that integrates in a natural way uncertainties (arising from any source) and experimental data. In this work, we experiment with several Bayesian approaches to model choice, focusing primarily on demonstrating the usefulness of the Reversible Jump Markov Chain Monte Carlo (RJMCMC) simulation method [Green, 1995]; this is an extension of the now- common MCMC methods. Standard MCMC techniques approximate posterior distributions for quantities of interest, often by creating a random walk in parameter space; RJMCMC allows the random walk to take place between parameter spaces with different dimensionalities. This fact allows us to explore state spaces that are associated with different deterministic models for experimental data. Our work is exploratory in nature; we restrict our study to comparing two simple transport models applied to a data set gathered to estimate the breakthrough curve for a tracer compound in groundwater. One model has a mean surface based on a simple advection dispersion differential equation; the second model's mean surface is also governed by a differential equation but in two dimensions. We focus on artificial data sets (in which truth is known) to see if model identification is done correctly, but we also address the issues of over and under-paramerization, and we compare RJMCMC's performance with other traditional methods for model selection and propagation of model uncertainty, including Bayesian model averaging, BIC and DIC.References Neuman and Wierenga (2003). A Comprehensive Strategy of Hydrogeologic Modeling and Uncertainty Analysis for Nuclear Facilities and Sites. NUREG/CR-6805, Division of Systems Analysis and Regulatory Effectiveness Office of Nuclear Regulatory Research, U. S. Nuclear Regulatory Commission
Bayesian Dose-Response Modeling in Sparse Data
NASA Astrophysics Data System (ADS)
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, We discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model
Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.
2014-01-01
Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
Semiparametric Bayesian local functional models for diffusion tensor tract statistics☆
Hua, Zhaowei; Dunson, David B.; Gilmore, John H.; Styner, Martin A.; Zhu, Hongtu
2012-01-01
We propose a semiparametric Bayesian local functional model (BFM) for the analysis of multiple diffusion properties (e.g., fractional anisotropy) along white matter fiber bundles with a set of covariates of interest, such as age and gender. BFM accounts for heterogeneity in the shape of the fiber bundle diffusion properties among subjects, while allowing the impact of the covariates to vary across subjects. A nonparametric Bayesian LPP2 prior facilitates global and local borrowings of information among subjects, while an infinite factor model flexibly represents low-dimensional structure. Local hypothesis testing and credible bands are developed to identify fiber segments, along which multiple diffusion properties are significantly associated with covariates of interest, while controlling for multiple comparisons. Moreover, BFM naturally group subjects into more homogeneous clusters. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFM. We apply BFM to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment in new born infants. PMID:22732565
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Advanced REACH Tool: A Bayesian Model for Occupational Exposure Assessment
McNally, Kevin; Warren, Nicholas; Fransman, Wouter; Entink, Rinke Klein; Schinkel, Jody; van Tongeren, Martie; Cherrie, John W.; Kromhout, Hans; Schneider, Thomas; Tielemans, Erik
2014-01-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate sources of information within a Bayesian statistical framework. The information is obtained from expert knowledge expressed in a calibrated mechanistic model of exposure assessment, data on inter- and intra-individual variability in exposures from the literature, and context-specific exposure measurements. The ART provides central estimates and credible intervals for different percentiles of the exposure distribution, for full-shift and long-term average exposures. The ART can produce exposure estimates in the absence of measurements, but the precision of the estimates improves as more data become available. The methodology presented in this paper is able to utilize partially analogous data, a novel approach designed to make efficient use of a sparsely populated measurement database although some additional research is still required before practical implementation. The methodology is demonstrated using two worked examples: an exposure to copper pyrithione in the spraying of antifouling paints and an exposure to ethyl acetate in shoe repair. PMID:24665110
Bayesian Energy Landscape Tilting: Towards Concordant Models of Molecular Ensembles
Beauchamp, Kyle A.; Pande, Vijay S.; Das, Rhiju
2014-01-01
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and 3J measurements gives convergent values of the peptide’s α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT’s principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. PMID:24655513
Modeling the Climatology of Tornado Occurrence with Bayesian Inference
NASA Astrophysics Data System (ADS)
Cheng, Vincent Y. S.
Our mechanistic understanding of tornadic environments has significantly improved by the recent technological enhancements in the detection of tornadoes as well as the advances of numerical weather predictive modeling. Nonetheless, despite the decades of active research, prediction of tornado occurrence remains one of the most difficult problems in meteorological and climate science. In our efforts to develop predictive tools for tornado occurrence, there are a number of issues to overcome, such as the treatment of inconsistent tornado records, the consideration of suitable combination of atmospheric predictors, and the selection of appropriate resolution to accommodate the variability in time and space. In this dissertation, I address each of these topics by undertaking three empirical (statistical) modeling studies, where I examine the signature of different atmospheric factors influencing the tornado occurrence, the sampling biases in tornado observations, and the optimal spatiotemporal resolution for studying tornado occurrence. In the first study, I develop a novel Bayesian statistical framework to assess the probability of tornado occurrence in Canada, in which the sampling bias of tornado observations and the linkage between lightning climatology and tornadogenesis are considered. The results produced reasonable probability estimates of tornado occurrence for the under-sampled areas in the model domain. The same study also delineated the geographical variability in the lightning-tornado relationship across Canada. In the second study, I present a novel modeling framework to examine the relative importance of several key atmospheric variables (e.g., convective available potential energy, 0-3 km storm-relative helicity, 0-6 km bulk wind difference, 0-tropopause vertical wind shear) on tornado activity in North America. I found that the variable quantifying the updraft strength is more important during the warm season, whereas the effects of wind
Tests of Bayesian model selection techniques for gravitational wave astronomy
Cornish, Neil J.; Littenberg, Tyson B.
2007-10-15
The analysis of gravitational wave data involves many model selection problems. The most important example is the detection problem of selecting between the data being consistent with instrument noise alone, or instrument noise and a gravitational wave signal. The analysis of data from ground based gravitational wave detectors is mostly conducted using classical statistics, and methods such as the Neyman-Peterson criteria are used for model selection. Future space based detectors, such as the Laser Interferometer Space Antenna (LISA), are expected to produce rich data streams containing the signals from many millions of sources. Determining the number of sources that are resolvable, and the most appropriate description of each source poses a challenging model selection problem that may best be addressed in a Bayesian framework. An important class of LISA sources are the millions of low-mass binary systems within our own galaxy, tens of thousands of which will be detectable. Not only are the number of sources unknown, but so are the number of parameters required to model the waveforms. For example, a significant subset of the resolvable galactic binaries will exhibit orbital frequency evolution, while a smaller number will have measurable eccentricity. In the Bayesian approach to model selection one needs to compute the Bayes factor between competing models. Here we explore various methods for computing Bayes factors in the context of determining which galactic binaries have measurable frequency evolution. The methods explored include a reverse jump Markov chain Monte Carlo algorithm, Savage-Dickie density ratios, the Schwarz-Bayes information criterion, and the Laplace approximation to the model evidence. We find good agreement between all of the approaches.
Development of a Bayesian Belief Network Runway Incursion Model
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2014-01-01
In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
Abstract—It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planing, due to their unique analytical capabilities. The DBN model accuracy and flexibility of use is demonstrated by testing the model under different operational scenarios.
Bayesian Inference for Duplication–Mutation with Complementarity Network Models
Persing, Adam; Beskos, Alexandros; Heine, Kari; De Iorio, Maria
2015-01-01
Abstract We observe an undirected graph G without multiple edges and self-loops, which is to represent a protein–protein interaction (PPI) network. We assume that G evolved under the duplication–mutation with complementarity (DMC) model from a seed graph, G0, and we also observe the binary forest Γ that represents the duplication history of G. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis–Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate a high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters. PMID:26355682
Advances in Bayesian Model Based Clustering Using Particle Learning
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837
ERIC Educational Resources Information Center
Levy, Roy
2014-01-01
Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…
Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management
A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Nonparametric Bayesian inference of the microcanonical stochastic block model
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
2017-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
A Semiparametric Bayesian Model for Detecting Synchrony Among Multiple Neurons
Shahbaba, Babak; Zhou, Bo; Lan, Shiwei; Ombao, Hernando; Moorman, David; Behseta, Sam
2015-01-01
We propose a scalable semiparametric Bayesian model to capture dependencies among multiple neurons by detecting their co-firing (possibly with some lag time) patterns over time. After discretizing time so there is at most one spike at each interval, the resulting sequence of 1’s (spike) and 0’s (silence) for each neuron is modeled using the logistic function of a continuous latent variable with a Gaussian process prior. For multiple neurons, the corresponding marginal distributions are coupled to their joint probability distribution using a parametric copula model. The advantages of our approach are as follows: the nonparametric component (i.e., the Gaussian process model) provides a flexible framework for modeling the underlying firing rates; the parametric component (i.e., the copula model) allows us to make inference regarding both contemporaneous and lagged relationships among neurons; using the copula model, we construct multivariate probabilistic models by separating the modeling of univariate marginal distributions from the modeling of dependence structure among variables; our method is easy to implement using a computationally efficient sampling algorithm that can be easily extended to high dimensional problems. Using simulated data, we show that our approach could correctly capture temporal dependencies in firing rates and identify synchronous neurons. We also apply our model to spike train data obtained from prefrontal cortical areas. PMID:24922500
A Bayesian Measurment Error Model for Misaligned Radiographic Data
Lennox, Kristin P.; Glascoe, Lee G.
2013-09-06
An understanding of the inherent variability in micro-computed tomography (micro-CT) data is essential to tasks such as statistical process control and the validation of radiographic simulation tools. The data present unique challenges to variability analysis due to the relatively low resolution of radiographs, and also due to minor variations from run to run which can result in misalignment or magnification changes between repeated measurements of a sample. Positioning changes artificially inflate the variability of the data in ways that mask true physical phenomena. We present a novel Bayesian nonparametric regression model that incorporates both additive and multiplicative measurement error in addition to heteroscedasticity to address this problem. We also use this model to assess the effects of sample thickness and sample position on measurement variability for an aluminum specimen. Supplementary materials for this article are available online.
Modeling the user preference on broadcasting contents using Bayesian networks
NASA Astrophysics Data System (ADS)
Kang, Sanggil; Lim, Jeongyeon; Kim, Munchurl
2004-01-01
In this paper, we introduce a new supervised learning method of a Bayesian network for user preference models. Unlike other preference models, our method traces the trend of a user preference as time passes. It allows us to do online learning so we do not need the exhaustive data collection. The tracing of the trend can be done by modifying the frequency of attributes in order to force the old preference to be correlated with the current preference under the assumption that the current preference is correlated with the near future preference. The objective of our learning method is to force the mutual information to be reinforced by modifying the frequency of the attributes in the old preference by providing weights to the attributes. With developing mathematical derivation of our learning method, experimental results on the learning and reasoning performance on TV genre preference using a real set of TV program watching history data.
A Bayesian model of context-sensitive value attribution.
Rigoli, Francesco; Friston, Karl J; Martinelli, Cristina; Selaković, Mirjana; Shergill, Sukhwinder S; Dolan, Raymond J
2016-06-22
Substantial evidence indicates that incentive value depends on an anticipation of rewards within a given context. However, the computations underlying this context sensitivity remain unknown. To address this question, we introduce a normative (Bayesian) account of how rewards map to incentive values. This assumes that the brain inverts a model of how rewards are generated. Key features of our account include (i) an influence of prior beliefs about the context in which rewards are delivered (weighted by their reliability in a Bayes-optimal fashion), (ii) the notion that incentive values correspond to precision-weighted prediction errors, (iii) and contextual information unfolding at different hierarchical levels. This formulation implies that incentive value is intrinsically context-dependent. We provide empirical support for this model by showing that incentive value is influenced by context variability and by hierarchically nested contexts. The perspective we introduce generates new empirical predictions that might help explaining psychopathologies, such as addiction.
A Bayesian Measurment Error Model for Misaligned Radiographic Data
Lennox, Kristin P.; Glascoe, Lee G.
2013-09-06
An understanding of the inherent variability in micro-computed tomography (micro-CT) data is essential to tasks such as statistical process control and the validation of radiographic simulation tools. The data present unique challenges to variability analysis due to the relatively low resolution of radiographs, and also due to minor variations from run to run which can result in misalignment or magnification changes between repeated measurements of a sample. Positioning changes artificially inflate the variability of the data in ways that mask true physical phenomena. We present a novel Bayesian nonparametric regression model that incorporates both additive and multiplicative measurement error inmore » addition to heteroscedasticity to address this problem. We also use this model to assess the effects of sample thickness and sample position on measurement variability for an aluminum specimen. Supplementary materials for this article are available online.« less
Scheuerell, Mark D; Buhle, Eric R; Semmens, Brice X; Ford, Michael J; Cooney, Tom; Carmichael, Richard W
2015-01-01
Myriad human activities increasingly threaten the existence of many species. A variety of conservation interventions such as habitat restoration, protected areas, and captive breeding have been used to prevent extinctions. Evaluating the effectiveness of these interventions requires appropriate statistical methods, given the quantity and quality of available data. Historically, analysis of variance has been used with some form of predetermined before-after control-impact design to estimate the effects of large-scale experiments or conservation interventions. However, ad hoc retrospective study designs or the presence of random effects at multiple scales may preclude the use of these tools. We evaluated the effects of a large-scale supplementation program on the density of adult Chinook salmon Oncorhynchus tshawytscha from the Snake River basin in the northwestern United States currently listed under the U.S. Endangered Species Act. We analyzed 43 years of data from 22 populations, accounting for random effects across time and space using a form of Bayesian hierarchical time-series model common in analyses of financial markets. We found that varying degrees of supplementation over a period of 25 years increased the density of natural-origin adults, on average, by 0–8% relative to nonsupplementation years. Thirty-nine of the 43 year effects were at least two times larger in magnitude than the mean supplementation effect, suggesting common environmental variables play a more important role in driving interannual variability in adult density. Additional residual variation in density varied considerably across the region, but there was no systematic difference between supplemented and reference populations. Our results demonstrate the power of hierarchical Bayesian models to detect the diffuse effects of management interventions and to quantitatively describe the variability of intervention success. Nevertheless, our study could not address whether ecological
Yu, Rongjie; Abdel-Aty, Mohamed
2014-01-01
Severe crashes are causing serious social and economic loss, and because of this, reducing crash injury severity has become one of the key objectives of the high speed facilities' (freeway and expressway) management. Traditional crash injury severity analysis utilized data mainly from crash reports concerning the crash occurrence information, drivers' characteristics and roadway geometric related variables. In this study, real-time traffic and weather data were introduced to analyze the crash injury severity. The space mean speeds captured by the Automatic Vehicle Identification (AVI) system on the two roadways were used as explanatory variables in this study; and data from a mountainous freeway (I-70 in Colorado) and an urban expressway (State Road 408 in Orlando) have been used to identify the analysis result's consistence. Binary probit (BP) models were estimated to classify the non-severe (property damage only) crashes and severe (injury and fatality) crashes. Firstly, Bayesian BP models' results were compared to the results from Maximum Likelihood Estimation BP models and it was concluded that Bayesian inference was superior with more significant variables. Then different levels of hierarchical Bayesian BP models were developed with random effects accounting for the unobserved heterogeneity at segment level and crash individual level, respectively. Modeling results from both studied locations demonstrate that large variations of speed prior to the crash occurrence would increase the likelihood of severe crash occurrence. Moreover, with considering unobserved heterogeneity in the Bayesian BP models, the model goodness-of-fit has improved substantially. Finally, possible future applications of the model results and the hierarchical Bayesian probit models were discussed.
Diagnosing Hybrid Systems: a Bayesian Model Selection Approach
NASA Technical Reports Server (NTRS)
McIlraith, Sheila A.
2005-01-01
In this paper we examine the problem of monitoring and diagnosing noisy complex dynamical systems that are modeled as hybrid systems-models of continuous behavior, interleaved by discrete transitions. In particular, we examine continuous systems with embedded supervisory controllers that experience abrupt, partial or full failure of component devices. Building on our previous work in this area (MBCG99;MBCG00), our specific focus in this paper ins on the mathematical formulation of the hybrid monitoring and diagnosis task as a Bayesian model tracking algorithm. The nonlinear dynamics of many hybrid systems present challenges to probabilistic tracking. Further, probabilistic tracking of a system for the purposes of diagnosis is problematic because the models of the system corresponding to failure modes are numerous and generally very unlikely. To focus tracking on these unlikely models and to reduce the number of potential models under consideration, we exploit logic-based techniques for qualitative model-based diagnosis to conjecture a limited initial set of consistent candidate models. In this paper we discuss alternative tracking techniques that are relevant to different classes of hybrid systems, focusing specifically on a method for tracking multiple models of nonlinear behavior simultaneously using factored sampling and conditional density propagation. To illustrate and motivate the approach described in this paper we examine the problem of monitoring and diganosing NASA's Sprint AERCam, a small spherical robotic camera unit with 12 thrusters that enable both linear and rotational motion.
A Bayesian hierarchical model for categorical data with nonignorable nonresponse.
Green, Paul E; Park, Taesung
2003-12-01
Log-linear models have been shown to be useful for smoothing contingency tables when categorical outcomes are subject to nonignorable nonresponse. A log-linear model can be fit to an augmented data table that includes an indicator variable designating whether subjects are respondents or nonrespondents. Maximum likelihood estimates calculated from the augmented data table are known to suffer from instability due to boundary solutions. Park and Brown (1994, Journal of the American Statistical Association 89, 44-52) and Park (1998, Biometrics 54, 1579-1590) developed empirical Bayes models that tend to smooth estimates away from the boundary. In those approaches, estimates for nonrespondents were calculated using an EM algorithm by maximizing a posterior distribution. As an extension of their earlier work, we develop a Bayesian hierarchical model that incorporates a log-linear model in the prior specification. In addition, due to uncertainty in the variable selection process associated with just one log-linear model, we simultaneously consider a finite number of models using a stochastic search variable selection (SSVS) procedure due to George and McCulloch (1997, Statistica Sinica 7, 339-373). The integration of the SSVS procedure into a Markov chain Monte Carlo (MCMC) sampler is straightforward, and leads to estimates of cell frequencies for the nonrespondents that are averages resulting from several log-linear models. The methods are demonstrated with a data example involving serum creatinine levels of patients who survived renal transplants. A simulation study is conducted to investigate properties of the model.
Bayesian network models for error detection in radiotherapy plans.
Kalet, Alan M; Gennari, John H; Ford, Eric C; Phillips, Mark H
2015-04-07
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network's conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
Bayesian network models for error detection in radiotherapy plans
NASA Astrophysics Data System (ADS)
Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
A Bayesian Attractor Model for Perceptual Decision Making
Bitzer, Sebastian; Bruineberg, Jelle; Kiebel, Stefan J.
2015-01-01
Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks. PMID:26267143
Bayesian analysis of a reduced-form air quality model.
Foley, Kristen M; Reich, Brian J; Napelenok, Sergey L
2012-07-17
Numerical air quality models are being used for assessing emission control strategies for improving ambient pollution levels across the globe. This paper applies probabilistic modeling to evaluate the effectiveness of emission reduction scenarios aimed at lowering ground-level ozone concentrations. A Bayesian hierarchical model is used to combine air quality model output and monitoring data in order to characterize the impact of emissions reductions while accounting for different degrees of uncertainty in the modeled emissions inputs. The probabilistic model predictions are weighted based on population density in order to better quantify the societal benefits/disbenefits of four hypothetical emission reduction scenarios in which domain-wide NO(x) emissions from various sectors are reduced individually and then simultaneously. Cross validation analysis shows the statistical model performs well compared to observed ozone levels. Accounting for the variability and uncertainty in the emissions and atmospheric systems being modeled is shown to impact how emission reduction scenarios would be ranked, compared to standard methodology.
Cernicchiaro, N; Renter, D G; Xiang, S; White, B J; Bello, N M
2013-06-01
Variability in ADG of feedlot cattle can affect profits, thus making overall returns more unstable. Hence, knowledge of the factors that contribute to heterogeneity of variances in animal performance can help feedlot managers evaluate risks and minimize profit volatility when making managerial and economic decisions in commercial feedlots. The objectives of the present study were to evaluate heteroskedasticity, defined as heterogeneity of variances, in ADG of cohorts of commercial feedlot cattle, and to identify cattle demographic factors at feedlot arrival as potential sources of variance heterogeneity, accounting for cohort- and feedlot-level information in the data structure. An operational dataset compiled from 24,050 cohorts from 25 U. S. commercial feedlots in 2005 and 2006 was used for this study. Inference was based on a hierarchical Bayesian model implemented with Markov chain Monte Carlo, whereby cohorts were modeled at the residual level and feedlot-year clusters were modeled as random effects. Forward model selection based on deviance information criteria was used to screen potentially important explanatory variables for heteroskedasticity at cohort- and feedlot-year levels. The Bayesian modeling framework was preferred as it naturally accommodates the inherently hierarchical structure of feedlot data whereby cohorts are nested within feedlot-year clusters. Evidence for heterogeneity of variance components of ADG was substantial and primarily concentrated at the cohort level. Feedlot-year specific effects were, by far, the greatest contributors to ADG heteroskedasticity among cohorts, with an estimated ∼12-fold change in dispersion between most and least extreme feedlot-year clusters. In addition, identifiable demographic factors associated with greater heterogeneity of cohort-level variance included smaller cohort sizes, fewer days on feed, and greater arrival BW, as well as feedlot arrival during summer months. These results support that
A Bayesian modelling framework for tornado occurrences in North America
NASA Astrophysics Data System (ADS)
Cheng, Vincent Y. S.; Arhonditsis, George B.; Sills, David M. L.; Gough, William A.; Auld, Heather
2015-03-01
Tornadoes represent one of nature’s most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.
Designing and testing inflationary models with Bayesian networks
Price, Layne C.; Peiris, Hiranya V.; Frazer, Jonathan; Easther, Richard E-mail: h.peiris@ucl.ac.uk E-mail: r.easther@auckland.ac.nz
2016-02-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N{sub f}-quadratic inflation as an illustrative example, finding that the number of e-folds N{sub *} between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
A Bayesian modelling framework for tornado occurrences in North America.
Cheng, Vincent Y S; Arhonditsis, George B; Sills, David M L; Gough, William A; Auld, Heather
2015-03-25
Tornadoes represent one of nature's most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.
Reginal Frequency Analysis Based on Scaling Properties and Bayesian Models
NASA Astrophysics Data System (ADS)
Kwon, Hyun-Han; Lee, Jeong-Ju; Moon, Young-Il
2010-05-01
A regional frequency analysis based on Hierarchical Bayesian Network (HBN) and scaling theory was developmed. Many recording rain gauges over South Korea were used for the analysis. First, a scaling approach combined with extreme distribution was employed to derive regional formula for frequency analysis. Second, HBN model was used to represent additional information about the regional structure of the scaling parameters, especially the location parameter and shape parameter. The location and shape parameters of the extreme distribution were estimated by utilizing scaling properties in a regression framework, and the scaling parameters linking the parameters (location and shape) to various duration times were simultaneously estimated. It was found that the regional frequency analysis combined with HBN and scaling properties show promising results in terms of establishing regional IDF curves.
Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...
Toward diagnostic model calibration and evaluation: Approximate Bayesian computation
NASA Astrophysics Data System (ADS)
Vrugt, Jasper A.; Sadegh, Mojtaba
2013-07-01
The ever increasing pace of computational power, along with continued advances in measurement technologies and improvements in process understanding has stimulated the development of increasingly complex hydrologic models that simulate soil moisture flow, groundwater recharge, surface runoff, root water uptake, and river discharge at different spatial and temporal scales. Reconciling these high-order system models with perpetually larger volumes of field data is becoming more and more difficult, particularly because classical likelihood-based fitting methods lack the power to detect and pinpoint deficiencies in the model structure. Gupta et al. (2008) has recently proposed steps (amongst others) toward the development of a more robust and powerful method of model evaluation. Their diagnostic approach uses signature behaviors and patterns observed in the input-output data to illuminate to what degree a representation of the real world has been adequately achieved and how the model should be improved for the purpose of learning and scientific discovery. In this paper, we introduce approximate Bayesian computation (ABC) as a vehicle for diagnostic model evaluation. This statistical methodology relaxes the need for an explicit likelihood function in favor of one or multiple different summary statistics rooted in hydrologic theory that together have a clearer and more compelling diagnostic power than some average measure of the size of the error residuals. Two illustrative case studies are used to demonstrate that ABC is relatively easy to implement, and readily employs signature based indices to analyze and pinpoint which part of the model is malfunctioning and in need of further improvement.
Bridging groundwater models and decision support with a Bayesian network
Fienen, Michael N.; Masterson, John P.; Plant, Nathaniel G.; Gutierrez, Benjamin T.; Thieler, E. Robert
2013-01-01
Resource managers need to make decisions to plan for future environmental conditions, particularly sea level rise, in the face of substantial uncertainty. Many interacting processes factor in to the decisions they face. Advances in process models and the quantification of uncertainty have made models a valuable tool for this purpose. Long-simulation runtimes and, often, numerical instability make linking process models impractical in many cases. A method for emulating the important connections between model input and forecasts, while propagating uncertainty, has the potential to provide a bridge between complicated numerical process models and the efficiency and stability needed for decision making. We explore this using a Bayesian network (BN) to emulate a groundwater flow model. We expand on previous approaches to validating a BN by calculating forecasting skill using cross validation of a groundwater model of Assateague Island in Virginia and Maryland, USA. This BN emulation was shown to capture the important groundwater-flow characteristics and uncertainty of the groundwater system because of its connection to island morphology and sea level. Forecast power metrics associated with the validation of multiple alternative BN designs guided the selection of an optimal level of BN complexity. Assateague island is an ideal test case for exploring a forecasting tool based on current conditions because the unique hydrogeomorphological variability of the island includes a range of settings indicative of past, current, and future conditions. The resulting BN is a valuable tool for exploring the response of groundwater conditions to sea level rise in decision support.
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
ERIC Educational Resources Information Center
West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.
2010-01-01
A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Tang, Nian-Sheng
2007-01-01
The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates. This approach produces a Bayesian estimate that has the same statistical optimal properties as a…
Bayesian Multiscale Modeling of Closed Curves in Point Clouds.
Gu, Kelvin; Pati, Debdeep; Dunson, David B
2014-10-01
Modeling object boundaries based on image or point cloud data is frequently necessary in medical and scientific applications ranging from detecting tumor contours for targeted radiation therapy, to the classification of organisms based on their structural information. In low-contrast images or sparse and noisy point clouds, there is often insufficient data to recover local segments of the boundary in isolation. Thus, it becomes critical to model the entire boundary in the form of a closed curve. To achieve this, we develop a Bayesian hierarchical model that expresses highly diverse 2D objects in the form of closed curves. The model is based on a novel multiscale deformation process. By relating multiple objects through a hierarchical formulation, we can successfully recover missing boundaries by borrowing structural information from similar objects at the appropriate scale. Furthermore, the model's latent parameters help interpret the population, indicating dimensions of significant structural variability and also specifying a 'central curve' that summarizes the collection. Theoretical properties of our prior are studied in specific cases and efficient Markov chain Monte Carlo methods are developed, evaluated through simulation examples and applied to panorex teeth images for modeling teeth contours and also to a brain tumor contour detection problem.
A Bayesian hierarchical model for wind gust prediction
NASA Astrophysics Data System (ADS)
Friederichs, Petra; Oesting, Marco; Schlather, Martin
2014-05-01
A postprocessing method for ensemble wind gust forecasts given by a mesoscale limited area numerical weather prediction (NWP) model is presented, which is based on extreme value theory. A process layer for the parameters of a generalized extreme value distribution (GEV) is introduced using a Bayesian hierarchical model (BHM). Incorporating the information of the COMSO-DE forecasts, the process parameters model the spatial response surfaces of the GEV parameters as Gaussian random fields. The spatial BHM provides area wide forecasts of wind gusts in terms of a conditional GEV. It models the marginal distribution of the spatial gust process and provides not only forecasts of the conditional GEV at locations without observations, but also uncertainty information about the estimates. A disadvantages of BHM model is that it assumes conditional independent observations. In order to incorporate the dependence between gusts at neighboring locations as well as the spatial random fields of observed and forecasted maximal wind gusts, we propose to model them jointly by a bivariate Brown-Resnick process.
Nonlinear regression modeling of nutrient loads in streams: A Bayesian approach
Qian, S.S.; Reckhow, K.H.; Zhai, J.; McMahon, G.
2005-01-01
A Bayesian nonlinear regression modeling method is introduced and compared with the least squares method for modeling nutrient loads in stream networks. The objective of the study is to better model spatial correlation in river basin hydrology and land use for improving the model as a forecasting tool. The Bayesian modeling approach is introduced in three steps, each with a more complicated model and data error structure. The approach is illustrated using a data set from three large river basins in eastern North Carolina. Results indicate that the Bayesian model better accounts for model and data uncertainties than does the conventional least squares approach. Applications of the Bayesian models for ambient water quality standards compliance and TMDL assessment are discussed. Copyright 2005 by the American Geophysical Union.
Tran, Van; Liu, Danping; Pradhan, Anuj K; Li, Kaigang; Bingham, C Raymond; Simons-Morton, Bruce G; Albert, Paul S
2015-01-01
Signalized intersection management is a common measure of risky driving in simulator studies. In a recent randomized trial, investigators were interested in whether teenage males exposed to a risk-accepting passenger took more intersection risks in a driving simulator compared with those exposed to a risk-averse peer passenger. Analyses in this trial are complicated by the longitudinal or repeated measures that are semi-continuous with clumping at zero. Specifically, the dependent variable in a randomized trial looking at the effect of risk-accepting versus risk-averse peer passengers on teenage simulator driving is comprised of two components. The discrete component measures whether the teen driver stops for a yellow light, and the continuous component measures the time the teen driver, who does not stop, spends in the intersection during a red light. To convey both components of this measure, we apply a two-part regression with correlated random effects model (CREM), consisting of a logistic regression to model whether the driver stops for a yellow light and a linear regression to model the time spent in the intersection during a red light. These two components are related through the correlation of their random effects. Using this novel analysis, we found that those exposed to a risk-averse passenger have a higher proportion of stopping at yellow lights and a longer mean time in the intersection during a red light when they did not stop at the light compared to those exposed to a risk-accepting passenger, consistent with the study hypotheses and previous analyses. Examining the statistical properties of the CREM approach through simulations, we found that in most situations, the CREM achieves greater power than competing approaches. We also examined whether the treatment effect changes across the length of the drive and provided a sample size recommendation for detecting such phenomenon in subsequent trials. Our findings suggest that CREM provides an efficient
Bayesian calibration of the Community Land Model using surrogates
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Swiler, Laura Painton
2014-02-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditional on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that surrogate models can be created for CLM in most cases. The posterior distributions are more predictive than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters' distributions significantly. The structural error model reveals a correlation time-scale which can be used to identify the physical process that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Bayesian Calibration of the Community Land Model using Surrogates
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Sargsyan, K.; Swiler, Laura P.
2015-01-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditioned on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that accurate surrogate models can be created for CLM in most cases. The posterior distributions lead to better prediction than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters’ distributions significantly. The structural error model reveals a correlation time-scale which can potentially be used to identify physical processes that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Bayesian Gaussian Mixture Models for High-Density Genotyping Arrays
Sabatti, Chiara; Lange, Kenneth
2011-01-01
Affymetrix's SNP (single-nucleotide polymorphism) genotyping chips have increased the scope and decreased the cost of gene-mapping studies. Because each SNP is queried by multiple DNA probes, the chips present interesting challenges in genotype calling. Traditional clustering methods distinguish the three genotypes of an SNP fairly well given a large enough sample of unrelated individuals or a training sample of known genotypes. This article describes our attempt to improve genotype calling by constructing Gaussian mixture models with empirically derived priors. The priors stabilize parameter estimation and borrow information collectively gathered on tens of thousands of SNPs. When data from related family members are available, our models capture the correlations in signals between relatives. With these advantages in mind, we apply the models to Affymetrix probe intensity data on 10,000 SNPs gathered on 63 genotyped individuals spread over eight pedigrees. We integrate the genotype-calling model with pedigree analysis and examine a sequence of symmetry hypotheses involving the correlated probe signals. The symmetry hypotheses raise novel mathematical issues of parameterization. Using the Bayesian information criterion, we select the best combination of symmetry assumptions. Compared to Affymetrix's software, our model leads to a reduction in no-calls with little sacrifice in overall calling accuracy. PMID:21572926
Bayesian Multiscale Modeling of Closed Curves in Point Clouds
Gu, Kelvin; Pati, Debdeep; Dunson, David B.
2014-01-01
Modeling object boundaries based on image or point cloud data is frequently necessary in medical and scientific applications ranging from detecting tumor contours for targeted radiation therapy, to the classification of organisms based on their structural information. In low-contrast images or sparse and noisy point clouds, there is often insufficient data to recover local segments of the boundary in isolation. Thus, it becomes critical to model the entire boundary in the form of a closed curve. To achieve this, we develop a Bayesian hierarchical model that expresses highly diverse 2D objects in the form of closed curves. The model is based on a novel multiscale deformation process. By relating multiple objects through a hierarchical formulation, we can successfully recover missing boundaries by borrowing structural information from similar objects at the appropriate scale. Furthermore, the model’s latent parameters help interpret the population, indicating dimensions of significant structural variability and also specifying a ‘central curve’ that summarizes the collection. Theoretical properties of our prior are studied in specific cases and efficient Markov chain Monte Carlo methods are developed, evaluated through simulation examples and applied to panorex teeth images for modeling teeth contours and also to a brain tumor contour detection problem. PMID:25544786
Ensemble bayesian model averaging using markov chain Monte Carlo sampling
Vrugt, Jasper A; Diks, Cees G H; Clark, Martyn P
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper (Raftery etal. Mon Weather Rev 133: 1155-1174, 2(05)) has recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
A Bayesian network model for predicting aquatic toxicity mode ...
The mode of toxic action (MoA) has been recognized as a key determinant of chemical toxicity, but development of predictive MoA classification models in aquatic toxicology has been limited. We developed a Bayesian network model to classify aquatic toxicity MoA using a recently published dataset containing over one thousand chemicals with MoA assignments for aquatic animal toxicity. Two dimensional theoretical chemical descriptors were generated for each chemical using the Toxicity Estimation Software Tool. The model was developed through augmented Markov blanket discovery from the dataset of 1098 chemicals with the MoA broad classifications as a target node. From cross validation, the overall precision for the model was 80.2%. The best precision was for the AChEI MoA (93.5%) where 257 chemicals out of 275 were correctly classified. Model precision was poorest for the reactivity MoA (48.5%) where 48 out of 99 reactive chemicals were correctly classified. Narcosis represented the largest class within the MoA dataset and had a precision and reliability of 80.0%, reflecting the global precision across all of the MoAs. False negatives for narcosis most often fell into electron transport inhibition, neurotoxicity or reactivity MoAs. False negatives for all other MoAs were most often narcosis. A probabilistic sensitivity analysis was undertaken for each MoA to examine the sensitivity to individual and multiple descriptor findings. The results show that the Markov blank
A Bayesian Model of Category-Specific Emotional Brain Responses
Wager, Tor D.; Kang, Jian; Johnson, Timothy D.; Nichols, Thomas E.; Satpute, Ajay B.; Barrett, Lisa Feldman
2015-01-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories—fear, anger, disgust, sadness, or happiness—is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
Bayesian Belief Networks Approach for Modeling Irrigation Behavior
NASA Astrophysics Data System (ADS)
Andriyas, S.; McKee, M.
2012-12-01
Canal operators need information to manage water deliveries to irrigators. Short-term irrigation demand forecasts can potentially valuable information for a canal operator who must manage an on-demand system. Such forecasts could be generated by using information about the decision-making processes of irrigators. Bayesian models of irrigation behavior can provide insight into the likely criteria which farmers use to make irrigation decisions. This paper develops a Bayesian belief network (BBN) to learn irrigation decision-making behavior of farmers and utilizes the resulting model to make forecasts of future irrigation decisions based on factor interaction and posterior probabilities. Models for studying irrigation behavior have been rarely explored in the past. The model discussed here was built from a combination of data about biotic, climatic, and edaphic conditions under which observed irrigation decisions were made. The paper includes a case study using data collected from the Canal B region of the Sevier River, near Delta, Utah. Alfalfa, barley and corn are the main crops of the location. The model has been tested with a portion of the data to affirm the model predictive capabilities. Irrigation rules were deduced in the process of learning and verified in the testing phase. It was found that most of the farmers used consistent rules throughout all years and across different types of crops. Soil moisture stress, which indicates the level of water available to the plant in the soil profile, was found to be one of the most significant likely driving forces for irrigation. Irrigations appeared to be triggered by a farmer's perception of soil stress, or by a perception of combined factors such as information about a neighbor irrigating or an apparent preference to irrigate on a weekend. Soil stress resulted in irrigation probabilities of 94.4% for alfalfa. With additional factors like weekend and irrigating when a neighbor irrigates, alfalfa irrigation
Using Bayesian Stable Isotope Mixing Models to Enhance Marine Ecosystem Models
The use of stable isotopes in food web studies has proven to be a valuable tool for ecologists. We investigated the use of Bayesian stable isotope mixing models as constraints for an ecosystem model of a temperate seagrass system on the Atlantic coast of France. δ13C and δ15N i...
Technology Transfer Automated Retrieval System (TEKTRAN)
In this paper, the Genetic Algorithms (GA) and Bayesian model averaging (BMA) were combined to simultaneously conduct calibration and uncertainty analysis for the Soil and Water Assessment Tool (SWAT). In this hybrid method, several SWAT models with different structures are first selected; next GA i...
Forecasting unconventional resource productivity - A spatial Bayesian model
NASA Astrophysics Data System (ADS)
Montgomery, J.; O'sullivan, F.
2015-12-01
Today's low prices mean that unconventional oil and gas development requires ever greater efficiency and better development decision-making. Inter and intra-field variability in well productivity, which is a major contemporary driver of uncertainty regarding resource size and its economics is driven by factors including geological conditions, well and completion design (which companies vary as they seek to optimize their performance), and uncertainty about the nature of fracture propagation. Geological conditions are often not be well understood early on in development campaigns, but nevertheless critical assessments and decisions must be made regarding the value of drilling an area and the placement of wells. In these situations, location provides a reasonable proxy for geology and the "rock quality." We propose a spatial Bayesian model for forecasting acreage quality, which improves decision-making by leveraging available production data and provides a framework for statistically studying the influence of different parameters on well productivity. Our approach consists of subdividing a field into sections and forming prior distributions for productivity in each section based on knowledge about the overall field. Production data from wells is used to update these estimates in a Bayesian fashion, improving model accuracy far more rapidly and with less sensitivity to outliers than a model that simply establishes an "average" productivity in each section. Additionally, forecasts using this model capture the importance of uncertainty—either due to a lack of information or for areas that demonstrate greater geological risk. We demonstrate the forecasting utility of this method using public data and also provide examples of how information from this model can be combined with knowledge about a field's geology or changes in technology to better quantify development risk. This approach represents an important shift in the way that production data is used to guide
Improving default risk prediction using Bayesian model uncertainty techniques.
Kazemi, Reza; Mosleh, Ali
2012-11-01
Credit risk is the potential exposure of a creditor to an obligor's failure or refusal to repay the debt in principal or interest. The potential of exposure is measured in terms of probability of default. Many models have been developed to estimate credit risk, with rating agencies dating back to the 19th century. They provide their assessment of probability of default and transition probabilities of various firms in their annual reports. Regulatory capital requirements for credit risk outlined by the Basel Committee on Banking Supervision have made it essential for banks and financial institutions to develop sophisticated models in an attempt to measure credit risk with higher accuracy. The Bayesian framework proposed in this article uses the techniques developed in physical sciences and engineering for dealing with model uncertainty and expert accuracy to obtain improved estimates of credit risk and associated uncertainties. The approach uses estimates from one or more rating agencies and incorporates their historical accuracy (past performance data) in estimating future default risk and transition probabilities. Several examples demonstrate that the proposed methodology can assess default probability with accuracy exceeding the estimations of all the individual models. Moreover, the methodology accounts for potentially significant departures from "nominal predictions" due to "upsetting events" such as the 2008 global banking crisis.
Bayesian spatially dependent variable selection for small area health modeling.
Choi, Jungsoon; Lawson, Andrew B
2016-06-16
Statistical methods for spatial health data to identify the significant covariates associated with the health outcomes are of critical importance. Most studies have developed variable selection approaches in which the covariates included appear within the spatial domain and their effects are fixed across space. However, the impact of covariates on health outcomes may change across space and ignoring this behavior in spatial epidemiology may cause the wrong interpretation of the relations. Thus, the development of a statistical framework for spatial variable selection is important to allow for the estimation of the space-varying patterns of covariate effects as well as the early detection of disease over space. In this paper, we develop flexible spatial variable selection approaches to find the spatially-varying subsets of covariates with significant effects. A Bayesian hierarchical latent model framework is applied to account for spatially-varying covariate effects. We present a simulation example to examine the performance of the proposed models with the competing models. We apply our models to a county-level low birth weight incidence dataset in Georgia.
A Bayesian model for the analysis of transgenerational epigenetic variation.
Varona, Luis; Munilla, Sebastián; Mouresan, Elena Flavia; González-Rodríguez, Aldemar; Moreno, Carlos; Altarriba, Juan
2015-01-23
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes still remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a T: matrix of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix ( T-1: ) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not included it was the most plausible.
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an apriori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that the change of the true of parameter relatively to another will change the value of standard deviation in an opposite direction. For a perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range
Bayesian modelling of compositional heterogeneity in molecular phylogenetics.
Heaps, Sarah E; Nye, Tom M W; Boys, Richard J; Williams, Tom A; Embley, T Martin
2014-10-01
In molecular phylogenetics, standard models of sequence evolution generally assume that sequence composition remains constant over evolutionary time. However, this assumption is violated in many datasets which show substantial heterogeneity in sequence composition across taxa. We propose a model which allows compositional heterogeneity across branches, and formulate the model in a Bayesian framework. Specifically, the root and each branch of the tree is associated with its own composition vector whilst a global matrix of exchangeability parameters applies everywhere on the tree. We encourage borrowing of strength between branches by developing two possible priors for the composition vectors: one in which information can be exchanged equally amongst all branches of the tree and another in which more information is exchanged between neighbouring branches than between distant branches. We also propose a Markov chain Monte Carlo (MCMC) algorithm for posterior inference which uses data augmentation of substitutional histories to yield a simple complete data likelihood function that factorises over branches and allows Gibbs updates for most parameters. Standard phylogenetic models are not informative about the root position. Therefore a significant advantage of the proposed model is that it allows inference about rooted trees. The position of the root is fundamental to the biological interpretation of trees, both for polarising trait evolution and for establishing the order of divergence among lineages. Furthermore, unlike some other related models from the literature, inference in the model we propose can be carried out through a simple MCMC scheme which does not require problematic dimension-changing moves. We investigate the performance of the model and priors in analyses of two alignments for which there is strong biological opinion about the tree topology and root position.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2017-01-18
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
CRAFFT: An Activity Prediction Model based on Bayesian Networks.
Nazerfard, Ehsan; Cook, Diane J
2015-04-01
Recent advances in the areas of pervasive computing, data mining, and machine learning offer unique opportunities to provide health monitoring and assistance for individuals facing difficulties to live independently in their homes. Several components have to work together to provide health monitoring for smart home residents including, but not limited to, activity recognition, activity discovery, activity prediction, and prompting system. Compared to the significant research done to discover and recognize activities, less attention has been given to predict the future activities that the resident is likely to perform. Activity prediction components can play a major role in design of a smart home. For instance, by taking advantage of an activity prediction module, a smart home can learn context-aware rules to prompt individuals to initiate important activities. In this paper, we propose an activity prediction model using Bayesian networks together with a novel two-step inference process to predict both the next activity features and the next activity label. We also propose an approach to predict the start time of the next activity which is based on modeling the relative start time of the predicted activity using the continuous normal distribution and outlier detection. To validate our proposed models, we used real data collected from physical smart environments.
A Bayesian Semiparametric Model for Radiation Dose-Response Estimation.
Furukawa, Kyoji; Misumi, Munechika; Cologne, John B; Cullings, Harry M
2016-06-01
In evaluating the risk of exposure to health hazards, characterizing the dose-response relationship and estimating acceptable exposure levels are the primary goals. In analyses of health risks associated with exposure to ionizing radiation, while there is a clear agreement that moderate to high radiation doses cause harmful effects in humans, little has been known about the possible biological effects at low doses, for example, below 0.1 Gy, which is the dose range relevant to most radiation exposures of concern today. A conventional approach to radiation dose-response estimation based on simple parametric forms, such as the linear nonthreshold model, can be misleading in evaluating the risk and, in particular, its uncertainty at low doses. As an alternative approach, we consider a Bayesian semiparametric model that has a connected piece-wise-linear dose-response function with prior distributions having an autoregressive structure among the random slope coefficients defined over closely spaced dose categories. With a simulation study and application to analysis of cancer incidence data among Japanese atomic bomb survivors, we show that this approach can produce smooth and flexible dose-response estimation while reasonably handling the risk uncertainty at low doses and elsewhere. With relatively few assumptions and modeling options to be made by the analyst, the method can be particularly useful in assessing risks associated with low-dose radiation exposures.
Hierarchical Bayesian cognitive processing models to analyze clinical trial data.
Shankle, William R; Hara, Junko; Mangrola, Tushar; Hendrix, Suzanne; Alva, Gus; Lee, Michael D
2013-07-01
Identifying disease-modifying treatment effects in earlier stages of Alzheimer's disease (AD)-when changes are subtle-will require improved trial design and more sensitive analytical methods. We applied hierarchical Bayesian analysis with cognitive processing (HBCP) models to the Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) and MCI (mild cognitive impairment) Screen word list memory task data from 14 Alzheimer's disease AD patients of the Myriad Pharmaceuticals' phase III clinical trial of Flurizan (a γ-secretase modulator) versus placebo. The original analysis of 1649 patients found no treatment group differences. HBCP analysis and the original ADAS-Cog analysis were performed on the small sample. HBCP analysis detected impaired memory storage during delayed recall, whereas the original ADAS-Cog analytical method did not. The HBCP model identified a harmful treatment effect in a small sample, which has been independently confirmed from the results of other γ-secretase inhibitor. The original analytical method applied to the ADAS-Cog data did not detect this harmful treatment effect on either the full or the small sample. These findings suggest that HBCP models can detect treatment effects more sensitively than currently used analytical methods required by the Food and Drug Administration, and they do so using small patient samples.
Bayesian network model of crowd emotion and negative behavior
NASA Astrophysics Data System (ADS)
Ramli, Nurulhuda; Ghani, Noraida Abdul; Hatta, Zulkarnain Ahmad; Hashim, Intan Hashimah Mohd; Sulong, Jasni; Mahudin, Nor Diana Mohd; Rahman, Shukran Abd; Saad, Zarina Mat
2014-12-01
The effects of overcrowding have become a major concern for event organizers. One aspect of this concern has been the idea that overcrowding can enhance the occurrence of serious incidents during events. As one of the largest Muslim religious gathering attended by pilgrims from all over the world, Hajj has become extremely overcrowded with many incidents being reported. The purpose of this study is to analyze the nature of human emotion and negative behavior resulting from overcrowding during Hajj events from data gathered in Malaysian Hajj Experience Survey in 2013. The sample comprised of 147 Malaysian pilgrims (70 males and 77 females). Utilizing a probabilistic model called Bayesian network, this paper models the dependence structure between different emotions and negative behaviors of pilgrims in the crowd. The model included the following variables of emotion: negative, negative comfortable, positive, positive comfortable and positive spiritual and variables of negative behaviors; aggressive and hazardous acts. The study demonstrated that emotions of negative, negative comfortable, positive spiritual and positive emotion have a direct influence on aggressive behavior whereas emotion of negative comfortable, positive spiritual and positive have a direct influence on hazardous acts behavior. The sensitivity analysis showed that a low level of negative and negative comfortable emotions leads to a lower level of aggressive and hazardous behavior. Findings of the study can be further improved to identify the exact cause and risk factors of crowd-related incidents in preventing crowd disasters during the mass gathering events.
Greiner, Matthias; Smid, Joost; Havelaar, Arie H; Müller-Graf, Christine
2013-05-15
Quantitative microbiological risk assessment (QMRA) models are used to reflect knowledge about complex real-world scenarios for the propagation of microbiological hazards along the feed and food chain. The aim is to provide insight into interdependencies among model parameters, typically with an interest to characterise the effect of risk mitigation measures. A particular requirement is to achieve clarity about the reliability of conclusions from the model in the presence of uncertainty. To this end, Monte Carlo (MC) simulation modelling has become a standard in so-called probabilistic risk assessment. In this paper, we elaborate on the application of Bayesian computational statistics in the context of QMRA. It is useful to explore the analogy between MC modelling and Bayesian inference (BI). This pertains in particular to the procedures for deriving prior distributions for model parameters. We illustrate using a simple example that the inability to cope with feedback among model parameters is a major limitation of MC modelling. However, BI models can be easily integrated into MC modelling to overcome this limitation. We refer a BI submodel integrated into a MC model to as a "Bayes domain". We also demonstrate that an entire QMRA model can be formulated as Bayesian graphical model (BGM) and discuss the advantages of this approach. Finally, we show example graphs of MC, BI and BGM models, highlighting the similarities among the three approaches.
Enhancing the Modeling of PFOA Pharmacokinetics with Bayesian Analysis
The detail sufficient to describe the pharmacokinetics (PK) for perfluorooctanoic acid (PFOA) and the methods necessary to combine information from multiple data sets are both subjects of ongoing investigation. Bayesian analysis provides tools to accommodate these goals. We exa...
Predicting individual brain functional connectivity using a Bayesian hierarchical model.
Dai, Tian; Guo, Ying
2017-02-15
Network-oriented analysis of functional magnetic resonance imaging (fMRI), especially resting-state fMRI, has revealed important association between abnormal connectivity and brain disorders such as schizophrenia, major depression and Alzheimer's disease. Imaging-based brain connectivity measures have become a useful tool for investigating the pathophysiology, progression and treatment response of psychiatric disorders and neurodegenerative diseases. Recent studies have started to explore the possibility of using functional neuroimaging to help predict disease progression and guide treatment selection for individual patients. These studies provide the impetus to develop statistical methodology that would help provide predictive information on disease progression-related or treatment-related changes in neural connectivity. To this end, we propose a prediction method based on Bayesian hierarchical model that uses individual's baseline fMRI scans, coupled with relevant subject characteristics, to predict the individual's future functional connectivity. A key advantage of the proposed method is that it can improve the accuracy of individualized prediction of connectivity by combining information from both group-level connectivity patterns that are common to subjects with similar characteristics as well as individual-level connectivity features that are particular to the specific subject. Furthermore, our method also offers statistical inference tools such as predictive intervals that help quantify the uncertainty or variability of the predicted outcomes. The proposed prediction method could be a useful approach to predict the changes in individual patient's brain connectivity with the progression of a disease. It can also be used to predict a patient's post-treatment brain connectivity after a specified treatment regimen. Another utility of the proposed method is that it can be applied to test-retest imaging data to develop a more reliable estimator for individual
Hydraulic Conductivity Estimation using Bayesian Model Averaging and Generalized Parameterization
NASA Astrophysics Data System (ADS)
Tsai, F. T.; Li, X.
2006-12-01
Non-uniqueness in parameterization scheme is an inherent problem in groundwater inverse modeling due to limited data. To cope with the non-uniqueness problem of parameterization, we introduce a Bayesian Model Averaging (BMA) method to integrate a set of selected parameterization methods. The estimation uncertainty in BMA includes the uncertainty in individual parameterization methods as the within-parameterization variance and the uncertainty from using different parameterization methods as the between-parameterization variance. Moreover, the generalized parameterization (GP) method is considered in the geostatistical framework in this study. The GP method aims at increasing the flexibility of parameterization through the combination of a zonation structure and an interpolation method. The use of BMP with GP avoids over-confidence in a single parameterization method. A normalized least-squares estimation (NLSE) is adopted to calculate the posterior probability for each GP. We employee the adjoint state method for the sensitivity analysis on the weighting coefficients in the GP method. The adjoint state method is also applied to the NLSE problem. The proposed methodology is implemented to the Alamitos Barrier Project (ABP) in California, where the spatially distributed hydraulic conductivity is estimated. The optimal weighting coefficients embedded in GP are identified through the maximum likelihood estimation (MLE) where the misfits between the observed and calculated groundwater heads are minimized. The conditional mean and conditional variance of the estimated hydraulic conductivity distribution using BMA are obtained to assess the estimation uncertainty.
A Flexible Bayesian Model for Testing for Transmission Ratio Distortion
Casellas, Joaquim; Manunza, Arianna; Mercader, Anna; Quintanilla, Raquel; Amills, Marcel
2014-01-01
Current statistical approaches to investigate the nature and magnitude of transmission ratio distortion (TRD) are scarce and restricted to the most common experimental designs such as F2 populations and backcrosses. In this article, we describe a new Bayesian approach to check TRD within a given biallelic genetic marker in a diploid species, providing a highly flexible framework that can accommodate any kind of population structure. This model relies on the genotype of each offspring and thus integrates all available information from either the parents’ genotypes or population-specific allele frequencies and yields TRD estimates that can be corroborated by the calculation of a Bayes factor (BF). This approach has been evaluated on simulated data sets with appealing statistical performance. As a proof of concept, we have also tested TRD in a porcine population with five half-sib families and 352 offspring. All boars and piglets were genotyped with the Porcine SNP60 BeadChip, whereas genotypes from the sows were not available. The SNP-by-SNP screening of the pig genome revealed 84 SNPs with decisive evidences of TRD (BF > 100) after accounting for multiple testing. Many of these regions contained genes related to biological processes (e.g., nucleosome assembly and co-organization, DNA conformation and packaging, and DNA complex assembly) that are critically associated with embryonic viability. The implementation of this method, which overcomes many of the limitations of previous approaches, should contribute to fostering research on TRD in both model and nonmodel organisms. PMID:25271302
Bayesian Modeling of Haplotype Effects in Multiparent Populations
Zhang, Zhaojun; Wang, Wei; Valdar, William
2014-01-01
A general Bayesian model, Diploffect, is described for estimating the effects of founder haplotypes at quantitative trait loci (QTL) detected in multiparental genetic populations; such populations include the Collaborative Cross (CC), Heterogeneous Socks (HS), and many others for which local genetic variation is well described by an underlying, usually probabilistically inferred, haplotype mosaic. Our aim is to provide a framework for coherent estimation of haplotype and diplotype (haplotype pair) effects that takes into account the following: uncertainty in haplotype composition for each individual; uncertainty arising from small sample sizes and infrequently observed haplotype combinations; possible effects of dominance (for noninbred subjects); genetic background; and that provides a means to incorporate data that may be incomplete or has a hierarchical structure. Using the results of a probabilistic haplotype reconstruction as prior information, we obtain posterior distributions at the QTL for both haplotype effects and haplotype composition. Two alternative computational approaches are supplied: a Markov chain Monte Carlo sampler and a procedure based on importance sampling of integrated nested Laplace approximations. Using simulations of QTL in the incipient CC (pre-CC) and Northport HS populations, we compare the accuracy of Diploffect, approximations to it, and more commonly used approaches based on Haley–Knott regression, describing trade-offs between these methods. We also estimate effects for three QTL previously identified in those populations, obtaining posterior intervals that describe how the phenotype might be affected by diplotype substitutions at the modeled locus. PMID:25236455
Bayesian Modeling and Chronological Precision for Polynesian Settlement of Tonga
Weisler, Marshall; Zhao, Jian-xin
2015-01-01
First settlement of Polynesia, and population expansion throughout the ancestral Polynesian homeland are foundation events for global history. A precise chronology is paramount to informed archaeological interpretation of these events and their consequences. Recently applied chronometric hygiene protocols excluding radiocarbon dates on wood charcoal without species identification all but eliminates this chronology as it has been built for the Kingdom of Tonga, the initial islands to be settled in Polynesia. In this paper we re-examine and redevelop this chronology through application of Bayesian models to the questioned suite of radiocarbon dates, but also incorporating short-lived wood charcoal dates from archived samples and high precision U/Th dates on coral artifacts. These models provide generation level precision allowing us to track population migration from first Lapita occupation on the island of Tongatapu through Tonga’s central and northern island groups. They further illustrate an exceptionally short duration for the initial colonizing Lapita phase and a somewhat abrupt transition to ancestral Polynesian society as it is currently defined. PMID:25799460
Bayesian image reconstruction - The pixon and optimal image modeling
NASA Technical Reports Server (NTRS)
Pina, R. K.; Puetter, R. C.
1993-01-01
In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
Bayesian image reconstruction - The pixon and optimal image modeling
NASA Astrophysics Data System (ADS)
Pina, R. K.; Puetter, R. C.
1993-06-01
In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
Bayesian network models in brain functional connectivity analysis
Zhang, Sheng; Li, Chiang-shan R.
2013-01-01
Much effort has been made to better understand the complex integration of distinct parts of the human brain using functional magnetic resonance imaging (fMRI). Altered functional connectivity between brain regions is associated with many neurological and mental illnesses, such as Alzheimer and Parkinson diseases, addiction, and depression. In computational science, Bayesian networks (BN) have been used in a broad range of studies to model complex data set in the presence of uncertainty and when expert prior knowledge is needed. However, little is done to explore the use of BN in connectivity analysis of fMRI data. In this paper, we present an up-to-date literature review and methodological details of connectivity analyses using BN, while highlighting caveats in a real-world application. We present a BN model of fMRI dataset obtained from sixty healthy subjects performing the stop-signal task (SST), a paradigm widely used to investigate response inhibition. Connectivity results are validated with the extant literature including our previous studies. By exploring the link strength of the learned BN’s and correlating them to behavioral performance measures, this novel use of BN in connectivity analysis provides new insights to the functional neural pathways underlying response inhibition. PMID:24319317
Bayesian modeling of temporal dependence in large sparse contingency tables
Kunihama, Tsuyoshi; Dunson, David B.
2013-01-01
In many applications, it is of interest to study trends over time in relationships among categorical variables, such as age group, ethnicity, religious affiliation, political party and preference for particular policies. At each time point, a sample of individuals provide responses to a set of questions, with different individuals sampled at each time. In such settings, there tends to be abundant missing data and the variables being measured may change over time. At each time point, one obtains a large sparse contingency table, with the number of cells often much larger than the number of individuals being surveyed. To borrow information across time in modeling large sparse contingency tables, we propose a Bayesian autoregressive tensor factorization approach. The proposed model relies on a probabilistic Parafac factorization of the joint pmf characterizing the categorical data distribution at each time point, with autocorrelation included across times. Efficient computational methods are developed relying on MCMC. The methods are evaluated through simulation examples and applied to social survey data. PMID:24482548
A Bayesian model of context-sensitive value attribution
Rigoli, Francesco; Friston, Karl J; Martinelli, Cristina; Selaković, Mirjana; Shergill, Sukhwinder S; Dolan, Raymond J
2016-01-01
Substantial evidence indicates that incentive value depends on an anticipation of rewards within a given context. However, the computations underlying this context sensitivity remain unknown. To address this question, we introduce a normative (Bayesian) account of how rewards map to incentive values. This assumes that the brain inverts a model of how rewards are generated. Key features of our account include (i) an influence of prior beliefs about the context in which rewards are delivered (weighted by their reliability in a Bayes-optimal fashion), (ii) the notion that incentive values correspond to precision-weighted prediction errors, (iii) and contextual information unfolding at different hierarchical levels. This formulation implies that incentive value is intrinsically context-dependent. We provide empirical support for this model by showing that incentive value is influenced by context variability and by hierarchically nested contexts. The perspective we introduce generates new empirical predictions that might help explaining psychopathologies, such as addiction. DOI: http://dx.doi.org/10.7554/eLife.16127.001 PMID:27328323
Emulation Modeling with Bayesian Networks for Efficient Decision Support
NASA Astrophysics Data System (ADS)
Fienen, M. N.; Masterson, J.; Plant, N. G.; Gutierrez, B. T.; Thieler, E. R.
2012-12-01
Bayesian decision networks (BDN) have long been used to provide decision support in systems that require explicit consideration of uncertainty; applications range from ecology to medical diagnostics and terrorism threat assessments. Until recently, however, few studies have applied BDNs to the study of groundwater systems. BDNs are particularly useful for representing real-world system variability by synthesizing a range of hydrogeologic situations within a single simulation. Because BDN output is cast in terms of probability—an output desired by decision makers—they explicitly incorporate the uncertainty of a system. BDNs can thus serve as a more efficient alternative to other uncertainty characterization methods such as computationally demanding Monte Carlo analyses and others methods restricted to linear model analyses. We present a unique application of a BDN to a groundwater modeling analysis of the hydrologic response of Assateague Island, Maryland to sea-level rise. Using both input and output variables of the modeled groundwater response to different sea-level (SLR) rise scenarios, the BDN predicts the probability of changes in the depth to fresh water, which exerts an important influence on physical and biological island evolution. Input variables included barrier-island width, maximum island elevation, and aquifer recharge. The variability of these inputs and their corresponding outputs are sampled along cross sections in a single model run to form an ensemble of input/output pairs. The BDN outputs, which are the posterior distributions of water table conditions for the sea-level rise scenarios, are evaluated through error analysis and cross-validation to assess both fit to training data and predictive power. The key benefit for using BDNs in groundwater modeling analyses is that they provide a method for distilling complex model results into predictions with associated uncertainty, which is useful to decision makers. Future efforts incorporate
Tauber, Sean; Navarro, Daniel J; Perfors, Amy; Steyvers, Mark
2017-03-30
Recent debates in the psychological literature have raised questions about the assumptions that underpin Bayesian models of cognition and what inferences they license about human cognition. In this paper we revisit this topic, arguing that there are 2 qualitatively different ways in which a Bayesian model could be constructed. The most common approach uses a Bayesian model as a normative standard upon which to license a claim about optimality. In the alternative approach, a descriptive Bayesian model need not correspond to any claim that the underlying cognition is optimal or rational, and is used solely as a tool for instantiating a substantive psychological theory. We present 3 case studies in which these 2 perspectives lead to different computational models and license different conclusions about human cognition. We demonstrate how the descriptive Bayesian approach can be used to answer different sorts of questions than the optimal approach, especially when combined with principled tools for model evaluation and model selection. More generally we argue for the importance of making a clear distinction between the 2 perspectives. Considerable confusion results when descriptive models and optimal models are conflated, and if Bayesians are to avoid contributing to this confusion it is important to avoid making normative claims when none are intended. (PsycINFO Database Record
Carroll, Carlos; Johnson, Devin S
2008-08-01
Regional conservation planning increasingly draws on habitat suitability models to support decisions regarding land allocation and management. Nevertheless, statistical techniques commonly used for developing such models may give misleading results because they fail to account for 3 factors common in data sets of species distribution: spatial autocorrelation, the large number of sites where the species is absent (zero inflation), and uneven survey effort. We used spatial autoregressive models fit with Bayesian Markov Chain Monte Carlo techniques to assess the relationship between older coniferous forest and the abundance of Northern Spotted Owl nest and activity sites throughout the species' range. The spatial random-effect term incorporated in the autoregressive models successfully accounted for zero inflation and reduced the effect of survey bias on estimates of species-habitat associations. Our results support the hypothesis that the relationship between owl distribution and older forest varies with latitude. A quadratic relationship between owl abundance and older forest was evident in the southern portion of the range, and a pseudothreshold relationship was evident in the northern portion of the range. Our results suggest that proposed changes to the network of owl habitat reserves would reduce the proportion of the population protected by up to one-third, and that proposed guidelines for forest management within reserves underestimate the proportion of older forest associated with maximum owl abundance and inappropriately generalize threshold relationships among subregions. Bayesian spatial models can greatly enhance the utility of habitat analysis for conservation planning because they add the statistical flexibility necessary for analyzing regional survey data while retaining the interpretability of simpler models.
A Bayesian view on acoustic model-based techniques for robust speech recognition
NASA Astrophysics Data System (ADS)
Maas, Roland; Huemmer, Christian; Sehr, Armin; Kellermann, Walter
2015-12-01
This article provides a unifying Bayesian view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well-known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated from an underlying observation model that relates clean and distorted feature vectors. By identifying and converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules. We thus summarize the various approaches as approximations or modifications of the same Bayesian decoding rule leading to a unified view on known derivations as well as to new formulations for certain approaches.
Lee, Sik-Yum; Song, Xin-Yuan
2004-05-01
Missing data are very common in behavioural and psychological research. In this paper, we develop a Bayesian approach in the context of a general nonlinear structural equation model with missing continuous and ordinal categorical data. In the development, the missing data are treated as latent quantities, and provision for the incompleteness of the data is made by a hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm. We show by means of a simulation study that the Bayesian estimates are accurate. A Bayesian model comparison procedure based on the Bayes factor and path sampling is proposed. The required observations from the posterior distribution for computing the Bayes factor are simulated by the hybrid algorithm in Bayesian estimation. Our simulation results indicate that the correct model is selected more frequently when the incomplete records are used in the analysis than when they are ignored. The methodology is further illustrated with a real data set from a study concerned with an AIDS preventative intervention for Filipina sex workers.
Bayesian Safety Risk Modeling of Human-Flightdeck Automation Interaction
NASA Technical Reports Server (NTRS)
Ancel, Ersin; Shih, Ann T.
2015-01-01
Usage of automatic systems in airliners has increased fuel efficiency, added extra capabilities, enhanced safety and reliability, as well as provide improved passenger comfort since its introduction in the late 80's. However, original automation benefits, including reduced flight crew workload, human errors or training requirements, were not achieved as originally expected. Instead, automation introduced new failure modes, redistributed, and sometimes increased workload, brought in new cognitive and attention demands, and increased training requirements. Modern airliners have numerous flight modes, providing more flexibility (and inherently more complexity) to the flight crew. However, the price to pay for the increased flexibility is the need for increased mode awareness, as well as the need to supervise, understand, and predict automated system behavior. Also, over-reliance on automation is linked to manual flight skill degradation and complacency in commercial pilots. As a result, recent accidents involving human errors are often caused by the interactions between humans and the automated systems (e.g., the breakdown in man-machine coordination), deteriorated manual flying skills, and/or loss of situational awareness due to heavy dependence on automated systems. This paper describes the development of the increased complexity and reliance on automation baseline model, named FLAP for FLightdeck Automation Problems. The model development process starts with a comprehensive literature review followed by the construction of a framework comprised of high-level causal factors leading to an automation-related flight anomaly. The framework was then converted into a Bayesian Belief Network (BBN) using the Hugin Software v7.8. The effects of automation on flight crew are incorporated into the model, including flight skill degradation, increased cognitive demand and training requirements along with their interactions. Besides flight crew deficiencies, automation system
A Bayesian hierarchical surrogate outcome model for multiple sclerosis.
Pozzi, Luca; Schmidli, Heinz; Ohlssen, David I
2016-07-01
The development of novel therapies in multiple sclerosis (MS) is one area where a range of surrogate outcomes are used in various stages of clinical research. While the aim of treatments in MS is to prevent disability, a clinical trial for evaluating a drugs effect on disability progression would require a large sample of patients with many years of follow-up. The early stage of MS is characterized by relapses. To reduce study size and duration, clinical relapses are accepted as primary endpoints in phase III trials. For phase II studies, the primary outcomes are typically lesion counts based on magnetic resonance imaging (MRI), as these are considerably more sensitive than clinical measures for detecting MS activity. Recently, Sormani and colleagues in 'Surrogate endpoints for EDSS worsening in multiple sclerosis' provided a systematic review and used weighted regression analyses to examine the role of either MRI lesions or relapses as trial level surrogate outcomes for disability. We build on this work by developing a Bayesian three-level model, accommodating the two surrogates and the disability endpoint, and properly taking into account that treatment effects are estimated with errors. Specifically, a combination of treatment effects based on MRI lesion count outcomes and clinical relapse was used to develop a study-level surrogate outcome model for the corresponding treatment effects based on disability progression. While the primary aim for developing this model was to support decision-making in drug development, the proposed model may also be considered for future validation. Copyright © 2016 John Wiley & Sons, Ltd.
A Bayesian network model for predicting aquatic toxicity mode ...
The mode of toxic action (MoA) has been recognized as a key determinant of chemical toxicity but MoA classification in aquatic toxicology has been limited. We developed a Bayesian network model to classify aquatic toxicity mode of action using a recently published dataset containing over one thousand chemicals with MoA assignments for aquatic animal toxicity. Two dimensional theoretical chemical descriptors were generated for each chemical using the Toxicity Estimation Software Tool. The model was developed through augmented Markov blanket discovery from the data set with the MoA broad classifications as a target node. From cross validation, the overall precision for the model was 80.2% with a R2 of 0.959. The best precision was for the AChEI MoA (93.5%) where 257 chemicals out of 275 were correctly classified. Model precision was poorest for the reactivity MoA (48.5%) where 48 out of 99 reactive chemicals were correctly classified. Narcosis represented the largest class within the MoA dataset and had a precision and reliability of 80.0%, reflecting the global precision across all of the MoAs. False negatives for narcosis most often fell into electron transport inhibition, neurotoxicity or reactivity MoAs. False negatives for all other MoAs were most often narcosis. A probabilistic sensitivity analysis was undertaken for each MoA to examine the sensitivity to individual and multiple descriptor findings. The results show that the Markov blanket of a structurally
Bayesian Hidden Markov Modeling of Array CGH Data.
Guha, Subharup; Li, Yi; Neuberg, Donna
2008-06-01
Genomic alterations have been linked to the development and progression of cancer. The technique of comparative genomic hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data.We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Because the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme, and breast cancer are analyzed, and comparisons are made with some widely used algorithms to illustrate the reliability and success of the technique.
Estimating seabed scattering mechanisms via Bayesian model selection.
Steininger, Gavin; Dosso, Stan E; Holland, Charles W; Dettmer, Jan
2014-10-01
A quantitative inversion procedure is developed and applied to determine the dominant scattering mechanism (surface roughness and/or volume scattering) from seabed scattering-strength data. The classification system is based on trans-dimensional Bayesian inversion with the deviance information criterion used to select the dominant scattering mechanism. Scattering is modeled using first-order perturbation theory as due to one of three mechanisms: Interface scattering from a rough seafloor, volume scattering from a heterogeneous sediment layer, or mixed scattering combining both interface and volume scattering. The classification system is applied to six simulated test cases where it correctly identifies the true dominant scattering mechanism as having greater support from the data in five cases; the remaining case is indecisive. The approach is also applied to measured backscatter-strength data where volume scattering is determined as the dominant scattering mechanism. Comparison of inversion results with core data indicates the method yields both a reasonable volume heterogeneity size distribution and a good estimate of the sub-bottom depths at which scatterers occur.
MARTINEZ, Josue G.; BOHN, Kirsten M.; CARROLL, Raymond J.
2013-01-01
We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the regions-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are of the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible. PMID:23997376
Martinez, Josue G; Bohn, Kirsten M; Carroll, Raymond J; Morris, Jeffrey S
2013-06-01
We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the regions-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are of the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible.
Zhang, Nanhua; Shi, Ran
2015-01-01
Summary There has been an increasing interest in the analysis of spatially distributed multivariate binary data motivated by a wide range of research problems. Two types of correlations are usually involved: the correlation between the multiple outcomes at one location and the spatial correlation between the locations for one particular outcome. The commonly used regression models only consider one type of correlations while ignoring or modeling inappropriately the other one. To address this limitation, we adopt a Bayesian nonparametric approach to jointly modeling multivariate spatial binary data by integrating both types of correlations. A multivariate probit model is employed to link the binary outcomes to Gaussian latent variables; and Gaussian processes are applied to specify the spatially correlated random effects. We develop an efficient Markov chain Monte Carlo algorithm for the posterior computation. We illustrate the proposed model on simulation studies and a multidrug-resistant tuberculosis case study. PMID:24975716
Chow, Sy- Miin; Lu, Zhaohua; Zhu, Hongtu; Sherwood, Andrew
2014-01-01
The past decade has evidenced the increased prevalence of irregularly spaced longitudinal data in social sciences. Clearly lacking, however, are modeling tools that allow researchers to fit dynamic models to irregularly spaced data, particularly data that show nonlinearity and heterogeneity in dynamical structures. We consider the issue of fitting multivariate nonlinear differential equation models with random effects and unknown initial conditions to irregularly spaced data. A stochastic approximation expectation–maximization algorithm is proposed and its performance is evaluated using a benchmark nonlinear dynamical systems model, namely, the Van der Pol oscillator equations. The empirical utility of the proposed technique is illustrated using a set of 24-h ambulatory cardiovascular data from 168 men and women. Pertinent methodological challenges and unresolved issues are discussed. PMID:25416456
Chow, Sy-Miin; Lu, Zhaohua; Sherwood, Andrew; Zhu, Hongtu
2016-03-01
The past decade has evidenced the increased prevalence of irregularly spaced longitudinal data in social sciences. Clearly lacking, however, are modeling tools that allow researchers to fit dynamic models to irregularly spaced data, particularly data that show nonlinearity and heterogeneity in dynamical structures. We consider the issue of fitting multivariate nonlinear differential equation models with random effects and unknown initial conditions to irregularly spaced data. A stochastic approximation expectation-maximization algorithm is proposed and its performance is evaluated using a benchmark nonlinear dynamical systems model, namely, the Van der Pol oscillator equations. The empirical utility of the proposed technique is illustrated using a set of 24-h ambulatory cardiovascular data from 168 men and women. Pertinent methodological challenges and unresolved issues are discussed.
Bayesian parameter inference and model selection by population annealing in systems biology.
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor.
Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor. PMID:25089832
Learning Bayesian Networks from Correlated Data
NASA Astrophysics Data System (ADS)
Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola
2016-05-01
Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.
Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Datta, Susmita; Payne, Samuel H.; Kang, Jiyun; Bramer, Lisa M.; Nicora, Carrie D.; Shukla, Anil K.; Metz, Thomas O.; Rodland, Karin D.; Smith, Richard D.; Tardiff, Mark F.; McDermott, Jason E.; Pounds, Joel G.; Waters, Katrina M.
2014-12-01
As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, protein quantification estimation from the native measured peptides has become a computational task. A limitation to existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptides signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves similar accuracy as the current state-of-the-art methods at proteoform identification with significantly better specificity. BP-Quant is available as a MatLab ® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.
Bayesian Model Comparison for the Order Restricted RC Association Model
ERIC Educational Resources Information Center
Iliopoulos, G.; Kateri, M.; Ntzoufras, I.
2009-01-01
Association models constitute an attractive alternative to the usual log-linear models for modeling the dependence between classification variables. They impose special structure on the underlying association by assigning scores on the levels of each classification variable, which can be fixed or parametric. Under the general row-column (RC)…
Ice Shelf Modeling: A Cross-Polar Bayesian Statistical Approach
NASA Astrophysics Data System (ADS)
Kirchner, N.; Furrer, R.; Jakobsson, M.; Zwally, H. J.
2010-12-01
Ice streams interlink glacial terrestrial and marine environments: embedded in a grounded inland ice such as the Antarctic Ice Sheet or the paleo ice sheets covering extensive parts of the Eurasian and Amerasian Arctic respectively, ice streams are major drainage agents facilitating the discharge of substantial portions of continental ice into the ocean. At their seaward side, ice streams can either extend onto the ocean as floating ice tongues (such as the Drygalsky Ice Tongue/East Antarctica), or feed large ice shelves (as is the case for e.g. the Siple Coast and the Ross Ice Shelf/West Antarctica). The flow behavior of ice streams has been recognized to be intimately linked with configurational changes in their attached ice shelves; in particular, ice shelf disintegration is associated with rapid ice stream retreat and increased mass discharge from the continental ice mass, contributing eventually to sea level rise. Investigations of ice stream retreat mechanism are however incomplete if based on terrestrial records only: rather, the dynamics of ice shelves (and, eventually, the impact of the ocean on the latter) must be accounted for. However, since floating ice shelves leave hardly any traces behind when melting, uncertainty regarding the spatio-temporal distribution and evolution of ice shelves in times prior to instrumented and recorded observation is high, calling thus for a statistical modeling approach. Complementing ongoing large-scale numerical modeling efforts (Pollard & DeConto, 2009), we model the configuration of ice shelves by using a Bayesian Hiearchial Modeling (BHM) approach. We adopt a cross-polar perspective accounting for the fact that currently, ice shelves exist mainly along the coastline of Antarctica (and are virtually non-existing in the Arctic), while Arctic Ocean ice shelves repeatedly impacted the Arctic ocean basin during former glacial periods. Modeled Arctic ocean ice shelf configurations are compared with geological spatial
Inherently irrational? A computational model of escalation of commitment as Bayesian Updating.
Gilroy, Shawn P; Hantula, Donald A
2016-06-01
Monte Carlo simulations were performed to analyze the degree to which two-, three- and four-step learning histories of losses and gains correlated with escalation and persistence in extended extinction (continuous loss) conditions. Simulated learning histories were randomly generated at varying lengths and compositions and warranted probabilities were determined using Bayesian Updating methods. Bayesian Updating predicted instances where particular learning sequences were more likely to engender escalation and persistence under extinction conditions. All simulations revealed greater rates of escalation and persistence in the presence of heterogeneous (e.g., both Wins and Losses) lag sequences, with substantially increased rates of escalation when lags comprised predominantly of losses were followed by wins. These methods were then applied to human investment choices in earlier experiments. The Bayesian Updating models corresponded with data obtained from these experiments. These findings suggest that Bayesian Updating can be utilized as a model for understanding how and when individual commitment may escalate and persist despite continued failures.
Hierarchical Bayesian Model Averaging for Chance Constrained Remediation Designs
NASA Astrophysics Data System (ADS)
Chitsazan, N.; Tsai, F. T.
2012-12-01
Groundwater remediation designs are heavily relying on simulation models which are subjected to various sources of uncertainty in their predictions. To develop a robust remediation design, it is crucial to understand the effect of uncertainty sources. In this research, we introduce a hierarchical Bayesian model averaging (HBMA) framework to segregate and prioritize sources of uncertainty in a multi-layer frame, where each layer targets a source of uncertainty. The HBMA framework provides an insight to uncertainty priorities and propagation. In addition, HBMA allows evaluating model weights in different hierarchy levels and assessing the relative importance of models in each level. To account for uncertainty, we employ a chance constrained (CC) programming for stochastic remediation design. Chance constrained programming was implemented traditionally to account for parameter uncertainty. Recently, many studies suggested that model structure uncertainty is not negligible compared to parameter uncertainty. Using chance constrained programming along with HBMA can provide a rigorous tool for groundwater remediation designs under uncertainty. In this research, the HBMA-CC was applied to a remediation design in a synthetic aquifer. The design was to develop a scavenger well approach to mitigate saltwater intrusion toward production wells. HBMA was employed to assess uncertainties from model structure, parameter estimation and kriging interpolation. An improved harmony search optimization method was used to find the optimal location of the scavenger well. We evaluated prediction variances of chloride concentration at the production wells through the HBMA framework. The results showed that choosing the single best model may lead to a significant error in evaluating prediction variances for two reasons. First, considering the single best model, variances that stem from uncertainty in the model structure will be ignored. Second, considering the best model with non
Applications of Bayesian model averaging to the curvature and size of the Universe
NASA Astrophysics Data System (ADS)
Vardanyan, Mihran; Trotta, Roberto; Silk, Joseph
2011-05-01
Bayesian model averaging is a procedure to obtain parameter constraints that account for the uncertainty about the correct cosmological model. We use recent cosmological observations and Bayesian model averaging to derive tight limits on the curvature parameter, as well as robust lower bounds on the curvature radius of the Universe and its minimum size, while allowing for the possibility of an evolving dark energy component. Because flat models are favoured by Bayesian model selection, we find that model-averaged constraints on the curvature and size of the Universe can be considerably stronger than non-model-averaged ones. For the most conservative prior choice (based on inflationary considerations), our procedure improves on non-model-averaged constraints on the curvature by a factor of ˜2. The curvature scale of the Universe is conservatively constrained to be Rc > 42 Gpc (99 per cent), corresponding to a lower limit to the number of Hubble spheres in the Universe NU > 251 (99 per cent).
Bayesian estimation of regularization parameters for deformable surface models
Cunningham, G.S.; Lehovich, A.; Hanson, K.M.
1999-02-20
In this article the authors build on their past attempts to reconstruct a 3D, time-varying bolus of radiotracer from first-pass data obtained by the dynamic SPECT imager, FASTSPECT, built by the University of Arizona. The object imaged is a CardioWest total artificial heart. The bolus is entirely contained in one ventricle and its associated inlet and outlet tubes. The model for the radiotracer distribution at a given time is a closed surface parameterized by 482 vertices that are connected to make 960 triangles, with nonuniform intensity variations of radiotracer allowed inside the surface on a voxel-to-voxel basis. The total curvature of the surface is minimized through the use of a weighted prior in the Bayesian framework, as is the weighted norm of the gradient of the voxellated grid. MAP estimates for the vertices, interior intensity voxels and background count level are produced. The strength of the priors, or hyperparameters, are determined by maximizing the probability of the data given the hyperparameters, called the evidence. The evidence is calculated by first assuming that the posterior is approximately normal in the values of the vertices and voxels, and then by evaluating the integral of the multi-dimensional normal distribution. This integral (which requires evaluating the determinant of a covariance matrix) is computed by applying a recent algorithm from Bai et. al. that calculates the needed determinant efficiently. They demonstrate that the radiotracer is highly inhomogeneous in early time frames, as suspected in earlier reconstruction attempts that assumed a uniform intensity of radiotracer within the closed surface, and that the optimal choice of hyperparameters is substantially different for different time frames.
NASA Astrophysics Data System (ADS)
Mateus, Luis; Stollenwerk, Nico; Zambrini, Jean Claude
2012-09-01
We compare two stochastic epidemiological models in a Bayesian framework, both models performing on the same simulated data set. In some cases of data obtained under one model with specific parameter values the model comparison favours the model not underlying the simulated data.
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k , from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
Yi, Nengjun; Shriner, Daniel; Banerjee, Samprit; Mehta, Tapan; Pomp, Daniel; Yandell, Brian S.
2007-01-01
We extend our Bayesian model selection framework for mapping epistatic QTL in experimental crosses to include environmental effects and gene–environment interactions. We propose a new, fast Markov chain Monte Carlo algorithm to explore the posterior distribution of unknowns. In addition, we take advantage of any prior knowledge about genetic architecture to increase posterior probability on more probable models. These enhancements have significant computational advantages in models with many effects. We illustrate the proposed method by detecting new epistatic and gene–sex interactions for obesity-related traits in two real data sets of mice. Our method has been implemented in the freely available package R/qtlbim (http://www.qtlbim.org) to facilitate the general usage of the Bayesian methodology for genomewide interacting QTL analysis. PMID:17483424
Model Criticism of Bayesian Networks with Latent Variables.
ERIC Educational Resources Information Center
Williamson, David M.; Mislevy, Robert J.; Almond, Russell G.
This study investigated statistical methods for identifying errors in Bayesian networks (BN) with latent variables, as found in intelligent cognitive assessments. BN, commonly used in artificial intelligence systems, are promising mechanisms for scoring constructed-response examinations. The success of an intelligent assessment or tutoring system…
Bayesian Modeling in Institutional Research: An Example of Nonlinear Classification
ERIC Educational Resources Information Center
Xu, Yonghong Jade; Ishitani, Terry T.
2008-01-01
In recent years, rapid advancement has taken place in computing technology that allows institutional researchers to efficiently and effectively address data of increasing volume and structural complexity (Luan, 2002). In this chapter, the authors propose a new data analytical technique, Bayesian belief networks (BBN), to add to the toolbox for…
A Comparison of Imputation Methods for Bayesian Factor Analysis Models
ERIC Educational Resources Information Center
Merkle, Edgar C.
2011-01-01
Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amiable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…
Parameterizing Bayesian network Representations of Social-Behavioral Models by Expert Elicitation
Walsh, Stephen J.; Dalton, Angela C.; Whitney, Paul D.; White, Amanda M.
2010-05-23
Bayesian networks provide a general framework with which to model many natural phenomena. The mathematical nature of Bayesian networks enables a plethora of model validation and calibration techniques: e.g parameter estimation, goodness of fit tests, and diagnostic checking of the model assumptions. However, they are not free of shortcomings. Parameter estimation from relevant extant data is a common approach to calibrating the model parameters. In practice it is not uncommon to find oneself lacking adequate data to reliably estimate all model parameters. In this paper we present the early development of a novel application of conjoint analysis as a method for eliciting and modeling expert opinions and using the results in a methodology for calibrating the parameters of a Bayesian network.
Shen, Yanna; Cooper, Gregory F
2012-09-01
This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete.
NASA Astrophysics Data System (ADS)
Placek, Ben; Knuth, Kevin H.; Angerhausen, Daniel
2014-11-01
EXONEST is an algorithm dedicated to detecting and characterizing the photometric signatures of exoplanets, which include reflection and thermal emission, Doppler boosting, and ellipsoidal variations. Using Bayesian inference, we can test between competing models that describe the data as well as estimate model parameters. We demonstrate this approach by testing circular versus eccentric planetary orbital models, as well as testing for the presence or absence of four photometric effects. In addition to using Bayesian model selection, a unique aspect of EXONEST is the potential capability to distinguish between reflective and thermal contributions to the light curve. A case study is presented using Kepler data recorded from the transiting planet KOI-13b. By considering only the nontransiting portions of the light curve, we demonstrate that it is possible to estimate the photometrically relevant model parameters of KOI-13b. Furthermore, Bayesian model testing confirms that the orbit of KOI-13b has a detectable eccentricity.
Placek, Ben; Knuth, Kevin H.; Angerhausen, Daniel E-mail: kknuth@albany.edu
2014-11-10
EXONEST is an algorithm dedicated to detecting and characterizing the photometric signatures of exoplanets, which include reflection and thermal emission, Doppler boosting, and ellipsoidal variations. Using Bayesian inference, we can test between competing models that describe the data as well as estimate model parameters. We demonstrate this approach by testing circular versus eccentric planetary orbital models, as well as testing for the presence or absence of four photometric effects. In addition to using Bayesian model selection, a unique aspect of EXONEST is the potential capability to distinguish between reflective and thermal contributions to the light curve. A case study is presented using Kepler data recorded from the transiting planet KOI-13b. By considering only the nontransiting portions of the light curve, we demonstrate that it is possible to estimate the photometrically relevant model parameters of KOI-13b. Furthermore, Bayesian model testing confirms that the orbit of KOI-13b has a detectable eccentricity.
A Bayesian hierarchical diffusion model decomposition of performance in Approach–Avoidance Tasks
Krypotos, Angelos-Miltiadis; Beckers, Tom; Kindt, Merel; Wagenmakers, Eric-Jan
2015-01-01
Common methods for analysing response time (RT) tasks, frequently used across different disciplines of psychology, suffer from a number of limitations such as the failure to directly measure the underlying latent processes of interest and the inability to take into account the uncertainty associated with each individual's point estimate of performance. Here, we discuss a Bayesian hierarchical diffusion model and apply it to RT data. This model allows researchers to decompose performance into meaningful psychological processes and to account optimally for individual differences and commonalities, even with relatively sparse data. We highlight the advantages of the Bayesian hierarchical diffusion model decomposition by applying it to performance on Approach–Avoidance Tasks, widely used in the emotion and psychopathology literature. Model fits for two experimental data-sets demonstrate that the model performs well. The Bayesian hierarchical diffusion model overcomes important limitations of current analysis procedures and provides deeper insight in latent psychological processes of interest. PMID:25491372
Number-Knower Levels in Young Children: Insights from Bayesian Modeling
ERIC Educational Resources Information Center
Lee, Michael D.; Sarnecka, Barbara W.
2011-01-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One,…
A Bayesian approach to model structural error and input variability in groundwater modeling
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.
2015-12-01
Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative prior for error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how to best direct resources towards reducing predictive uncertainty.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision. PMID:27303323
Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.
2009-01-01
A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, student-t, slash and the variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on S&P500 index. Bayesian model selection criteria as well as out-of- sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-12-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible.
Schöniger, Anneli; Wöhling, Thomas; Samaniego, Luis; Nowak, Wolfgang
2014-01-01
Bayesian model selection or averaging objectively ranks a number of plausible, competing conceptual models based on Bayes' theorem. It implicitly performs an optimal trade-off between performance in fitting available data and minimum model complexity. The procedure requires determining Bayesian model evidence (BME), which is the likelihood of the observed data integrated over each model's parameter space. The computation of this integral is highly challenging because it is as high-dimensional as the number of model parameters. Three classes of techniques to compute BME are available, each with its own challenges and limitations: (1) Exact and fast analytical solutions are limited by strong assumptions. (2) Numerical evaluation quickly becomes unfeasible for expensive models. (3) Approximations known as information criteria (ICs) such as the AIC, BIC, or KIC (Akaike, Bayesian, or Kashyap information criterion, respectively) yield contradicting results with regard to model ranking. Our study features a theory-based intercomparison of these techniques. We further assess their accuracy in a simplistic synthetic example where for some scenarios an exact analytical solution exists. In more challenging scenarios, we use a brute-force Monte Carlo integration method as reference. We continue this analysis with a real-world application of hydrological model selection. This is a first-time benchmarking of the various methods for BME evaluation against true solutions. Results show that BME values from ICs are often heavily biased and that the choice of approximation method substantially influences the accuracy of model ranking. For reliable model selection, bias-free numerical methods should be preferred over ICs whenever computationally feasible. PMID:25745272
Bayesian shared frailty models for regional inference about wildlife survival
Heisey, D.M.
2012-01-01
One can joke that 'exciting statistics' is an oxymoron, but it is neither a joke nor an exaggeration to say that these are exciting times to be involved in statistical ecology. As Halstead et al.'s (2012) paper nicely exemplifies, recently developed Bayesian analyses can now be used to extract insights from data using techniques that would have been unavailable to the ecological researcher just a decade ago. Some object to this, implying that the subjective priors of the Bayesian approach is the pathway to perdition (e.g. Lele & Dennis, 2009). It is reasonable to ask whether these new approaches are really giving us anything that we could not obtain with traditional tried-and-true frequentist approaches. I believe the answer is a clear yes.
Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective
Barker, Richard J.; Link, William A.
2015-01-01
Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
ERIC Educational Resources Information Center
Karl, Andrew T.; Yang, Yan; Lohr, Sharon L.
2013-01-01
Value-added models have been widely used to assess the contributions of individual teachers and schools to students' academic growth based on longitudinal student achievement outcomes. There is concern, however, that ignoring the presence of missing values, which are common in longitudinal studies, can bias teachers' value-added scores.…
Bayesian Comparison of Alternative Graded Response Models for Performance Assessment Applications
ERIC Educational Resources Information Center
Zhu, Xiaowen; Stone, Clement A.
2012-01-01
This study examined the relative effectiveness of Bayesian model comparison methods in selecting an appropriate graded response (GR) model for performance assessment applications. Three popular methods were considered: deviance information criterion (DIC), conditional predictive ordinate (CPO), and posterior predictive model checking (PPMC). Using…
ERIC Educational Resources Information Center
Natesan, Prathiba; Limbers, Christine; Varni, James W.
2010-01-01
The present study presents the formulation of graded response models in the multilevel framework (as nonlinear mixed models) and demonstrates their use in estimating item parameters and investigating the group-level effects for specific covariates using Bayesian estimation. The graded response multilevel model (GRMM) combines the formulation of…
NASA Astrophysics Data System (ADS)
Werner, Johannes; Tingley, Martin
2015-04-01
Reconstructions of late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurement on tree rings, ice cores, and varved lake sediments. Considerable advances may be achievable if time uncertain proxies could be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches to accounting for time uncertainty are generally limited to repeating the reconstruction using each of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here we demonstrate how Bayesian Hierarchical climate reconstruction models can be augmented to account for time uncertain proxies. Critically, while a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the climate reconstruction, as compared with the current de-facto standard of sampling over all age models, provided there is sufficient information from other data sources in the region of the time-uncertain proxy. This approach can readily be generalized to non-layer counted proxies, such as those derived from marine sediments. Werner and Tingley, Climate of the Past Discussions (2014)
NASA Astrophysics Data System (ADS)
Werner, J. P.; Tingley, M. P.
2015-03-01
Reconstructions of the late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurements of tree rings, ice cores, and varved lake sediments. Considerable advances could be achieved if time-uncertain proxies were able to be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches for accounting for time uncertainty are generally limited to repeating the reconstruction using each one of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here, we demonstrate how Bayesian hierarchical climate reconstruction models can be augmented to account for time-uncertain proxies. Critically, although a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the resulting reconstructions, as compared with the current de facto standard of sampling over all age models, provided there is sufficient information from other data sources in the spatial region of the time-uncertain proxy. This approach can readily be generalized to non-layer-counted proxies, such as those derived from marine sediments.
NASA Astrophysics Data System (ADS)
Werner, J. P.; Tingley, M. P.
2014-12-01
Reconstructions of late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurement on tree rings, ice cores, and varved lake sediments. Considerable advances may be achievable if time uncertain proxies could be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches to accounting for time uncertainty are generally limited to repeating the reconstruction using each of an ensemble of age models, thereby inflating the final estimated uncertainty - in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space-time covariance structure of the climate to re-weight the possible age models. Here we demonstrate how Bayesian Hierarchical climate reconstruction models can be augmented to account for time uncertain proxies. Critically, while a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age-model probabilities decreases uncertainty in the climate reconstruction, as compared with the current de-facto standard of sampling over all age models, provided there is sufficient information from other data sources in the region of the time-uncertain proxy. This approach can readily be generalized to non-layer counted proxies, such as those derived from marine sediments.
D. L. Kelly
2007-06-01
Markov chain Monte Carlo (MCMC) techniques represent an extremely flexible and powerful approach to Bayesian modeling. This work illustrates the application of such techniques to time-dependent reliability of components with repair. The WinBUGS package is used to illustrate, via examples, how Bayesian techniques can be used for parametric statistical modeling of time-dependent component reliability. Additionally, the crucial, but often overlooked subject of model validation is discussed, and summary statistics for judging the model’s ability to replicate the observed data are developed, based on the posterior predictive distribution for the parameters of interest.
Bayesian model selection for a finite element model of a large civil aircraft
Hemez, F. M.; Rutherford, A. C.
2004-01-01
Nine aircraft stiffness parameters have been varied and used as inputs to a finite element model of an aircraft to generate natural frequency and deflection features (Goge, 2003). This data set (147 input parameter configurations and associated outputs) is now used to generate a metamodel, or a fast running surrogate model, using Bayesian model selection methods. Once a forward relationship is defined, the metamodel may be used in an inverse sense. That is, knowing the measured output frequencies and deflections, what were the input stiffness parameters that caused them?
Ma, Xiaoye; Chen, Yong; Cole, Stephen R; Chu, Haitao
2016-12-01
To account for between-study heterogeneity in meta-analysis of diagnostic accuracy studies, bivariate random effects models have been recommended to jointly model the sensitivities and specificities. As study design and population vary, the definition of disease status or severity could differ across studies. Consequently, sensitivity and specificity may be correlated with disease prevalence. To account for this dependence, a trivariate random effects model had been proposed. However, the proposed approach can only include cohort studies with information estimating study-specific disease prevalence. In addition, some diagnostic accuracy studies only select a subset of samples to be verified by the reference test. It is known that ignoring unverified subjects may lead to partial verification bias in the estimation of prevalence, sensitivities, and specificities in a single study. However, the impact of this bias on a meta-analysis has not been investigated. In this paper, we propose a novel hybrid Bayesian hierarchical model combining cohort and case-control studies and correcting partial verification bias at the same time. We investigate the performance of the proposed methods through a set of simulation studies. Two case studies on assessing the diagnostic accuracy of gadolinium-enhanced magnetic resonance imaging in detecting lymph node metastases and of adrenal fluorine-18 fluorodeoxyglucose positron emission tomography in characterizing adrenal masses are presented.
Mansourian, Marjan; Mahdiyeh, Zahra; Park, Jongbae J; Haghjooyejavanmard, Shaghayegh
2013-01-01
Background: To investigate the respective contribution of various biologic and psychosocial factors, especially Health Related Quality of Life (HRQOL) as a main outcome, in the natural history of acute low back pain (LBP) and to evaluate the impact of this condition on HRQOL. Methods: In a prospective cohort study For 24 weeks, 150 patients were assessed at an outpatient clinic in Korea consulting for low back and confirmed disc herniation duration at inclusion and treated with treatment package comprised of herbal medicines, acupuncture, bee venom acupuncture, and a Korean version of spinal manipulation (Chuna). Study participants were evaluated at baseline and every 4 weeks for 24 weeks. Low back intensity levels were measured on a visual analog scale (0-10), back function was evaluated with the Oswestry Disability Index (0-100), disability assessed by HRQOL assessed by the short form 36 health survey (0-100 in 8 different sub-categories). Results: Out of 150 patients, 128 completed the 24 weeks of traditional therapy. Patients reported improvements SF-36 outcome measures. At the completion of the study, low back pain scores improved by a mean of 3.3 (95% CI = 2.8 to 3.8). According to the results of our modeling, low back intensity level, back function and BMI measures had significant effects on quality of life during study. Interpreting the coefficients of modeling, the impact of the decreasing acute LBP episode on HRQOL by VAS and ODI outcomes, was high and important. Conclusions: This study highlights the large contribution of integrative package therapy as an effective preventive method for improving LBP patient's HRQOL. PMID:23626884
Simplifying Probability Elicitation and Uncertainty Modeling in Bayesian Networks
Paulson, Patrick R; Carroll, Thomas E; Sivaraman, Chitra; Neorr, Peter A; Unwin, Stephen D; Hossain, Shamina S
2011-04-16
In this paper we contribute two methods that simplify the demands of knowledge elicitation for particular types of Bayesian networks. The first method simplify the task of providing probabilities when the states that a random variable takes can be described by a new, fully ordered state set in which a state implies all the preceding states. The second method leverages Dempster-Shafer theory of evidence to provide a way for the expert to express the degree of ignorance that they feel about the estimates being provided.
Modelling the presence of disease under spatial misalignment using Bayesian latent Gaussian models.
Barber, Xavier; Conesa, David; Lladosa, Silvia; López-Quílez, Antonio
2016-04-18
Modelling patterns of the spatial incidence of diseases using local environmental factors has been a growing problem in the last few years. Geostatistical models have become popular lately because they allow estimating and predicting the underlying disease risk and relating it with possible risk factors. Our approach to these models is based on the fact that the presence/absence of a disease can be expressed with a hierarchical Bayesian spatial model that incorporates the information provided by the geographical and environmental characteristics of the region of interest. Nevertheless, our main interest here is to tackle the misalignment problem arising when information about possible covariates are partially (or totally) different than those of the observed locations and those in which we want to predict. As a result, we present two different models depending on the fact that there is uncertainty on the covariates or not. In both cases, Bayesian inference on the parameters and prediction of presence/absence in new locations are made by considering the model as a latent Gaussian model, which allows the use of the integrated nested Laplace approximation. In particular, the spatial effect is implemented with the stochastic partial differential equation approach. The methodology is evaluated on the presence of the Fasciola hepatica in Galicia, a North-West region of Spain.
Bayesian conditional-independence modeling of the AIDS epidemic in England and Wales
NASA Astrophysics Data System (ADS)
Gilks, Walter R.; De Angelis, Daniela; Day, Nicholas E.
We describe the use of conditional-independence modeling, Bayesian inference and Markov chain Monte Carlo, to model and project the HIV-AIDS epidemic in homosexual/bisexual males in England and Wales. Complexity in this analysis arises through selectively missing data, indirectly observed underlying processes, and measurement error. Our emphasis is on presentation and discussion of the concepts, not on the technicalities of this analysis, which can be found elsewhere [D. De Angelis, W.R. Gilks, N.E. Day, Bayesian projection of the the acquired immune deficiency syndrome epidemic (with discussion), Applied Statistics, in press].
Parameter Expanded Algorithms for Bayesian Latent Variable Modeling of Genetic Pleiotropy Data.
Xu, Lizhen; Craiu, Radu V; Sun, Lei; Paterson, Andrew D
2016-01-01
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to aa association study of various complication outcomes related to type 1 diabetes. This article has supplementary material online.
Bayesian non-parametric inference for stochastic epidemic models using Gaussian Processes
Xu, Xiaoguang; Kypraios, Theodore; O'Neill, Philip D.
2016-01-01
This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings. PMID:26993062
Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17
ERIC Educational Resources Information Center
Sinharay, Sandip
2004-01-01
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…
A Bayesian Multi-Level Factor Analytic Model of Consumer Price Sensitivities across Categories
ERIC Educational Resources Information Center
Duvvuri, Sri Devi; Gruca, Thomas S.
2010-01-01
Identifying price sensitive consumers is an important problem in marketing. We develop a Bayesian multi-level factor analytic model of the covariation among household-level price sensitivities across product categories that are substitutes. Based on a multivariate probit model of category incidence, this framework also allows the researcher to…
ERIC Educational Resources Information Center
Kessler, Lawrence M.
2013-01-01
In this paper I propose Bayesian estimation of a nonlinear panel data model with a fractional dependent variable (bounded between 0 and 1). Specifically, I estimate a panel data fractional probit model which takes into account the bounded nature of the fractional response variable. I outline estimation under the assumption of strict exogeneity as…
A Test of Bayesian Observer Models of Processing in the Eriksen Flanker Task
ERIC Educational Resources Information Center
White, Corey N.; Brown, Scott; Ratcliff, Roger
2012-01-01
Two Bayesian observer models were recently proposed to account for data from the Eriksen flanker task, in which flanking items interfere with processing of a central target. One model assumes that interference stems from a perceptual bias to process nearby items as if they are compatible, and the other assumes that the interference is due to…
ERIC Educational Resources Information Center
Story, Roger E.
1996-01-01
Discussion of the use of Latent Semantic Indexing to determine relevancy in information retrieval focuses on statistical regression and Bayesian methods. Topics include keyword searching; a multiple regression model; how the regression model can aid search methods; and limitations of this approach, including complexity, linearity, and…
Baird, Stuart J E; Santos, Filipe
2010-09-01
Approximate Bayesian computation (ABC) substitutes simulation for analytic models in Bayesian inference. Simulating evolutionary scenarios under Kimura's stepping stone model (KSS) might therefore allow inference over spatial genetic process where analytical results are difficult to obtain. ABC first creates a reference set of simulations and would proceed by comparing summary statistics over KSS simulations to summary statistics from localities sampled in the field, but: comparison of which localities and stepping stones? Identical stepping stones can be arranged so two localities fall in the same stepping stone, nearest or diagonal neighbours, or without contact. None is intrinsically correct, yet some choice must be made and this affects inference. We explore a Bayesian strategy for mapping field observations onto discrete stepping stones. We make Sundial, for projecting field data onto the plane, available. We generalize KSS over regular tilings of the plane. We show Bayesian averaging over the mapping between a continuous field area and discrete stepping stones improves the fit between KSS and isolation by distance expectations. We make Tiler Durden available for carrying out this Bayesian averaging. We describe a novel parameterization of KSS based on Wright's neighbourhood size, placing an upper bound on the geographic area represented by a stepping stone and make it available as m Vector. We generalize spatial coalescence recursions to continuous and discrete space cases and use these to numerically solve for KSS coalescence previously examined only using simulation. We thus provide applied and analytical resources for comparison of stepping stone simulations with field observations.
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2012-01-01
Many successful segmentation algorithms are based on Bayesian models in which prior anatomical knowledge is combined with the available image information. However, these methods typically have many free parameters that are estimated to obtain point estimates only, whereas a faithful Bayesian analysis would also consider all possible alternate values these parameters may take. In this paper, we propose to incorporate the uncertainty of the free parameters in Bayesian segmentation models more accurately by using Monte Carlo sampling. We demonstrate our technique by sampling atlas warps in a recent method for hippocampal subfield segmentation, and show a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the method also yields informative "error bars" on the segmentation results for each of the individual sub-structures.
NASA Astrophysics Data System (ADS)
Xu, Tianfang; Valocchi, Albert J.
2015-11-01
Numerical groundwater flow and solute transport models are usually subject to model structural error due to simplification and/or misrepresentation of the real system, which raises questions regarding the suitability of conventional least squares regression-based (LSR) calibration. We present a new framework that explicitly describes the model structural error statistically in an inductive, data-driven way. We adopt a fully Bayesian approach that integrates Gaussian process error models into the calibration, prediction, and uncertainty analysis of groundwater flow models. We test the usefulness of the fully Bayesian approach with a synthetic case study of the impact of pumping on surface-ground water interaction. We illustrate through this example that the Bayesian parameter posterior distributions differ significantly from parameters estimated by conventional LSR, which does not account for model structural error. For the latter method, parameter compensation for model structural error leads to biased, overconfident prediction under changing pumping condition. In contrast, integrating Gaussian process error models significantly reduces predictive bias and leads to prediction intervals that are more consistent with validation data. Finally, we carry out a generalized LSR recalibration step to assimilate the Bayesian prediction while preserving mass conservation and other physical constraints, using a full error covariance matrix obtained from Bayesian results. It is found that the recalibrated model achieved lower predictive bias compared to the model calibrated using conventional LSR. The results highlight the importance of explicit treatment of model structural error especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification.
2014-01-01
Background Transmission models can aid understanding of disease dynamics and are useful in testing the efficiency of control measures. The aim of this study was to formulate an appropriate stochastic Susceptible-Infectious-Resistant/Carrier (SIR) model for Salmonella Typhimurium in pigs and thus estimate the transmission parameters between states. Results The transmission parameters were estimated using data from a longitudinal study of three Danish farrow-to-finish pig herds known to be infected. A Bayesian model framework was proposed, which comprised Binomial components for the transition from susceptible to infectious and from infectious to carrier; and a Poisson component for carrier to infectious. Cohort random effects were incorporated into these models to allow for unobserved cohort-specific variables as well as unobserved sources of transmission, thus enabling a more realistic estimation of the transmission parameters. In the case of the transition from susceptible to infectious, the cohort random effects were also time varying. The number of infectious pigs not detected by the parallel testing was treated as unknown, and the probability of non-detection was estimated using information about the sensitivity and specificity of the bacteriological and serological tests. The estimate of the transmission rate from susceptible to infectious was 0.33 [0.06, 1.52], from infectious to carrier was 0.18 [0.14, 0.23] and from carrier to infectious was 0.01 [0.0001, 0.04]. The estimate for the basic reproduction ration (R 0 ) was 1.91 [0.78, 5.24]. The probability of non-detection was estimated to be 0.18 [0.12, 0.25]. Conclusions The proposed framework for stochastic SIR models was successfully implemented to estimate transmission rate parameters for Salmonella Typhimurium in swine field data. R 0 was 1.91, implying that there was dissemination of the infection within pigs of the same cohort. There was significant temporal-cohort variability, especially at the
Assessment of uncertainty in chemical models by Bayesian probabilities: Why, when, how?
NASA Astrophysics Data System (ADS)
Sahlin, Ullrika
2015-07-01
A prediction of a chemical property or activity is subject to uncertainty. Which type of uncertainties to consider, whether to account for them in a differentiated manner and with which methods, depends on the practical context. In chemical modelling, general guidance of the assessment of uncertainty is hindered by the high variety in underlying modelling algorithms, high-dimensionality problems, the acknowledgement of both qualitative and quantitative dimensions of uncertainty, and the fact that statistics offers alternative principles for uncertainty quantification. Here, a view of the assessment of uncertainty in predictions is presented with the aim to overcome these issues. The assessment sets out to quantify uncertainty representing error in predictions and is based on probability modelling of errors where uncertainty is measured by Bayesian probabilities. Even though well motivated, the choice to use Bayesian probabilities is a challenge to statistics and chemical modelling. Fully Bayesian modelling, Bayesian meta-modelling and bootstrapping are discussed as possible approaches. Deciding how to assess uncertainty is an active choice, and should not be constrained by traditions or lack of validated and reliable ways of doing it.
Automated parameter estimation for biological models using Bayesian statistical model checking
2015-01-01
Background Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Domain experts usually estimate the values of these parameters by fitting the model to experimental data. Model fitting is usually expressed as an optimization problem that requires minimizing a cost-function which measures some notion of distance between the model and the data. This optimization problem is often solved by combining local and global search methods that tend to perform well for the specific application domain. When some prior information about parameters is available, methods such as Bayesian inference are commonly used for parameter learning. Choosing the appropriate parameter search technique requires detailed domain knowledge and insight into the underlying system. Results Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model. Conclusions We have developed a new algorithmic technique for discovering parameters in complex stochastic models of
Naznin, Farhana; Currie, Graham; Logan, David; Sarvi, Majid
2016-07-01
Safety is a key concern in the design, operation and development of light rail systems including trams or streetcars as they impose crash risks on road users in terms of crash frequency and severity. The aim of this study is to identify key traffic, transit and route factors that influence tram-involved crash frequencies along tram route sections in Melbourne. A random effects negative binomial (RENB) regression model was developed to analyze crash frequency data obtained from Yarra Trams, the tram operator in Melbourne. The RENB modelling approach can account for spatial and temporal variations within observation groups in panel count data structures by assuming that group specific effects are randomly distributed across locations. The results identify many significant factors effecting tram-involved crash frequency including tram service frequency (2.71), tram stop spacing (-0.42), tram route section length (0.31), tram signal priority (-0.25), general traffic volume (0.18), tram lane priority (-0.15) and ratio of platform tram stops (-0.09). Findings provide useful insights on route section level tram-involved crashes in an urban tram or streetcar operating environment. The method described represents a useful planning tool for transit agencies hoping to improve safety performance.
NASA Astrophysics Data System (ADS)
Pham, Hai V.; Tsai, Frank T.-C.
2015-09-01
The lack of hydrogeological data and knowledge often results in different propositions (or alternatives) to represent uncertain model components and creates many candidate groundwater models using the same data. Uncertainty of groundwater head prediction may become unnecessarily high. This study introduces an experimental design to identify propositions in each uncertain model component and decrease the prediction uncertainty by reducing conceptual model uncertainty. A discrimination criterion is developed based on posterior model probability that directly uses data to evaluate model importance. Bayesian model averaging (BMA) is used to predict future observation data. The experimental design aims to find the optimal number and location of future observations and the number of sampling rounds such that the desired discrimination criterion is met. Hierarchical Bayesian model averaging (HBMA) is adopted to assess if highly probable propositions can be identified and the conceptual model uncertainty can be reduced by the experimental design. The experimental design is implemented to a groundwater study in the Baton Rouge area, Louisiana. We design a new groundwater head observation network based on existing USGS observation wells. The sources of uncertainty that create multiple groundwater models are geological architecture, boundary condition, and fault permeability architecture. All possible design solutions are enumerated using a multi-core supercomputer. Several design solutions are found to achieve an 80%-identifiable groundwater model in 5 years by using six or more existing USGS wells. The HBMA result shows that each highly probable proposition can be identified for each uncertain model component once the discrimination criterion is achieved. The variances of groundwater head predictions are significantly decreased by reducing posterior model probabilities of unimportant propositions.
Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models
NASA Astrophysics Data System (ADS)
Xia, Wei; Dai, Xiao-Xia; Feng, Yuan
2015-12-01
When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated via directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistence with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).
A Bayesian approach to the semi-analytic model of galaxy formation
NASA Astrophysics Data System (ADS)
Lu, Yu
It is believed that a wide range of physical processes conspire to shape the observed galaxy population but it remains unsure of their detailed interactions. The semi-analytic model (SAM) of galaxy formation uses multi-dimensional parameterizations of the physical processes of galaxy formation and provides a tool to constrain these underlying physical interactions. Because of the high dimensionality and large uncertainties in the model, the parametric problem of galaxy formation can be profitably tackled with a Bayesian-inference based approach, which allows one to constrain theory with data in a statistically rigorous way. In this thesis, I present a newly developed method to build SAM upon the framework of Bayesian inference. I show that, aided by advanced Markov-Chain Monte-Carlo algorithms, the method has the power to efficiently combine information from diverse data sources, rigorously establish confidence bounds on model parameters, and provide powerful probability-based methods for hypothesis test. Using various data sets (stellar mass function, conditional stellar mass function, K-band luminosity function, and cold gas mass functions) of galaxies in the local Universe, I carry out a series of Bayesian model inferences. The results show that SAM contains huge degeneracies among its parameters, indicating that some of the conclusions drawn previously with the conventional approach may not be truly valid but need to be revisited by the Bayesian approach. Second, some of the degeneracy of the model can be broken by adopting multiple data sets that constrain different aspects of the galaxy population. Third, the inferences reveal that model has challenge to simultaneously explain some important observational results, suggesting that some key physics governing the evolution of star formation and feedback may still be missing from the model. These analyses show clearly that the Bayesian inference based SAM can be used to perform systematic and statistically
Likelihood-free Bayesian computation for structural model calibration: a feasibility study
NASA Astrophysics Data System (ADS)
Jin, Seung-Seop; Jung, Hyung-Jo
2016-04-01
Finite element (FE) model updating is often used to associate FE models with corresponding existing structures for the condition assessment. FE model updating is an inverse problem and prone to be ill-posed and ill-conditioning when there are many errors and uncertainties in both an FE model and its corresponding measurements. In this case, it is important to quantify these uncertainties properly. Bayesian FE model updating is one of the well-known methods to quantify parameter uncertainty by updating our prior belief on the parameters with the available measurements. In Bayesian inference, likelihood plays a central role in summarizing the overall residuals between model predictions and corresponding measurements. Therefore, likelihood should be carefully chosen to reflect the characteristics of the residuals. It is generally known that very little or no information is available regarding the statistical characteristics of the residuals. In most cases, the likelihood is assumed to be the independent identically distributed Gaussian distribution with the zero mean and constant variance. However, this assumption may cause biased and over/underestimated estimates of parameters, so that the uncertainty quantification and prediction are questionable. To alleviate the potential misuse of the inadequate likelihood, this study introduced approximate Bayesian computation (i.e., likelihood-free Bayesian inference), which relaxes the need for an explicit likelihood by analyzing the behavior similarities between model predictions and measurements. We performed FE model updating based on likelihood-free Markov chain Monte Carlo (MCMC) without using the likelihood. Based on the result of the numerical study, we observed that the likelihood-free Bayesian computation can quantify the updating parameters correctly and its predictive capability for the measurements, not used in calibrated, is also secured.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio (PR) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95%CI: 1.005-1.265), 1.128(95%CI: 1.001-1.264) and 1.132(95%CI: 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95% CI: 1.055-1.206) and 1.126(95% CI: 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR, which was 1.125 (95%CI: 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR. Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Bayesian Modeling of Population Variability -- Practical Guidance and Pitfalls
Dana L. Kelly; Corwin L. Atwood
2008-05-01
With the advent of easy-to-use open-source software for Markov chain Monte Carlo (MCMC) simulation, hierarchical Bayesian analysis is gaining in popularity. This paper presents practical guidance for hierarchical Bayes analysis of typical problems in probabilistic safety assessment (PSA). The guidance is related to choosing parameterizations that accelerate convergence of the MCMC sampling and to illustrating the potential sensitivity of the results to the functional form chosen for the first-stage prior. This latter issue has significant ramifications because the mean of the average population variability curve (PVC) from hierarchical Bayes (or the mean of the point estimate distribution from empirical Bayes) can be very sensitive to this choice in cases where variability is large. Numerical examples are provided to illustrate the issues discussed.
Using Bayesian Model Selection to Characterize Neonatal Eeg Recordings
NASA Astrophysics Data System (ADS)
Mitchell, Timothy J.
2009-12-01
The brains of premature infants must undergo significant maturation outside of the womb and are thus particularly susceptible to injury. Electroencephalographic (EEG) recordings are an important diagnostic tool in determining if a newborn's brain is functioning normally or if injury has occurred. However, interpreting the recordings is difficult and requires the skills of a trained electroencephelographer. Because these EEG specialists are rare, an automated interpretation of newborn EEG recordings would increase access to an important diagnostic tool for physicians. To automate this procedure, we employ Bayesian probability theory to compute the posterior probability for the EEG features of interest and use the results in a program designed to mimic EEG specialists. Specifically, we will be identifying waveforms of varying frequency and amplitude, as well as periods of flat recordings where brain activity is minimal.
Berhane, Kiros; Molitor, Nuoo-Ting
2008-10-01
Flexible multilevel models are proposed to allow for cluster-specific smooth estimation of growth curves in a mixed-effects modeling format that includes subject-specific random effects on the growth parameters. Attention is then focused on models that examine between-cluster comparisons of the effects of an ecologic covariate of interest (e.g. air pollution) on nonlinear functionals of growth curves (e.g. maximum rate of growth). A Gibbs sampling approach is used to get posterior mean estimates of nonlinear functionals along with their uncertainty estimates. A second-stage ecologic random-effects model is used to examine the association between a covariate of interest (e.g. air pollution) and the nonlinear functionals. A unified estimation procedure is presented along with its computational and theoretical details. The models are motivated by, and illustrated with, lung function and air pollution data from the Southern California Children's Health Study.
Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik
2016-07-01
Programs for Bayesian inference of phylogeny currently implement a unique and ﬁxed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be speciﬁed interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-speciﬁcation language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous ﬂexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our ﬁeld. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.].
Dynamical modelling of NGC 6809: selecting the best model using Bayesian inference
NASA Astrophysics Data System (ADS)
Diakogiannis, Foivos I.; Lewis, Geraint F.; Ibata, Rodrigo A.
2014-02-01
The precise cosmological origin of globular clusters remains uncertain, a situation hampered by the struggle of observational approaches in conclusively identifying the presence, or not, of dark matter in these systems. In this paper, we address this question through an analysis of the particular case of NGC 6809. While previous studies have performed dynamical modelling of this globular cluster using a small number of available kinematic data, they did not perform appropriate statistical inference tests for the choice of best model description; such statistical inference for model selection is important since, in general, different models can result in significantly different inferred quantities. With the latest kinematic data, we use Bayesian inference tests for model selection and thus obtain the best-fitting models, as well as mass and dynamic mass-to-light ratio estimates. For this, we introduce a new likelihood function that provides more constrained distributions for the defining parameters of dynamical models. Initially, we consider models with a known distribution function, and then model the cluster using solutions of the spherically symmetric Jeans equation; this latter approach depends upon the mass density profile and anisotropy β parameter. In order to find the best description for the cluster we compare these models by calculating their Bayesian evidence. We find smaller mass and dynamic mass-to-light ratio values than previous studies, with the best-fitting Michie model for a constant mass-to-light ratio of Upsilon = 0.90^{+0.14}_{-0.14} and M_{dyn}=6.10^{+0.51}_{-0.88} × 10^4 M_{{⊙}}. We exclude the significant presence of dark matter throughout the cluster, showing that no physically motivated distribution of dark matter can be present away from the cluster core.
Höhna, Sebastian; Landis, Michael J.
2016-01-01
Programs for Bayesian inference of phylogeny currently implement a unique and ﬁxed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be speciﬁed interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-speciﬁcation language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous ﬂexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our ﬁeld. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] PMID:27235697
Bayesian state space models for dynamic genetic network construction across multiple tissues.
Liang, Yulan; Kelemen, Arpad
2016-08-01
Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases while estimating the dynamic changes of the temporal correlations and non-stationarity are the keys in this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Monte Carlo Markov Chain and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes.
Medical Inpatient Journey Modeling and Clustering: A Bayesian Hidden Markov Model Based Approach
Huang, Zhengxing; Dong, Wei; Wang, Fei; Duan, Huilong
2015-01-01
Modeling and clustering medical inpatient journeys is useful to healthcare organizations for a number of reasons including inpatient journey reorganization in a more convenient way for understanding and browsing, etc. In this study, we present a probabilistic model-based approach to model and cluster medical inpatient journeys. Specifically, we exploit a Bayesian Hidden Markov Model based approach to transform medical inpatient journeys into a probabilistic space, which can be seen as a richer representation of inpatient journeys to be clustered. Then, using hierarchical clustering on the matrix of similarities, inpatient journeys can be clustered into different categories w.r.t their clinical and temporal characteristics. We evaluated the proposed approach on a real clinical data set pertaining to the unstable angina treatment process. The experimental results reveal that our method can identify and model latent treatment topics underlying in personalized inpatient journeys, and yield impressive clustering quality. PMID:26958200
ERIC Educational Resources Information Center
Tchumtchoua, Sylvie; Dey, Dipak K.
2012-01-01
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
Hierarchical Bayesian Model (HBM) - Derived Estimates of Air Quality for 2007: Annual Report
This report describes EPA's Hierarchical Bayesian model generated (HBM) estimates of ozone (O_{3}) and fine particulate matter (PM_{2.5} particles with aerodynamic diameter < 2.5 microns)concentrations throughout the continental United States during the 2007 calen...
The Bayesian Evaluation of Categorization Models: Comment on Wills and Pothos (2012)
ERIC Educational Resources Information Center
Vanpaemel, Wolf; Lee, Michael D.
2012-01-01
Wills and Pothos (2012) reviewed approaches to evaluating formal models of categorization, raising a series of worthwhile issues, challenges, and goals. Unfortunately, in discussing these issues and proposing solutions, Wills and Pothos (2012) did not consider Bayesian methods in any detail. This means not only that their review excludes a major…
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Cai, Jing-Heng
2010-01-01
Analysis of ordered binary and unordered binary data has received considerable attention in social and psychological research. This article introduces a Bayesian approach, which has several nice features in practical applications, for analyzing nonlinear structural equation models with dichotomous data. We demonstrate how to use the software…
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2004 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2004 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Bayesian Structural Equation Modeling: A More Flexible Representation of Substantive Theory
ERIC Educational Resources Information Center
Muthen, Bengt; Asparouhov, Tihomir
2012-01-01
This article proposes a new approach to factor analysis and structural equation modeling using Bayesian analysis. The new approach replaces parameter specifications of exact zeros with approximate zeros based on informative, small-variance priors. It is argued that this produces an analysis that better reflects substantive theories. The proposed…
Bayesian Analysis of Structural Equation Models with Nonlinear Covariates and Latent Variables
ERIC Educational Resources Information Center
Song, Xin-Yuan; Lee, Sik-Yum
2006-01-01
In this article, we formulate a nonlinear structural equation model (SEM) that can accommodate covariates in the measurement equation and nonlinear terms of covariates and exogenous latent variables in the structural equation. The covariates can come from continuous or discrete distributions. A Bayesian approach is developed to analyze the…
A General and Flexible Approach to Estimating the Social Relations Model Using Bayesian Methods
ERIC Educational Resources Information Center
Ludtke, Oliver; Robitzsch, Alexander; Kenny, David A.; Trautwein, Ulrich
2013-01-01
The social relations model (SRM) is a conceptual, methodological, and analytical approach that is widely used to examine dyadic behaviors and interpersonal perception within groups. This article introduces a general and flexible approach to estimating the parameters of the SRM that is based on Bayesian methods using Markov chain Monte Carlo…
ERIC Educational Resources Information Center
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S.
2013-01-01
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Bayesian Analysis for Linearized Multi-Stage Models in Quantal Bioassay.
ERIC Educational Resources Information Center
Kuo, Lynn; Cohen, Michael P.
Bayesian methods for estimating dose response curves in quantal bioassay are studied. A linearized multi-stage model is assumed for the shape of the curves. A Gibbs sampling approach with data augmentation is employed to compute the Bayes estimates. In addition, estimation of the "relative additional risk" and the "risk specific…
Pretense, Counterfactuals, and Bayesian Causal Models: Why What Is Not Real Really Matters
ERIC Educational Resources Information Center
Weisberg, Deena S.; Gopnik, Alison
2013-01-01
Young children spend a large portion of their time pretending about non-real situations. Why? We answer this question by using the framework of Bayesian causal models to argue that pretending and counterfactual reasoning engage the same component cognitive abilities: disengaging with current reality, making inferences about an alternative…
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2002– Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2002 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2001 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2001 calendar year. HBM estimates provide the spatial and temporal variance of O_{ 3}...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2003 – Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2003 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2005 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2005 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
Hierarchical Bayesian Model (HBM) - Derived Estimates of Air Quality for 2008: Annual Report
This report describes EPA’s Hierarchical Bayesian model generated (HBM) estimates of ozone (O_{3}) and fine particulate matter (PM_{2.5}, particles with aerodynamic diameter < 2.5 microns) concentrations throughout the continental United States during the 2007 ca...
ERIC Educational Resources Information Center
Bekele, Rahel; McPherson, Maggie
2011-01-01
This research work presents a Bayesian Performance Prediction Model that was created in order to determine the strength of personality traits in predicting the level of mathematics performance of high school students in Addis Ababa. It is an automated tool that can be used to collect information from students for the purpose of effective group…
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2006 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O_{3} and PM_{2.5} concentrations throughout the continental United States during the 2006 calendar year. HBM estimates provide the spatial and temporal variance of O_{3} ...
A Robust Bayesian Approach for Structural Equation Models with Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Xia, Ye-Mao
2008-01-01
In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…
Bayesian Inference for Growth Mixture Models with Latent Class Dependent Missing Data
ERIC Educational Resources Information Center
Lu, Zhenqiu Laura; Zhang, Zhiyong; Lubke, Gitta
2011-01-01
"Growth mixture models" (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class…
A Bayesian modification to the Jelinski-Moranda software reliability growth model
NASA Technical Reports Server (NTRS)
Littlewood, B.; Sofer, A.
1983-01-01
The Jelinski-Moranda (JM) model for software reliability was examined. It is suggested that a major reason for the poor results given by this model is the poor performance of the maximum likelihood method (ML) of parameter estimation. A reparameterization and Bayesian analysis, involving a slight modelling change, are proposed. It is shown that this new Bayesian-Jelinski-Moranda model (BJM) is mathematically quite tractable, and several metrics of interest to practitioners are obtained. The BJM and JM models are compared by using several sets of real software failure data collected and in all cases the BJM model gives superior reliability predictions. A change in the assumption which underlay both models to present the debugging process more accurately is discussed.
NASA Astrophysics Data System (ADS)
Chen, X.; Hao, Z.; Devineni, N.; Lall, U.
2013-09-01
A Hierarchal Bayesian model for forecasting regional summer rainfall and streamflow season-ahead using exogenous climate variables for East Central China is presented. The model provides estimates of the posterior forecasted probability distribution for 12 rainfall and 2 streamflow stations considering parameter uncertainty, and cross-site correlation. The model has a multilevel structure with regression coefficients modeled from a common multivariate normal distribution results in partial-pooling of information across multiple stations and better representation of parameter and posterior distribution uncertainty. Covariance structure of the residuals across stations is explicitly modeled. Model performance is tested under leave-10-out cross-validation. Frequentist and Bayesian performance metrics used include Receiver Operating Characteristic, Reduction of Error, Coefficient of Efficiency, Rank Probability Skill Scores, and coverage by posterior credible intervals. The ability of the model to reliably forecast regional summer rainfall and streamflow season-ahead offers potential for developing adaptive water risk management strategies.
NASA Astrophysics Data System (ADS)
Chen, X.; Hao, Z.; Devineni, N.; Lall, U.
2014-04-01
A Hierarchal Bayesian model is presented for one season-ahead forecasts of summer rainfall and streamflow using exogenous climate variables for east central China. The model provides estimates of the posterior forecasted probability distribution for 12 rainfall and 2 streamflow stations considering parameter uncertainty, and cross-site correlation. The model has a multi-level structure with regression coefficients modeled from a common multi-variate normal distribution resulting in partial pooling of information across multiple stations and better representation of parameter and posterior distribution uncertainty. Covariance structure of the residuals across stations is explicitly modeled. Model performance is tested under leave-10-out cross-validation. Frequentist and Bayesian performance metrics used include receiver operating characteristic, reduction of error, coefficient of efficiency, rank probability skill scores, and coverage by posterior credible intervals. The ability of the model to reliably forecast season-ahead regional summer rainfall and streamflow offers potential for developing adaptive water risk management strategies.
Bayesian meta-analysis for longitudinal data models using multivariate mixture priors.
Lopes, Hedibert Freitas; Müller, Peter; Rosner, Gary L
2003-03-01
We propose a class of longitudinal data models with random effects that generalizes currently used models in two important ways. First, the random-effects model is a flexible mixture of multivariate normals, accommodating population heterogeneity, outliers, and nonlinearity in the regression on subject-specific covariates. Second, the model includes a hierarchical extension to allow for meta-analysis over related studies. The random-effects distributions are decomposed into one part that is common across all related studies (common measure), and one part that is specific to each study and that captures the variability intrinsic between patients within the same study. Both the common measure and the study-specific measures are parameterized as mixture-of-normals models. We carry out inference using reversible jump posterior simulation to allow a random number of terms in the mixtures. The sampler takes advantage of the small number of entertained models. The motivating application is the analysis of two studies carried out by the Cancer and Leukemia Group B (CALGB). In both studies, we record for each patient white blood cell counts (WBC) over time to characterize the toxic effects of treatment. The WBCs are modeled through a nonlinear hierarchical model that gathers the information from both studies.
An Application of Bayesian Approach in Modeling Risk of Death in an Intensive Care Unit
Wong, Rowena Syn Yin; Ismail, Noor Azina
2016-01-01
Background and Objectives There are not many studies that attempt to model intensive care unit (ICU) risk of death in developing countries, especially in South East Asia. The aim of this study was to propose and describe application of a Bayesian approach in modeling in-ICU deaths in a Malaysian ICU. Methods This was a prospective study in a mixed medical-surgery ICU in a multidisciplinary tertiary referral hospital in Malaysia. Data collection included variables that were defined in Acute Physiology and Chronic Health Evaluation IV (APACHE IV) model. Bayesian Markov Chain Monte Carlo (MCMC) simulation approach was applied in the development of four multivariate logistic regression predictive models for the ICU, where the main outcome measure was in-ICU mortality risk. The performance of the models were assessed through overall model fit, discrimination and calibration measures. Results from the Bayesian models were also compared against results obtained using frequentist maximum likelihood method. Results The study involved 1,286 consecutive ICU admissions between January 1, 2009 and June 30, 2010, of which 1,111 met the inclusion criteria. Patients who were admitted to the ICU were generally younger, predominantly male, with low co-morbidity load and mostly under mechanical ventilation. The overall in-ICU mortality rate was 18.5% and the overall mean Acute Physiology Score (APS) was 68.5. All four models exhibited good discrimination, with area under receiver operating characteristic curve (AUC) values approximately 0.8. Calibration was acceptable (Hosmer-Lemeshow p-values > 0.05) for all models, except for model M3. Model M1 was identified as the model with the best overall performance in this study. Conclusion Four prediction models were proposed, where the best model was chosen based on its overall performance in this study. This study has also demonstrated the promising potential of the Bayesian MCMC approach as an alternative in the analysis and modeling of
ERIC Educational Resources Information Center
Hinkle, Dennis; Houston, Charles A.
The purpose of this study was to present and evaluate Bayesian-type models for estimating probabilities of program completion and for predicting first quarter grade point averages of community college students entering certain allied health fields. Two Bayesian models were tested. Bayesian Model 1--Estimating Probabilities of Program…
Bayesian model selection validates a biokinetic model for zirconium processing in humans
2012-01-01
Background In radiation protection, biokinetic models for zirconium processing are of crucial importance in dose estimation and further risk analysis for humans exposed to this radioactive substance. They provide limiting values of detrimental effects and build the basis for applications in internal dosimetry, the prediction for radioactive zirconium retention in various organs as well as retrospective dosimetry. Multi-compartmental models are the tool of choice for simulating the processing of zirconium. Although easily interpretable, determining the exact compartment structure and interaction mechanisms is generally daunting. In the context of observing the dynamics of multiple compartments, Bayesian methods provide efficient tools for model inference and selection. Results We are the first to apply a Markov chain Monte Carlo approach to compute Bayes factors for the evaluation of two competing models for zirconium processing in the human body after ingestion. Based on in vivo measurements of human plasma and urine levels we were able to show that a recently published model is superior to the standard model of the International Commission on Radiological Protection. The Bayes factors were estimated by means of the numerically stable thermodynamic integration in combination with a recently developed copula-based Metropolis-Hastings sampler. Conclusions In contrast to the standard model the novel model predicts lower accretion of zirconium in bones. This results in lower levels of noxious doses for exposed individuals. Moreover, the Bayesian approach allows for retrospective dose assessment, including credible intervals for the initially ingested zirconium, in a significantly more reliable fashion than previously possible. All methods presented here are readily applicable to many modeling tasks in systems biology. PMID:22863152
ERIC Educational Resources Information Center
Rindskopf, David
2012-01-01
Muthen and Asparouhov (2012) made a strong case for the advantages of Bayesian methodology in factor analysis and structural equation models. I show additional extensions and adaptations of their methods and show how non-Bayesians can take advantage of many (though not all) of these advantages by using interval restrictions on parameters. By…
Bayesian Network Model with Application to Smart Power Semiconductor Lifetime Data.
Plankensteiner, Kathrin; Bluder, Olivia; Pilz, Jürgen
2015-09-01
In this article, Bayesian networks are used to model semiconductor lifetime data obtained from a cyclic stress test system. The data of interest are a mixture of log-normal distributions, representing two dominant physical failure mechanisms. Moreover, the data can be censored due to limited test resources. For a better understanding of the complex lifetime behavior, interactions between test settings, geometric designs, material properties, and physical parameters of the semiconductor device are modeled by a Bayesian network. Statistical toolboxes in MATLAB® have been extended and applied to find the best structure of the Bayesian network and to perform parameter learning. Due to censored observations Markov chain Monte Carlo (MCMC) simulations are employed to determine the posterior distributions. For model selection the automatic relevance determination (ARD) algorithm and goodness-of-fit criteria such as marginal likelihoods, Bayes factors, posterior predictive density distributions, and sum of squared errors of prediction (SSEP) are applied and evaluated. The results indicate that the application of Bayesian networks to semiconductor reliability provides useful information about the interactions between the significant covariates and serves as a reliable alternative to currently applied methods.
Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy
Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre
2013-01-01
Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
Comparison of a Bayesian network with a logistic regression model to forecast IgA nephropathy.
Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre
2013-01-01
Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation.
Equifinality of formal (DREAM) and informal (GLUE) bayesian approaches in hydrologic modeling?
Vrugt, Jasper A; Robinson, Bruce A; Ter Braak, Cajo J F; Gupta, Hoshin V
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Number-knower levels in young children: insights from Bayesian modeling.
Lee, Michael D; Sarnecka, Barbara W
2011-09-01
Lee and Sarnecka (2010) developed a Bayesian model of young children's behavior on the Give-N test of number knowledge. This paper presents two new extensions of the model, and applies the model to new data. In the first extension, the model is used to evaluate competing theories about the conceptual knowledge underlying children's behavior. One, the knower-levels theory, is basically a "stage" theory involving real conceptual change. The other, the approximate-meanings theory, assumes that the child's conceptual knowledge is relatively constant, although performance improves over time. In the second extension, the model is used to ask whether the same latent psychological variable (a child's number-knower level) can simultaneously account for behavior on two tasks (the Give-N task and the Fast-Cards task) with different performance demands. Together, these two demonstrations show the potential of the Bayesian modeling approach to improve our understanding of the development of human cognition.
Bayesian spatio-temporal modeling of particulate matter concentrations in Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Manga, Edna; Awang, Norhashidah
2016-06-01
This article presents an application of a Bayesian spatio-temporal Gaussian process (GP) model on particulate matter concentrations from Peninsular Malaysia. We analyze daily PM10 concentration levels from 35 monitoring sites in June and July 2011. The spatiotemporal model set in a Bayesian hierarchical framework allows for inclusion of informative covariates, meteorological variables and spatiotemporal interactions. Posterior density estimates of the model parameters are obtained by Markov chain Monte Carlo methods. Preliminary data analysis indicate information on PM10 levels at sites classified as industrial locations could explain part of the space time variations. We include the site-type indicator in our modeling efforts. Results of the parameter estimates for the fitted GP model show significant spatio-temporal structure and positive effect of the location-type explanatory variable. We also compute some validation criteria for the out of sample sites that show the adequacy of the model for predicting PM10 at unmonitored sites.
NASA Astrophysics Data System (ADS)
Tsai, Frank T.-C.; Elshall, Ahmed S.
2013-09-01
Analysts are often faced with competing propositions for each uncertain model component. How can we judge that we select a correct proposition(s) for an uncertain model component out of numerous possible propositions? We introduce the hierarchical Bayesian model averaging (HBMA) method as a multimodel framework for uncertainty analysis. The HBMA allows for segregating, prioritizing, and evaluating different sources of uncertainty and their corresponding competing propositions through a hierarchy of BMA models that forms a BMA tree. We apply the HBMA to conduct uncertainty analysis on the reconstructed hydrostratigraphic architectures of the Baton Rouge aquifer-fault system, Louisiana. Due to uncertainty in model data, structure, and parameters, multiple possible hydrostratigraphic models are produced and calibrated as base models. The study considers four sources of uncertainty. With respect to data uncertainty, the study considers two calibration data sets. With respect to model structure, the study considers three different variogram models, two geological stationarity assumptions and two fault conceptualizations. The base models are produced following a combinatorial design to allow for uncertainty segregation. Thus, these four uncertain model components with their corresponding competing model propositions result in 24 base models. The results show that the systematic dissection of the uncertain model components along with their corresponding competing propositions allows for detecting the robust model propositions and the major sources of uncertainty.
Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets
2015-01-01
On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user’s own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery. PMID:25994950
A Bayesian estimation of a stochastic predator-prey model of economic fluctuations
NASA Astrophysics Data System (ADS)
Dibeh, Ghassan; Luchinsky, Dmitry G.; Luchinskaya, Daria D.; Smelyanskiy, Vadim N.
2007-06-01
In this paper, we develop a Bayesian framework for the empirical estimation of the parameters of one of the best known nonlinear models of the business cycle: The Marx-inspired model of a growth cycle introduced by R. M. Goodwin. The model predicts a series of closed cycles representing the dynamics of labor's share and the employment rate in the capitalist economy. The Bayesian framework is used to empirically estimate a modified Goodwin model. The original model is extended in two ways. First, we allow for exogenous periodic variations of the otherwise steady growth rates of the labor force and productivity per worker. Second, we allow for stochastic variations of those parameters. The resultant modified Goodwin model is a stochastic predator-prey model with periodic forcing. The model is then estimated using a newly developed Bayesian estimation method on data sets representing growth cycles in France and Italy during the years 1960-2005. Results show that inference of the parameters of the stochastic Goodwin model can be achieved. The comparison of the dynamics of the Goodwin model with the inferred values of parameters demonstrates quantitative agreement with the growth cycle empirical data.
Lessons Learned from a Past Series of Bayesian Model Averaging studies for Soil/Plant Models
NASA Astrophysics Data System (ADS)
Nowak, Wolfgang; Wöhling, Thomas; Schöniger, Anneli
2015-04-01
In this study we evaluate the lessons learned about modelling soil/plant systems from analyzing evapotranspiration data, soil moisture and leaf area index. The data were analyzed with advanced tools from the area of Bayesian Model Averaging, model ranking and Bayesian Model Selection. We have generated a large variety of model conceptualizations by sampling random parameter sets from the vegetation components of the CERES, SUCROS, GECROS, and SPASS models and a common model for soil water movement via Monte-Carlo simulations. We used data from a one vegetation period of winter wheat at a field site in Nellingen, Germany. The data set includes soil moisture, actual evapotranspiration (ETa) from an eddy covariance tower, and leaf-area index (LAI). The focus of data analysis was on how one can do model ranking and model selection. Further analysis steps included the predictive reliability of different soil/plant models calibrated on different subsets of the available data. Our main conclusion is that model selection between different competing soil-plant models remains a large challenge, because 1. different data types and their combinations favor different models, because competing models are more or less good in simulating the coupling processes between the various compartments and their states, 2. singular events (such as the evolution of LAI during plant senescence) can dominate an entire time series, and long time series can be represented well by the few data values where the models disagree most, 3. the different data types differ in their discriminating power for model selection, 4. the level of noise present in ETa and LAI data, and the level of systematic model bias through simplifications of the complex system (e.g., assuming a few internally homogeneous soil layers) substantially reduce the confidence in model ranking and model selection, 5. none of the models withstands a hypothesis test against the available data, 6. even the assumed level of measurement
NASA Astrophysics Data System (ADS)
Gilet, Estelle; Diard, Julien; Palluel-Germain, Richard; Bessière, Pierre
2011-03-01
This paper is about modeling perception-action loops and, more precisely, the study of the influence of motor knowledge during perception tasks. We use the Bayesian Action-Perception (BAP) model, which deals with the sensorimotor loop involved in reading and writing cursive isolated letters and includes an internal simulation of movement loop. By using this probabilistic model we simulate letter recognition, both with and without internal motor simulation. Comparison of their performance yields an experimental prediction, which we set forth.
Robust video object tracking via Bayesian model averaging-based feature fusion
NASA Astrophysics Data System (ADS)
Dai, Yi; Liu, Bin
2016-08-01
We are concerned with tracking an object of interest in a video stream. We propose an algorithm that is robust against occlusion, the presence of confusing colors, abrupt changes in the object features and changes in scale. We develop the algorithm within a Bayesian modeling framework. The state-space model is used for capturing the temporal correlation in the sequence of frame images by modeling the underlying dynamics of the tracking system. The Bayesian model averaging (BMA) strategy is proposed for fusing multiclue information in the observations. Any number of object features is allowed to be involved in the proposed framework. Every feature represents one source of information to be fused and is associated with an observation model. The state inference is performed by employing the particle filter methods. In comparison with the related approaches, the BMA-based tracker is shown to have robustness, expressivity, and comprehensibility.
Bayesian inference for kinetic models of biotransformation using a generalized rate equation.
Ying, Shanshan; Zhang, Jiangjiang; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng
2017-03-06
Selecting proper rate equations for the kinetic models is essential to quantify biotransformation processes in the environment. Bayesian model selection method can be used to evaluate the candidate models. However, comparisons of all plausible models can result in high computational cost, while limiting the number of candidate models may lead to biased results. In this work, we developed an integrated Bayesian method to simultaneously perform model selection and parameter estimation by using a generalized rate equation. In the approach, the model hypotheses were represented by discrete parameters and the rate constants were represented by continuous parameters. Then Bayesian inference of the kinetic models was solved by implementing Markov Chain Monte Carlo simulation for parameter estimation with the mixed (i.e., discrete and continuous) priors. The validity of this approach was illustrated through a synthetic case and a nitrogen transformation experimental study. It showed that our method can successfully identify the plausible models and parameters, as well as uncertainties therein. Thus this method can provide a powerful tool to reveal more insightful information for the complex biotransformation processes.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.
Daunizeau, J; Friston, K J; Kiebel, S J
2009-11-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models
NASA Astrophysics Data System (ADS)
Daunizeau, J.; Friston, K. J.; Kiebel, S. J.
2009-11-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models
Daunizeau, J.; Friston, K.J.; Kiebel, S.J.
2009-01-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power. PMID:19862351
Lu, Dan; Ye, Ming; Curtis, Gary P.
2015-08-01
While Bayesian model averaging (BMA) has been widely used in groundwater modeling, it is infrequently applied to groundwater reactive transport modeling because of multiple sources of uncertainty in the coupled hydrogeochemical processes and because of the long execution time of each model run. To resolve these problems, this study analyzed different levels of uncertainty in a hierarchical way, and used the maximum likelihood version of BMA, i.e., MLBMA, to improve the computational efficiency. Our study demonstrates the applicability of MLBMA to groundwater reactive transport modeling in a synthetic case in which twenty-seven reactive transport models were designed to predict the reactive transport of hexavalent uranium (U(VI)) based on observations at a former uranium mill site near Naturita, CO. Moreover, these reactive transport models contain three uncertain model components, i.e., parameterization of hydraulic conductivity, configuration of model boundary, and surface complexation reactions that simulate U(VI) adsorption. These uncertain model components were aggregated into the alternative models by integrating a hierarchical structure into MLBMA. The modeling results of the individual models and MLBMA were analyzed to investigate their predictive performance. The predictive logscore results show that MLBMA generally outperforms the best model, suggesting that using MLBMA is a sound strategy to achieve more robust model predictions relative to a single model. MLBMA works best when the alternative models are structurally distinct and have diverse model predictions. When correlation in model structure exists, two strategies were used to improve predictive performance by retaining structurally distinct models or assigning smaller prior model probabilities to correlated models. Since the synthetic models were designed using data from the Naturita site, the results of this study are expected to provide guidance for real-world modeling. Finally, limitations of
Lu, Dan; Ye, Ming; Curtis, Gary P.
2015-08-01
While Bayesian model averaging (BMA) has been widely used in groundwater modeling, it is infrequently applied to groundwater reactive transport modeling because of multiple sources of uncertainty in the coupled hydrogeochemical processes and because of the long execution time of each model run. To resolve these problems, this study analyzed different levels of uncertainty in a hierarchical way, and used the maximum likelihood version of BMA, i.e., MLBMA, to improve the computational efficiency. Our study demonstrates the applicability of MLBMA to groundwater reactive transport modeling in a synthetic case in which twenty-seven reactive transport models were designed to predict themore » reactive transport of hexavalent uranium (U(VI)) based on observations at a former uranium mill site near Naturita, CO. Moreover, these reactive transport models contain three uncertain model components, i.e., parameterization of hydraulic conductivity, configuration of model boundary, and surface complexation reactions that simulate U(VI) adsorption. These uncertain model components were aggregated into the alternative models by integrating a hierarchical structure into MLBMA. The modeling results of the individual models and MLBMA were analyzed to investigate their predictive performance. The predictive logscore results show that MLBMA generally outperforms the best model, suggesting that using MLBMA is a sound strategy to achieve more robust model predictions relative to a single model. MLBMA works best when the alternative models are structurally distinct and have diverse model predictions. When correlation in model structure exists, two strategies were used to improve predictive performance by retaining structurally distinct models or assigning smaller prior model probabilities to correlated models. Since the synthetic models were designed using data from the Naturita site, the results of this study are expected to provide guidance for real-world modeling. Finally
Curtis, Gary P.; Lu, Dan; Ye, Ming
2015-01-01
While Bayesian model averaging (BMA) has been widely used in groundwater modeling, it is infrequently applied to groundwater reactive transport modeling because of multiple sources of uncertainty in the coupled hydrogeochemical processes and because of the long execution time of each model run. To resolve these problems, this study analyzed different levels of uncertainty in a hierarchical way, and used the maximum likelihood version of BMA, i.e., MLBMA, to improve the computational efficiency. This study demonstrates the applicability of MLBMA to groundwater reactive transport modeling in a synthetic case in which twenty-seven reactive transport models were designed to predict the reactive transport of hexavalent uranium (U(VI)) based on observations at a former uranium mill site near Naturita, CO. These reactive transport models contain three uncertain model components, i.e., parameterization of hydraulic conductivity, configuration of model boundary, and surface complexation reactions that simulate U(VI) adsorption. These uncertain model components were aggregated into the alternative models by integrating a hierarchical structure into MLBMA. The modeling results of the individual models and MLBMA were analyzed to investigate their predictive performance. The predictive logscore results show that MLBMA generally outperforms the best model, suggesting that using MLBMA is a sound strategy to achieve more robust model predictions relative to a single model. MLBMA works best when the alternative models are structurally distinct and have diverse model predictions. When correlation in model structure exists, two strategies were used to improve predictive performance by retaining structurally distinct models or assigning smaller prior model probabilities to correlated models. Since the synthetic models were designed using data from the Naturita site, the results of this study are expected to provide guidance for real-world modeling. Limitations of applying MLBMA to the
A Bayesian approach for inducing sparsity in generalized linear models with multi-category response
2015-01-01
Background The dimension and complexity of high-throughput gene expression data create many challenges for downstream analysis. Several approaches exist to reduce the number of variables with respect to small sample sizes. In this study, we utilized the Generalized Double Pareto (GDP) prior to induce sparsity in a Bayesian Generalized Linear Model (GLM) setting. The approach was evaluated using a publicly available microarray dataset containing 99 samples corresponding to four different prostate cancer subtypes. Results A hierarchical Sparse Bayesian GLM using GDP prior (SBGG) was developed to take into account the progressive nature of the response variable. We obtained an average overall classification accuracy between 82.5% and 94%, which was higher than Support Vector Machine, Random Forest or a Sparse Bayesian GLM using double exponential priors. Additionally, SBGG outperforms the other 3 methods in correctly identifying pre-metastatic stages of cancer progression, which can prove extremely valuable for therapeutic and diagnostic purposes. Importantly, using Geneset Cohesion Analysis Tool, we found that the top 100 genes produced by SBGG had an average functional cohesion p-value of 2.0E-4 compared to 0.007 to 0.131 produced by the other methods. Conclusions Using GDP in a Bayesian GLM model applied to cancer progression data results in better subclass prediction. In particular, the method identifies pre-metastatic stages of prostate cancer with substantially better accuracy and produces more functionally relevant gene sets. PMID:26423345
Dynamic causal modelling of electrographic seizure activity using Bayesian belief updating
Cooray, Gerald K.; Sengupta, Biswa; Douglas, Pamela K.; Friston, Karl
2016-01-01
Seizure activity in EEG recordings can persist for hours with seizure dynamics changing rapidly over time and space. To characterise the spatiotemporal evolution of seizure activity, large data sets often need to be analysed. Dynamic causal modelling (DCM) can be used to estimate the synaptic drivers of cortical dynamics during a seizure; however, the requisite (Bayesian) inversion procedure is computationally expensive. In this note, we describe a straightforward procedure, within the DCM framework, that provides efficient inversion of seizure activity measured with non-invasive and invasive physiological recordings; namely, EEG/ECoG. We describe the theoretical background behind a Bayesian belief updating scheme for DCM. The scheme is tested on simulated and empirical seizure activity (recorded both invasively and non-invasively) and compared with standard Bayesian inversion. We show that the Bayesian belief updating scheme provides similar estimates of time-varying synaptic parameters, compared to standard schemes, indicating no significant qualitative change in accuracy. The difference in variance explained was small (less than 5%). The updating method was substantially more efficient, taking approximately 5–10 min compared to approximately 1–2 h. Moreover, the setup of the model under the updating scheme allows for a clear specification of how neuronal variables fluctuate over separable timescales. This method now allows us to investigate the effect of fast (neuronal) activity on slow fluctuations in (synaptic) parameters, paving a way forward to understand how seizure activity is generated. PMID:26220742
Modelling household finances: A Bayesian approach to a multivariate two-part model.
Brown, Sarah; Ghosh, Pulak; Su, Li; Taylor, Karl
2015-09-01
We contribute to the empirical literature on household finances by introducing a Bayesian multivariate two-part model, which has been developed to further our understanding of household finances. Our flexible approach allows for the potential interdependence between the holding of assets and liabilities at the household level and also encompasses a two-part process to allow for differences in the influences on asset or liability holding and on the respective amounts held. Furthermore, the framework is dynamic in order to allow for persistence in household finances over time. Our findings endorse the joint modelling approach and provide evidence supporting the importance of dynamics. In addition, we find that certain independent variables exert different influences on the binary and continuous parts of the model thereby highlighting the flexibility of our framework and revealing a detailed picture of the nature of household finances.
Modelling household finances: A Bayesian approach to a multivariate two-part model
Brown, Sarah; Ghosh, Pulak; Su, Li; Taylor, Karl
2016-01-01
We contribute to the empirical literature on household finances by introducing a Bayesian multivariate two-part model, which has been developed to further our understanding of household finances. Our flexible approach allows for the potential interdependence between the holding of assets and liabilities at the household level and also encompasses a two-part process to allow for differences in the influences on asset or liability holding and on the respective amounts held. Furthermore, the framework is dynamic in order to allow for persistence in household finances over time. Our findings endorse the joint modelling approach and provide evidence supporting the importance of dynamics. In addition, we find that certain independent variables exert different influences on the binary and continuous parts of the model thereby highlighting the flexibility of our framework and revealing a detailed picture of the nature of household finances. PMID:27212801
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through models' marginal likelihood and prior probability. The heavy computation burden hinders the implementation of BMA prediction, especially for the elaborated marginal likelihood estimator. For overcoming the computation burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models through the numerical experiment of a synthetical groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods. The BMA-TIE has better predictive performance than other BMA predictions. TIE has high stability for estimating conceptual model's marginal likelihood. The repeated estimated conceptual model's marginal likelihoods by TIE have significant less variability than that estimated by other estimators. In addition, the SG surrogates are efficient to facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the required model executions of BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
Bayesian methods for spatial upscaling of process-based forest ecosystem models
NASA Astrophysics Data System (ADS)
van Oijen, M.; Cameron, D.; Reinds, G.; Thomson, A.
2010-12-01
Forest ecosystem models are common tools in environmental research: 78 different ones are listed by REM (http://ecobas.org/www-server/index.html). They tend to be process-based, parameter-rich simulators of biogeochemical fluxes at the point-support scale. Model inputs (weather, soil) should thus be provided at point-support too, rather than as regional averages. The high computational demand and point-support hampers the use of the models for larger regions. Models have been applied regionally, but without assessment of uncertainties due to upscaling from point to region. Bayesian methods are increasingly used to quantify parametric and structural uncertainties of forest models (Van Oijen et al. 2005), made possible by improvement in computing power and calibration algorithms such as MCMC. We present two case-studies of regional model application where we used MCMC to quantify uncertainties. The first study concerns Bayesian calibration of the VSD soil acidification model using European forest monitoring data (Reinds et al. 2008). Single-site calibration, applied separately to 122 sites, effectively converted prior parameter uncertainty into spatial variability. In contrast, multiple-site calibration, using the data from all sites simultaneously to estimate a common parameter vector, led to asymptotic collapse of the parameter distribution. The narrow posterior parameter distribution caused 20-100% higher values of NRMSE on 60 test sites than using nearest-neighbour single-site calibration results. Next, we applied the forest model BASFOR to the U.K. at a 20 x 20 km grid, to calculate carbon accumulation in biomass and soil (Van Oijen & Thomson 2010), as part of the U.K.’s Greenhouse Gas Inventory. For each grid cell, the model was run multiple times by taking different samples from the parameter distribution determined by a preceding Bayesian calibration. This allowed us to draw up a map of the spatial distribution of model output uncertainty. Uncertainty was
Precise Network Modeling of Systems Genetics Data Using the Bayesian Network Webserver.
Ziebarth, Jesse D; Cui, Yan
2017-01-01
The Bayesian Network Webserver (BNW, http://compbio.uthsc.edu/BNW ) is an integrated platform for Bayesian network modeling of biological datasets. It provides a web-based network modeling environment that seamlessly integrates advanced algorithms for probabilistic causal modeling and reasoning with Bayesian networks. BNW is designed for precise modeling of relatively small networks that contain less than 20 nodes. The structure learning algorithms used by BNW guarantee the discovery of the best (most probable) network structure given the data. To facilitate network modeling across multiple biological levels, BNW provides a very flexible interface that allows users to assign network nodes into different tiers and define the relationships between and within the tiers. This function is particularly useful for modeling systems genetics datasets that often consist of multiscalar heterogeneous genotype-to-phenotype data. BNW enables users to, within seconds or minutes, go from having a simply formatted input file containing a dataset to using a network model to make predictions about the interactions between variables and the potential effects of experimental interventions. In this chapter, we will introduce the functions of BNW and show how to model systems genetics datasets with BNW.
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate for different sources of uncertainty, multimodal analysis such as model combination, model selection, model elimination or model discrimination are becoming more popular. To illustrate theoretical and practical challenges of multimodal analysis, we use an example about microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions. Thus, improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomena, which is the soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by the following three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid Savannah that is characterized by pulsed precipitation. These five model evolve step-wise such that the first model includes none of these three mechanism, while the fifth model includes the three mechanisms. The basic component of Bayesian multimodal analysis is the estimation of marginal likelihood to rank the candidate models based on their overall likelihood with respect to observation data. The first part of the study focuses on using this Bayesian scheme to discriminate between these five candidate models. The second part discusses some theoretical and practical challenges, which are mainly the effect of likelihood function selection and the marginal likelihood estimation methods on both model ranking and Bayesian model averaging. The study shows that making valid inference from scientific data is not a trivial task, since we are not only uncertain about the candidate scientific models, but also about
Prediction and assimilation of surf-zone processes using a Bayesian network: Part II: Inverse models
Plant, Nathaniel G.; Holland, K. Todd
2011-01-01
A Bayesian network model has been developed to simulate a relatively simple problem of wave propagation in the surf zone (detailed in Part I). Here, we demonstrate that this Bayesian model can provide both inverse modeling and data-assimilation solutions for predicting offshore wave heights and depth estimates given limited wave-height and depth information from an onshore location. The inverse method is extended to allow data assimilation using observational inputs that are not compatible with deterministic solutions of the problem. These inputs include sand bar positions (instead of bathymetry) and estimates of the intensity of wave breaking (instead of wave-height observations). Our results indicate that wave breaking information is essential to reduce prediction errors. In many practical situations, this information could be provided from a shore-based observer or from remote-sensing systems. We show that various combinations of the assimilated inputs significantly reduce the uncertainty in the estimates of water depths and wave heights in the model domain. Application of the Bayesian network model to new field data demonstrated significant predictive skill (R2 = 0.7) for the inverse estimate of a month-long time series of offshore wave heights. The Bayesian inverse results include uncertainty estimates that were shown to be most accurate when given uncertainty in the inputs (e.g., depth and tuning parameters). Furthermore, the inverse modeling was extended to directly estimate tuning parameters associated with the underlying wave-process model. The inverse estimates of the model parameters not only showed an offshore wave height dependence consistent with results of previous studies but the uncertainty estimates of the tuning parameters also explain previously reported variations in the model parameters.
Prediction and assimilation of surf-zone processes using a Bayesian network: Part I: Forward models
Plant, Nathaniel G.; Holland, K. Todd
2011-01-01
Prediction of coastal processes, including waves, currents, and sediment transport, can be obtained from a variety of detailed geophysical-process models with many simulations showing significant skill. This capability supports a wide range of research and applied efforts that can benefit from accurate numerical predictions. However, the predictions are only as accurate as the data used to drive the models and, given the large temporal and spatial variability of the surf zone, inaccuracies in data are unavoidable such that useful predictions require corresponding estimates of uncertainty. We demonstrate how a Bayesian-network model can be used to provide accurate predictions of wave-height evolution in the surf zone given very sparse and/or inaccurate boundary-condition data. The approach is based on a formal treatment of a data-assimilation problem that takes advantage of significant reduction of the dimensionality of the model system. We demonstrate that predictions of a detailed geophysical model of the wave evolution are reproduced accurately using a Bayesian approach. In this surf-zone application, forward prediction skill was 83%, and uncertainties in the model inputs were accurately transferred to uncertainty in output variables. We also demonstrate that if modeling uncertainties were not conveyed to the Bayesian network (i.e., perfect data or model were assumed), then overly optimistic prediction uncertainties were computed. More consistent predictions and uncertainties were obtained by including model-parameter errors as a source of input uncertainty. Improved predictions (skill of 90%) were achieved because the Bayesian network simultaneously estimated optimal parameters while predicting wave heights.
A joint inter- and intrascale statistical model for Bayesian wavelet based image denoising.
Pizurica, Aleksandra; Philips, Wilfried; Lemahieu, Ignace; Acheroy, Marc
2002-01-01
This paper presents a new wavelet-based image denoising method, which extends a "geometrical" Bayesian framework. The new method combines three criteria for distinguishing supposedly useful coefficients from noise: coefficient magnitudes, their evolution across scales and spatial clustering of large coefficients near image edges. These three criteria are combined in a Bayesian framework. The spatial clustering properties are expressed in a prior model. The statistical properties concerning coefficient magnitudes and their evolution across scales are expressed in a joint conditional model. The three main novelties with respect to related approaches are (1) the interscale-ratios of wavelet coefficients are statistically characterized and different local criteria for distinguishing useful coefficients from noise are evaluated, (2) a joint conditional model is introduced, and (3) a novel anisotropic Markov random field prior model is proposed. The results demonstrate an improved denoising performance over related earlier techniques.
Variational Bayesian Inference Algorithms for Infinite Relational Model of Network Data.
Konishi, Takuya; Kubo, Takatomi; Watanabe, Kazuho; Ikeda, Kazushi
2015-09-01
Network data show the relationship among one kind of objects, such as social networks and hyperlinks on the Web. Many statistical models have been proposed for analyzing these data. For modeling cluster structures of networks, the infinite relational model (IRM) was proposed as a Bayesian nonparametric extension of the stochastic block model. In this brief, we derive the inference algorithms for the IRM of network data based on the variational Bayesian (VB) inference methods. After showing the standard VB inference, we derive the collapsed VB (CVB) inference and its variant called the zeroth-order CVB inference. We compared the performances of the inference algorithms using six real network datasets. The CVB inference outperformed the VB inference in most of the datasets, and the differences were especially larger in dense networks.
Bayesian inference using two-stage Laplace approximation for differential equation models
NASA Astrophysics Data System (ADS)
Dass, Sarat C.; Lee, Jaeyong; Lee, Kyoungjae
2016-11-01
We consider the problem of Bayesian inference for parameters in non-linear regression models whereby the underlying unknown response functions are formed by a set of differential equations. Bayesian methods of inference for unknown parameters rely primarily on the posterior obtained by Bayes rule. For differential equation models, analytic and closed forms for the posterior are not available and one has to resort to approximations. We propose a two-stage Laplace expansion to approximate the marginal likelihood, and hence, the posterior, to obtain an approximate closed form solution. For large sample sizes, the method of inference borrows from non-linear regression theory for maximum likelihood estimates, and is therefore, consistent. Our approach is exact in the limit and does not need the specification of an additional penalty parameter. Examples in this paper include the exponential model and SIR (Susceptible-Infected-Recovered) disease spread model.
NASA Technical Reports Server (NTRS)
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
A Bayesian Hidden Markov Model-based approach for anomaly detection in electronic systems
NASA Astrophysics Data System (ADS)
Dorj, E.; Chen, C.; Pecht, M.
Early detection of anomalies in any system or component prevents impending failures and enhances performance and availability. The complex architecture of electronics, the interdependency of component functionalities, and the miniaturization of most electronic systems make it difficult to detect and analyze anomalous behaviors. A Hidden Markov Model-based classification technique determines unobservable hidden behaviors of complex and remotely inaccessible electronic systems using observable signals. This paper presents a data-driven approach for anomaly detection in electronic systems based on a Bayesian Hidden Markov Model classification technique. The posterior parameters of the Hidden Markov Models are estimated using the conjugate prior method. An application of the developed Bayesian Hidden Markov Model-based anomaly detection approach is presented for detecting anomalous behavior in Insulated Gate Bipolar Transistors using experimental data. The detection results illustrate that the developed anomaly detection approach can help detect anomalous behaviors in electronic systems, which can help prevent system downtime and catastrophic failures.
SensibleSleep: A Bayesian Model for Learning Sleep Patterns from Smartphone Events.
Cuttone, Andrea; Bækgaard, Per; Sekara, Vedran; Jonsson, Håkan; Larsen, Jakob Eg; Lehmann, Sune
2017-01-01
We propose a Bayesian model for extracting sleep patterns from smartphone events. Our method is able to identify individuals' daily sleep periods and their evolution over time, and provides an estimation of the probability of sleep and wake transitions. The model is fitted to more than 400 participants from two different datasets, and we verify the results against ground truth from dedicated armband sleep trackers. We show that the model is able to produce reliable sleep estimates with an accuracy of 0.89, both at the individual and at the collective level. Moreover the Bayesian model is able to quantify uncertainty and encode prior knowledge about sleep patterns. Compared with existing smartphone-based systems, our method requires only screen on/off events, and is therefore much less intrusive in terms of privacy and more battery-efficient.
SensibleSleep: A Bayesian Model for Learning Sleep Patterns from Smartphone Events
Sekara, Vedran; Jonsson, Håkan; Larsen, Jakob Eg; Lehmann, Sune
2017-01-01
We propose a Bayesian model for extracting sleep patterns from smartphone events. Our method is able to identify individuals’ daily sleep periods and their evolution over time, and provides an estimation of the probability of sleep and wake transitions. The model is fitted to more than 400 participants from two different datasets, and we verify the results against ground truth from dedicated armband sleep trackers. We show that the model is able to produce reliable sleep estimates with an accuracy of 0.89, both at the individual and at the collective level. Moreover the Bayesian model is able to quantify uncertainty and encode prior knowledge about sleep patterns. Compared with existing smartphone-based systems, our method requires only screen on/off events, and is therefore much less intrusive in terms of privacy and more battery-efficient. PMID:28076375
Partially linear models with autoregressive scale-mixtures of normal errors: A Bayesian approach
NASA Astrophysics Data System (ADS)
Ferreira, Guillermo; Castro, Mauricio; Lachos, Victor H.
2012-10-01
Normality and independence of error terms is a typical assumption for partial linear models. However, such an assumption may be unrealistic on many fields such as economics, finance and biostatistics. In this paper, we develop a Bayesian analysis for partial linear model with first-order autoregressive errors belonging to the class of scale mixtures of normal (SMN) distributions. The proposed model provides a useful generalization of the symmetrical linear regression models with independent error, since the error distribution cover both correlated and thick-tailed distribution, and has a convenient hierarchical representation allowing to us an easily implementation of a Markov chain Monte Carlo (MCMC) scheme. In order to examine the robustness of this distribution against outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback-Leibler (K-L) divergence. The proposed methodology is applied to the Cuprum Company monthly returns.
Genome Scans for Detecting Footprints of Local Adaptation Using a Bayesian Factor Model
Duforet-Frebourg, Nicolas; Bazin, Eric; Blum, Michael G.B.
2014-01-01
There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as FST. However, there are important caveats with approaches related to FST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. In order to identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that it can achieve a 2-fold or more reduction of false discovery rate compared with the software BayeScan or with an FST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software. PMID:24899666
A Genomic Bayesian Multi-trait and Multi-environment Model
Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Toledo, Fernando H.; Pérez-Hernández, Oscar; Eskridge, Kent M.; Rutkoski, Jessica
2016-01-01
When information on multiple genotypes evaluated in multiple environments is recorded, a multi-environment single trait model for assessing genotype × environment interaction (G × E) is usually employed. Comprehensive models that simultaneously take into account the correlated traits and trait × genotype × environment interaction (T × G × E) are lacking. In this research, we propose a Bayesian model for analyzing multiple traits and multiple environments for whole-genome prediction (WGP) model. For this model, we used Half-t priors on each standard deviation term and uniform priors on each correlation of the covariance matrix. These priors were not informative and led to posterior inferences that were insensitive to the choice of hyper-parameters. We also developed a computationally efficient Markov Chain Monte Carlo (MCMC) under the above priors, which allowed us to obtain all required full conditional distributions of the parameters leading to an exact Gibbs sampling for the posterior distribution. We used two real data sets to implement and evaluate the proposed Bayesian method and found that when the correlation between traits was high (>0.5), the proposed model (with unstructured variance–covariance) improved prediction accuracy compared to the model with diagonal and standard variance–covariance structures. The R-software package Bayesian Multi-Trait and Multi-Environment (BMTME) offers optimized C++ routines to efficiently perform the analyses. PMID:27342738
Models and simulation of 3D neuronal dendritic trees using Bayesian networks.
López-Cruz, Pedro L; Bielza, Concha; Larrañaga, Pedro; Benavides-Piccione, Ruth; DeFelipe, Javier
2011-12-01
Neuron morphology is crucial for neuronal connectivity and brain information processing. Computational models are important tools for studying dendritic morphology and its role in brain function. We applied a class of probabilistic graphical models called Bayesian networks to generate virtual dendrites from layer III pyramidal neurons from three different regions of the neocortex of the mouse. A set of 41 morphological variables were measured from the 3D reconstructions of real dendrites and their probability distributions used in a machine learning algorithm to induce the model from the data. A simulation algorithm is also proposed to obtain new dendrites by sampling values from Bayesian networks. The main advantage of this approach is that it takes into account and automatically locates the relationships between variables in the data instead of using predefined dependencies. Therefore, the methodology can be applied to any neuronal class while at the same time exploiting class-specific properties. Also, a Bayesian network was defined for each part of the dendrite, allowing the relationships to change in the different sections and to model heterogeneous developmental factors or spatial influences. Several univariate statistical tests and a novel multivariate test based on Kullback-Leibler divergence estimation confirmed that virtual dendrites were similar to real ones. The analyses of the models showed relationships that conform to current neuroanatomical knowledge and support model correctness. At the same time, studying the relationships in the models can help to identify new interactions between variables related to dendritic morphology.
Calibration of complex models through Bayesian evidence synthesis: a demonstration and tutorial.
Jackson, Christopher H; Jit, Mark; Sharples, Linda D; De Angelis, Daniela
2015-02-01
Decision-analytic models must often be informed using data that are only indirectly related to the main model parameters. The authors outline how to implement a Bayesian synthesis of diverse sources of evidence to calibrate the parameters of a complex model. A graphical model is built to represent how observed data are generated from statistical models with unknown parameters and how those parameters are related to quantities of interest for decision making. This forms the basis of an algorithm to estimate a posterior probability distribution, which represents the updated state of evidence for all unknowns given all data and prior beliefs. This process calibrates the quantities of interest against data and, at the same time, propagates all parameter uncertainties to the results used for decision making. To illustrate these methods, the authors demonstrate how a previously developed Markov model for the progression of human papillomavirus (HPV-16) infection was rebuilt in a Bayesian framework. Transition probabilities between states of disease severity are inferred indirectly from cross-sectional observations of prevalence of HPV-16 and HPV-16-related disease by age, cervical cancer incidence, and other published information. Previously, a discrete collection of plausible scenarios was identified but with no further indication of which of these are more plausible. Instead, the authors derive a Bayesian posterior distribution, in which scenarios are implicitly weighted according to how well they are supported by the data. In particular, we emphasize the appropriate choice of prior distributions and checking and comparison of fitted models.
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Bayesian Analysis of Multivariate Probit Models with Surrogate Outcome Data
ERIC Educational Resources Information Center
Poon, Wai-Yin; Wang, Hai-Bin
2010-01-01
A new class of parametric models that generalize the multivariate probit model and the errors-in-variables model is developed to model and analyze ordinal data. A general model structure is assumed to accommodate the information that is obtained via surrogate variables. A hybrid Gibbs sampler is developed to estimate the model parameters. To…
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.
Duarte, Belmiro P M; Wong, Weng Kee
2015-08-01
This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
NASA Astrophysics Data System (ADS)
Varvia, Petri; Rautiainen, Miina; Seppänen, Aku
2017-04-01
Hyperspectral remote sensing data carry information on the leaf area index (LAI) of forests, and thus in principle, LAI can be estimated based on the data by inverting a forest reflectance model. However, LAI is usually not the only unknown in a reflectance model; especially, the leaf spectral albedo and understory reflectance are also not known. If the uncertainties of these parameters are not accounted for, the inversion of a forest reflectance model can lead to biased estimates for LAI. In this paper, we study the effects of reflectance model uncertainties on LAI estimates, and further, investigate whether the LAI estimates could recover from these uncertainties with the aid of Bayesian inference. In the proposed approach, the unknown leaf albedo and understory reflectance are estimated simultaneously with LAI from hyperspectral remote sensing data. The feasibility of the approach is tested with numerical simulation studies. The results show that in the presence of unknown parameters, the Bayesian LAI estimates which account for the model uncertainties outperform the conventional estimates that are based on biased model parameters. Moreover, the results demonstrate that the Bayesian inference can also provide feasible measures for the uncertainty of the estimated LAI.
BAYESIAN INFERENCE OF HIDDEN GAMMA WEAR PROCESS MODEL FOR SURVIVAL DATA WITH TIES
Sinha, Arijit; Chi, Zhiyi; Chen, Ming-Hui
2015-01-01
Survival data often contain tied event times. Inference without careful treatment of the ties can lead to biased estimates. This paper develops the Bayesian analysis of a stochastic wear process model to fit survival data that might have a large number of ties. Under a general wear process model, we derive the likelihood of parameters. When the wear process is a Gamma process, the likelihood has a semi-closed form that allows posterior sampling to be carried out for the parameters, hence achieving model selection using Bayesian deviance information criterion. An innovative simulation algorithm via direct forward sampling and Gibbs sampling is developed to sample event times that may have ties in the presence of arbitrary covariates; this provides a tool to assess the precision of inference. An extensive simulation study is reported and a data set is used to further illustrate the proposed methodology. PMID:26576105
Use of Bayesian Markov Chain Monte Carlo methods to model cost-of-illness data.
Cooper, Nicola J; Sutton, Alex J; Mugford, Miranda; Abrams, Keith R
2003-01-01
It is well known that the modeling of cost data is often problematic due to the distribution of such data. Commonly observed problems include 1) a strongly right-skewed data distribution and 2) a significant percentage of zero-cost observations. This article demonstrates how a hurdle model can be implemented from a Bayesian perspective by means of Markov Chain Monte Carlo simulation methods using the freely available software WinBUGS. Assessment of model fit is addressed through the implementation of two cross-validation methods. The relative merits of this Bayesian approach compared to the classical equivalent are discussed in detail. To illustrate the methods described, patient-specific non-health-care resource-use data from a prospective longitudinal study and the Norfolk Arthritis Register (NOAR) are utilized for 218 individuals with early inflammatory polyarthritis (IP). The NOAR database also includes information on various patient-level covariates.
Bagging linear sparse Bayesian learning models for variable selection in cancer diagnosis.
Lu, Chuan; Devos, Andy; Suykens, Johan A K; Arús, Carles; Van Huffel, Sabine
2007-05-01
This paper investigates variable selection (VS) and classification for biomedical datasets with a small sample size and a very high input dimension. The sequential sparse Bayesian learning methods with linear bases are used as the basic VS algorithm. Selected variables are fed to the kernel-based probabilistic classifiers: Bayesian least squares support vector machines (BayLS-SVMs) and relevance vector machines (RVMs). We employ the bagging techniques for both VS and model building in order to improve the reliability of the selected variables and the predictive performance. This modeling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a brain tumor multiclass classification problem using spectra acquired via magnetic resonance spectroscopy. The work is experimentally compared to other VS methods. It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.
BAYESIAN INFERENCE OF HIDDEN GAMMA WEAR PROCESS MODEL FOR SURVIVAL DATA WITH TIES.
Sinha, Arijit; Chi, Zhiyi; Chen, Ming-Hui
2015-10-01
Survival data often contain tied event times. Inference without careful treatment of the ties can lead to biased estimates. This paper develops the Bayesian analysis of a stochastic wear process model to fit survival data that might have a large number of ties. Under a general wear process model, we derive the likelihood of parameters. When the wear process is a Gamma process, the likelihood has a semi-closed form that allows posterior sampling to be carried out for the parameters, hence achieving model selection using Bayesian deviance information criterion. An innovative simulation algorithm via direct forward sampling and Gibbs sampling is developed to sample event times that may have ties in the presence of arbitrary covariates; this provides a tool to assess the precision of inference. An extensive simulation study is reported and a data set is used to further illustrate the proposed methodology.
Bayesian evidence computation for model selection in non-linear geoacoustic inference problems.
Dettmer, Jan; Dosso, Stan E; Osler, John C
2010-12-01
This paper applies a general Bayesian inference approach, based on Bayesian evidence computation, to geoacoustic inversion of interface-wave dispersion data. Quantitative model selection is carried out by computing the evidence (normalizing constants) for several model parameterizations using annealed importance sampling. The resulting posterior probability density estimate is compared to estimates obtained from Metropolis-Hastings sampling to ensure consistent results. The approach is applied to invert interface-wave dispersion data collected on the Scotian Shelf, off the east coast of Canada for the sediment shear-wave velocity profile. Results are consistent with previous work on these data but extend the analysis to a rigorous approach including model selection and uncertainty analysis. The results are also consistent with core samples and seismic reflection measurements carried out in the area.
Xu, Lei; Johnson, Timothy D.; Nichols, Thomas E.; Nee, Derek E.
2010-01-01
Summary The aim of this work is to develop a spatial model for multi-subject fMRI data. There has been extensive work on univariate modeling of each voxel for single and multi-subject data, some work on spatial modeling of single-subject data, and some recent work on spatial modeling of multi-subject data. However, there has been no work on spatial models that explicitly account for inter-subject variability in activation locations. In this work, we use the idea of activation centers and model the inter-subject variability in activation locations directly. Our model is specified in a Bayesian hierarchical frame work which allows us to draw inferences at all levels: the population level, the individual level and the voxel level. We use Gaussian mixtures for the probability that an individual has a particular activation. This helps answer an important question which is not addressed by any of the previous methods: What proportion of subjects had a significant activity in a given region. Our approach incorporates the unknown number of mixture components into the model as a parameter whose posterior distribution is estimated by reversible jump Markov Chain Monte Carlo. We demonstrate our method with a fMRI study of resolving proactive interference and show dramatically better precision of localization with our method relative to the standard mass-univariate method. Although we are motivated by fMRI data, this model could easily be modified to handle other types of imaging data. PMID:19210732
NASA Astrophysics Data System (ADS)
Freni, Gabriele; Mannina, Giorgio
In urban drainage modelling, uncertainty analysis is of undoubted necessity. However, uncertainty analysis in urban water-quality modelling is still in its infancy and only few studies have been carried out. Therefore, several methodological aspects still need to be experienced and clarified especially regarding water quality modelling. The use of the Bayesian approach for uncertainty analysis has been stimulated by its rigorous theoretical framework and by the possibility of evaluating the impact of new knowledge on the modelling predictions. Nevertheless, the Bayesian approach relies on some restrictive hypotheses that are not present in less formal methods like the Generalised Likelihood Uncertainty Estimation (GLUE). One crucial point in the application of Bayesian method is the formulation of a likelihood function that is conditioned by the hypotheses made regarding model residuals. Statistical transformations, such as the use of Box-Cox equation, are generally used to ensure the homoscedasticity of residuals. However, this practice may affect the reliability of the analysis leading to a wrong uncertainty estimation. The present paper aims to explore the influence of the Box-Cox equation for environmental water quality models. To this end, five cases were considered one of which was the “real” residuals distributions (i.e. drawn from available data). The analysis was applied to the Nocella experimental catchment (Italy) which is an agricultural and semi-urbanised basin where two sewer systems, two wastewater treatment plants and a river reach were monitored during both dry and wet weather periods. The results show that the uncertainty estimation is greatly affected by residual transformation and a wrong assumption may also affect the evaluation of model uncertainty. The use of less formal methods always provide an overestimation of modelling uncertainty with respect to Bayesian method but such effect is reduced if a wrong assumption is made regarding the
Bayesian latent variable models for the analysis of experimental psychology data.
Merkle, Edgar C; Wang, Ting
2016-03-18
In this paper, we address the use of Bayesian factor analysis and structural equation models to draw inferences from experimental psychology data. While such application is non-standard, the models are generally useful for the unified analysis of multivariate data that stem from, e.g., subjects' responses to multiple experimental stimuli. We first review the models and the parameter identification issues inherent in the models. We then provide details on model estimation via JAGS and on Bayes factor estimation. Finally, we use the models to re-analyze experimental data on risky choice, comparing the approach to simpler, alternative methods.
Eacker, Daniel R; Lukacs, Paul M; Proffitt, Kelly M; Hebblewhite, Mark
2017-02-11
To successfully respond to changing habitat, climate or harvest, managers need to identify the most effective strategies to reverse population trends of declining species and/or manage harvest of game species. A classic approach in conservation biology for the last two decades has been the use of matrix population models to determine the most important vital rates affecting population growth rate (λ), that is, sensitivity. Ecologists quickly realized the critical role of environmental variability in vital rates affecting population growth rate by developing approaches such as life-stage simulation analysis (LSA) that account for both sensitivity and variability of a vital rate. These LSA methods used matrix-population modeling and Monte Carlo simulation methods, but faced challenges in integrating data from different sources, disentangling process and sampling variation, and in their flexibility. Here, we developed a Bayesian integrated population model (IPM) for two populations of a large herbivore, elk (Cervus canadensis) in Montana, USA. We then extended the IPM to evaluate sensitivity in a Bayesian framework. We integrated known-fate survival data from radio-marked adults and juveniles, fecundity data, and population counts in a hierarchical population model that explicitly accounted for process and sampling variance. Next, we tested the prevailing paradigm in large herbivore population ecology that juvenile survival of neonates <90 days old drives λ using our Bayesian LSA approach. In contrast to the prevailing paradigm in large herbivore ecology, we found that adult female survival explained more of the variation in λ than elk calf survival, and that summer and winter elk calf survival periods were nearly equivalent in importance for λ. Our Bayesian IPM improved precision of our vital rate estimates and highlighted discrepancies between count and vital rate data that could refine population monitoring, demonstrating that combining sensitivity analysis
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of linear model of Coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and multiphase flow computer experiment in a full scale regenerator of a carbon capture unit. The use of the BTMGP with LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.
NASA Astrophysics Data System (ADS)
Xu, T.; Valocchi, A. J.
2014-12-01
Effective water resource management typically relies on numerical models to analyse groundwater flow and solute transport processes. These models are usually subject to model structure error due to simplification and/or misrepresentation of the real system. As a result, the model outputs may systematically deviate from measurements, thus violating a key assumption for traditional regression-based calibration and uncertainty analysis. On the other hand, model structure error induced bias can be described statistically in an inductive, data-driven way based on historical model-to-measurement misfit. We adopt a fully Bayesian approach that integrates a Gaussian process error model to account for model structure error to the calibration, prediction and uncertainty analysis of groundwater models. The posterior distributions of parameters of the groundwater model and the Gaussian process error model are jointly inferred using DREAM, an efficient Markov chain Monte Carlo sampler. We test the usefulness of the fully Bayesian approach towards a synthetic case study of surface-ground water interaction under changing pumping conditions. We first illustrate through this example that traditional least squares regression without accounting for model structure error yields biased parameter estimates due to parameter compensation as well as biased predictions. In contrast, the Bayesian approach gives less biased parameter estimates. Moreover, the integration of a Gaussian process error model significantly reduces predictive bias and leads to prediction intervals that are more consistent with observations. The results highlight the importance of explicit treatment of model structure error especially in circumstances where subsequent decision-making and risk analysis require accurate prediction and uncertainty quantification. In addition, the data-driven error modelling approach is capable of extracting more information from observation data than using a groundwater model alone.