Science.gov

Sample records for Bayesian random-effects model

  1. Bayesian nonparametric centered random effects models with variable selection.

    PubMed

    Yang, Mingan

    2013-03-01

    In a linear mixed effects model, it is common practice to assume that the random effects follow a parametric distribution such as a normal distribution with mean zero. However, in the case of variable selection, substantial violation of the normality assumption can potentially impact the subset selection and result in poor interpretation and even incorrect results. In nonparametric random effects models, the random effects generally have a nonzero mean, which causes an identifiability problem for the fixed effects that are paired with the random effects. In this article, we focus on a Bayesian method for variable selection. We characterize the subject-specific random effects nonparametrically with a Dirichlet process and resolve the resulting bias simultaneously. In particular, we propose flexible modeling of the conditional distribution of the random effects with changes across the predictor space. The approach is implemented using a stochastic search Gibbs sampler to identify subsets of fixed effects and random effects to be included in the model. Simulations are provided to evaluate and compare the performance of our approach with existing ones. We then apply the new approach to a real data example, a cross-country and interlaboratory rodent uterotrophic bioassay.
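
    The backbone of such a model, the Dirichlet process prior on the random-effects distribution, can be sketched through its truncated stick-breaking construction. Below is a minimal numpy illustration; it is not the authors' centered-DP stochastic-search sampler, and the concentration parameter, base measure, and truncation level are hypothetical choices.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def dp_stick_breaking(alpha, base_draw, n_atoms, rng):
        """Truncated stick-breaking draw of a random distribution G ~ DP(alpha, G0)."""
        v = rng.beta(1.0, alpha, size=n_atoms)                      # stick fractions
        w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))   # atom weights
        atoms = np.array([base_draw() for _ in range(n_atoms)])     # atom locations
        return w / w.sum(), atoms                                   # renormalize truncation

    # Base measure G0 = N(0, 1); subject-specific random effects are drawn from G,
    # so subjects cluster on a discrete set of effect values
    weights, atoms = dp_stick_breaking(2.0, lambda: rng.normal(0.0, 1.0), 50, rng)
    b = rng.choice(atoms, size=30, p=weights)   # random effects for 30 subjects
    print("distinct effect values among 30 subjects:", np.unique(b).size)
    ```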

  2. A Bayesian Random Effects Model for Testlets.

    ERIC Educational Resources Information Center

    Bradlow, Eric T.; Wainer, Howard; Wang, Xiaohui

    1999-01-01

    Proposes a parametric approach that involves a modification of standard Item Response Theory models that explicitly accounts for the nesting of items within the same testlets and that can be applied to multiple-choice sections comprising a mixture of independent items and testlets.

  3. A Robust Bayesian Random Effects Model for Nonlinear Calibration Problems

    PubMed Central

    Fong, Y.; Wakefield, J.; De Rosa, S.; Frahm, N.

    2013-01-01

    In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this paper, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a re-parameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods. PMID:22551415
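
    The five-parameter logistic (5PL) curve at the core of such calibration is easy to write down and fit by least squares. The sketch below uses simulated data and scipy's curve_fit; it shows only the standard 5PL form, not the paper's robust re-parameterization or dependent-error mixture, and every parameter value is made up.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def logistic_5pl(x, a, b, c, d, g):
        """Five-parameter logistic: d + (a - d) / (1 + (x / c)**b)**g."""
        return d + (a - d) / (1.0 + (x / c) ** b) ** g

    # Simulated calibration data: known concentrations x, noisy responses y
    rng = np.random.default_rng(1)
    x = np.logspace(-1, 3, 20)
    y = logistic_5pl(x, 30000.0, 1.2, 50.0, 100.0, 0.8) + rng.normal(0.0, 300.0, x.size)

    popt, pcov = curve_fit(logistic_5pl, x, y, p0=[30000, 1, 50, 100, 1], maxfev=10000)
    print("fitted 5PL parameters:", np.round(popt, 2))
    ```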

  4. A Bayesian random effects discrete-choice model for resource selection: Population-level selection inference

    USGS Publications Warehouse

    Thomas, D.L.; Johnson, D.; Griffith, B.

    2006-01-01

    We present a Bayesian random-effects model for assessing resource selection, modeling the probability of use of land units characterized by discrete and continuous measures. The model provides simultaneous estimation of both individual- and population-level selection. The deviance information criterion (DIC), a Bayesian alternative to AIC that is sample-size specific, is used for model selection. Aerial radiolocation data from 76 adult female caribou (Rangifer tarandus) and calf pairs during 1 year on an Arctic coastal plain calving ground were used to illustrate the models and to assess population-level selection of landscape attributes, as well as individual heterogeneity of selection. Landscape attributes included elevation, NDVI (a measure of forage greenness), and land cover-type classification. Results from the first stage of a 2-stage model-selection procedure indicated substantial heterogeneity among cow-calf pairs with respect to selection of the landscape attributes. In the second stage, selection among models that included heterogeneity indicated that, at the population level, NDVI and land cover class were significant attributes for selection of different landscapes by pairs on the calving ground. Population-level selection coefficients indicate that the pairs generally select landscapes with higher levels of NDVI, but the relationship is quadratic: the highest rate of selection occurs at values of NDVI less than the maximum observed. Results for land cover-class selection coefficients indicate that wet sedge, moist sedge, herbaceous tussock tundra, and shrub tussock tundra are selected at approximately the same rate, while alpine and sparsely vegetated landscapes are selected at a lower rate. Furthermore, the variability in selection by individual caribou for moist sedge and sparsely vegetated landscapes is large relative to the variability in selection of other land cover types. The example analysis illustrates that, while sometimes computationally intense, a…
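
    DIC itself is computed from MCMC output with the standard formula of Spiegelhalter et al. (2002); a minimal sketch, assuming the caller supplies log-likelihood evaluations (function and argument names are ours, not from the paper):

    ```python
    import numpy as np

    def dic(log_lik_draws, log_lik_at_post_mean):
        """DIC = Dbar + pD, where Dbar is the posterior mean deviance and
        pD = Dbar - D(theta_bar) is the effective number of parameters."""
        d_bar = np.mean(-2.0 * np.asarray(log_lik_draws))   # posterior mean deviance
        d_hat = -2.0 * log_lik_at_post_mean                 # deviance at posterior mean
        p_d = d_bar - d_hat
        return d_bar + p_d, p_d
    ```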

  5. A Bayesian nonlinear random effects model for identification of defective batteries from lot samples

    NASA Astrophysics Data System (ADS)

    Cripps, Edward; Pecht, Michael

    2017-02-01

    Numerous materials and processes go into the manufacture of lithium-ion batteries, resulting in variations across batteries' capacity fade measurements. Accounting for this variability is essential when determining whether batteries are performing satisfactorily. Motivated by a real manufacturing problem, this article presents an approach to assess whether lithium-ion batteries from a production lot are not representative of a healthy population of batteries from earlier production lots, and to determine, based on capacity fade data, the earliest stage (in terms of cycles) that battery anomalies can be identified. The approach involves the use of a double exponential function to describe nonlinear capacity fade data. To capture the variability of repeated measurements on a number of individual batteries, the double exponential function is then embedded as the individual batteries' trajectories in a Bayesian random effects model. The model allows for probabilistic predictions of capacity fading not only at the underlying mean process level but also at the individual battery level. The results show good predictive coverage for individual batteries and demonstrate that, for our data, non-healthy lithium-ion batteries can be identified in as few as 50 cycles.
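
    The individual-battery building block, the double exponential fade curve, can be sketched and fitted by least squares; in the paper this curve is instead embedded as each battery's trajectory inside the Bayesian random effects model. All parameter values below are invented.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def double_exponential(cycle, a, b, c, d):
        """Capacity fade: a * exp(b * cycle) + c * exp(d * cycle)."""
        return a * np.exp(b * cycle) + c * np.exp(d * cycle)

    # Hypothetical normalized capacity measurements for one battery
    rng = np.random.default_rng(2)
    cycles = np.arange(0.0, 300.0, 10.0)
    capacity = (double_exponential(cycles, 0.08, -0.02, 1.0, -0.0005)
                + rng.normal(0.0, 0.005, cycles.size))

    popt, _ = curve_fit(double_exponential, cycles, capacity,
                        p0=[0.1, -0.01, 1.0, -0.001], maxfev=20000)
    print("fitted fade parameters:", np.round(popt, 4))
    ```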

  6. Exploring Neighborhood Influences on Small-Area Variations in Intimate Partner Violence Risk: A Bayesian Random-Effects Modeling Approach

    PubMed Central

    Gracia, Enrique; López-Quílez, Antonio; Marco, Miriam; Lladosa, Silvia; Lila, Marisol

    2014-01-01

    This paper uses spatial data of cases of intimate partner violence against women (IPVAW) to examine neighborhood-level influences on small-area variations in IPVAW risk in a police district of the city of Valencia (Spain). To analyze area variations in IPVAW risk and its association with neighborhood-level explanatory variables we use a Bayesian spatial random-effects modeling approach, as well as disease mapping methods to represent risk probabilities in each area. Analyses show that IPVAW cases are more likely in areas of high immigrant concentration, high public disorder and crime, and high physical disorder. Results also show a spatial component indicating remaining variability attributable to spatially structured random effects. Bayesian spatial modeling offers a new perspective to identify IPVAW high and low risk areas, and provides a new avenue for the design of better-informed prevention and intervention strategies. PMID:24413701

  7. Bayesian informative dropout model for longitudinal binary data with random effects using conditional and joint modeling approaches.

    PubMed

    Chan, Jennifer S K

    2016-05-01

    Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, the dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes, as well as the dropout indicator at each occasion, are logit-linear in some covariates and outcomes. This model, adopting a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model in which the outcome and dropout are modelled jointly, with their dependency formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user-friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Results show that the treatment time effect is still significant, but weaker, after allowing for an ID process in the data. Finally, the effect of dropout on parameter estimates is evaluated through simulation studies.

  8. Bayesian random effect models incorporating real-time weather and traffic data to investigate mountainous freeway hazardous factors.

    PubMed

    Yu, Rongjie; Abdel-Aty, Mohamed; Ahmed, Mohamed

    2013-01-01

    Freeway crash occurrences are highly influenced by geometric characteristics, traffic status, weather conditions and drivers' behavior. For a mountainous freeway that suffers from adverse weather conditions, it is critical to incorporate real-time weather information and traffic data into the crash frequency study. In this paper, a Bayesian inference method was employed to model one year of crash data on I-70 in the state of Colorado. Real-time weather and traffic variables, along with geometric characteristic variables, were evaluated in the models. Two scenarios were considered: a season-based case and a crash type-based case. Methodologically, a Poisson model and two random effect models were fitted with a Bayesian inference method and compared, with the Deviance Information Criterion (DIC) utilized as the comparison criterion. The correlated random effect models outperformed the others. The results indicate that the weather condition variables, especially precipitation, play a key role in the crash occurrence models. The conclusions imply that different active traffic management strategies should be designed for different seasons, and that single-vehicle crashes have a different crash mechanism from multi-vehicle crashes.

  9. Genetic analysis of the age at menopause by using estimating equations and Bayesian random effects models.

    PubMed

    Do, K A; Broom, B M; Kuhnert, P; Duffy, D L; Todorov, A A; Treloar, S A; Martin, N G

    2000-05-15

    Multi-wave self-report data on age at menopause in 2182 female twin pairs (1355 monozygotic and 827 dizygotic pairs) were analysed to estimate the genetic, common environmental and unique environmental contributions to variation in age at menopause. Two complementary approaches for analysing correlated time-to-onset twin data are considered: the generalized estimating equations (GEE) method, in which one can estimate zygosity-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model, in which heterogeneity in regression parameters is explicitly modelled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov chain Monte Carlo method for covariate imputation of missing data was used and the actual implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. Copyright 2000 John Wiley & Sons, Ltd.

  10. Comparing multilevel and Bayesian spatial random effects survival models to assess geographical inequalities in colorectal cancer survival: a case study.

    PubMed

    Dasgupta, Paramita; Cramb, Susanna M; Aitken, Joanne F; Turrell, Gavin; Baade, Peter D

    2014-10-04

    Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However, the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival. Multilevel discrete-time and Bayesian spatial survival models were used to study geographical inequalities in all-cause survival for a population-based colorectal cancer cohort of 22,727 cases aged 20-84 years diagnosed during 1997-2007 from Queensland, Australia. Both approaches were viable on this large dataset, and produced similar estimates of the fixed effects. After adding area-level covariates, the between-area variability in survival using multilevel discrete-time models was no longer significant. Spatial inequalities in survival were also markedly reduced after adjusting for aggregated area-level covariates. Only the multilevel approach, however, provided an estimate of the contribution of geographical variation to the total variation in survival between individual patients. With little difference observed between the two approaches in the estimation of fixed effects, multilevel models should be favored if there is a clear hierarchical data structure and measuring the independent impact of individual- and area-level effects on survival differences is of primary interest. Bayesian spatial analyses may be preferred if spatial correlation between areas is important and if the priority is to assess small-area variations in survival and map spatial patterns. Both approaches can be readily fitted to geographically enabled survival data from international settings.

  11. Investigation of Hit-and-run Crash Occurrence and Severity Using Real-time Loop Detector Data and Hierarchical Bayesian Binary Logit Model with Random Effects.

    PubMed

    Xie, Meiquan; Cheng, Wen; Gill, Gurdiljot Singh; Zhou, Jiao; Jia, Xudong; Choi, Simon

    2017-08-24

    Most of the extensive research dedicated to identifying the influential factors of hit-and-run (HR) crashes has utilized typical maximum likelihood estimation binary Logit models, and none has employed real-time traffic data. To fill this gap, this study investigated the contributing factors of HR crashes, as well as the severity levels of HR. Four years of crash and real-time loop detector data were analyzed with hierarchical Bayesian models with random effects within a sequential Logit structure. Along with evaluating the impact of random effects on model fit and complexity, the prediction capability of the models was examined: stepwise incremental sensitivity and specificity were calculated, and the ROC (receiver operating characteristic) curve was used to graphically illustrate predictive performance. Among the real-time flow variables, the average occupancy and speed from the upstream detector were positively correlated with HR crash likelihood, while the average upstream speed and the speed difference between upstream and downstream were correlated with the occurrence of severe HR crashes. Apart from real-time factors, the other variables found influential for HR and severe HR crashes were segment length, adverse weather, dark lighting conditions with malfunctioning street lights, driving under the influence of alcohol, inner shoulder width, and night time. The findings point to the traffic conditions under which HR and severe HR crashes occur: relatively congested upstream traffic with high upstream speed and significant speed deviations on long segments. Traffic enforcement should therefore be directed towards mitigating risky driving under these conditions, and enforcement agencies may employ alcohol checkpoints to counter DUI at night. As per the engineering improvements, wider inner…
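
    The stepwise sensitivity/specificity and ROC summary used to judge predictive performance can be reproduced in a few lines; an illustrative computation with simulated scores (ties ignored), not the authors' code:

    ```python
    import numpy as np

    def roc_curve(scores, labels):
        """Stepwise true/false positive rates over all score thresholds."""
        order = np.argsort(-np.asarray(scores))
        labels = np.asarray(labels)[order]
        tpr = np.concatenate(([0.0], np.cumsum(labels) / labels.sum()))
        fpr = np.concatenate(([0.0], np.cumsum(1 - labels) / (1 - labels).sum()))
        return fpr, tpr

    rng = np.random.default_rng(5)
    y = rng.integers(0, 2, 200)                    # hypothetical HR crash indicator
    score = y * 0.8 + rng.normal(0.0, 0.6, 200)    # hypothetical model risk score
    fpr, tpr = roc_curve(score, y)
    auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)   # trapezoidal AUC
    print("AUC:", round(float(auc), 3))
    ```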

  12. Bayesian Estimation and Testing in Random Effects Meta-analysis of Rare Binary Adverse Events.

    PubMed

    Bai, Ou; Chen, Min; Wang, Xinlei

    Meta-analysis has been widely applied to rare adverse event data because it is very difficult to reliably detect the effect of a treatment on such events in an individual clinical study. However, it is known that standard meta-analysis methods are often biased, especially when the background incidence rate is very low. A recent work by Bhaumik et al. (2012) proposed new moment-based approaches under a natural random effects model, to improve estimation and testing of the treatment effect and the between-study heterogeneity parameter. It has been demonstrated that for rare binary events, their methods have superior performance to commonly-used meta-analysis methods. However, their comparison does not include any Bayesian methods, although Bayesian approaches are a natural and attractive choice under the random-effects model. In this paper, we study a Bayesian hierarchical approach to estimation and testing in meta-analysis of rare binary events using the random effects model in Bhaumik et al. (2012). We develop Bayesian estimators of the treatment effect and the heterogeneity parameter, as well as hypothesis testing methods based on Bayesian model selection procedures. We compare them with the existing methods through simulation. A data example is provided to illustrate the Bayesian approach as well.

  13. A non-parametric Bayesian model for joint cell clustering and cluster matching: identification of anomalous sample phenotypes with random effects.

    PubMed

    Dundar, Murat; Akova, Ferit; Yerebakan, Halid Z; Rajwa, Bartek

    2014-09-24

    Flow cytometry (FC)-based computer-aided diagnostics is an emerging technique utilizing modern multiparametric cytometry systems. The major difficulty in using machine-learning approaches for classification of FC data arises from limited access to a wide variety of anomalous samples for training. In consequence, any learning with an abundance of normal cases and a limited set of specific anomalous cases is biased towards the types of anomalies represented in the training set. Such models do not accurately identify anomalies, whether previously known or unknown, that may exist in future samples tested. Although one-class classifiers trained using only normal cases would avoid such a bias, robust sample characterization is critical for a generalizable model. Owing to sample heterogeneity and instrumental variability, arbitrary characterization of samples usually introduces feature noise that may lead to poor predictive performance. Herein, we present a non-parametric Bayesian algorithm called ASPIRE (anomalous sample phenotype identification with random effects) that identifies phenotypic differences across a batch of samples in the presence of random effects. Our approach involves simultaneous clustering of cellular measurements in individual samples and matching of discovered clusters across all samples in order to recover global clusters using probabilistic sampling techniques in a systematic way. We demonstrate the performance of the proposed method in identifying anomalous samples in two different FC data sets, one of which represents a set of samples including acute myeloid leukemia (AML) cases, and the other a generic 5-parameter peripheral-blood immunophenotyping. Results are evaluated in terms of the area under the receiver operating characteristics curve (AUC). ASPIRE achieved AUCs of 0.99 and 1.0 on the AML and generic blood immunophenotyping data sets, respectively. These results demonstrate that anomalous samples can be identified by ASPIRE with almost…

  14. Application of Poisson random effect models for highway network screening.

    PubMed

    Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer

    2014-02-01

    In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. The Potential for Safety Improvement (PSI) was adopted as the measure of crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests was conducted to evaluate the performance of the different approaches, including the previously developed site consistency test, method consistency test, total rank difference test, and modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only temporal random effects, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in fitting the crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification.
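
    PSI-style screening reduces to ranking sites by the gap between their model-based expected crash frequency and the frequency predicted for similar sites. A hedged numpy sketch (this follows the common EB-style definition; all numbers are invented):

    ```python
    import numpy as np

    def psi(expected_crashes, predicted_crashes):
        """Potential for Safety Improvement per site: model-based expected crash
        frequency (an EB or posterior estimate that borrows strength across sites)
        minus the frequency predicted by the safety performance function for
        similar sites. Large positive values flag hotspots."""
        return np.asarray(expected_crashes) - np.asarray(predicted_crashes)

    eb_estimate = np.array([4.2, 1.1, 7.9, 3.0, 2.4])     # expected counts per segment
    spf_predicted = np.array([3.0, 1.5, 5.0, 3.2, 2.0])   # typical for similar segments
    print("hotspot ranking (descending PSI):", np.argsort(-psi(eb_estimate, spf_predicted)))
    ```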

  15. Analyzing degradation data with a random effects spline regression model

    DOE PAGES

    Fugate, Michael Lynn; Hamada, Michael Scott; Weaver, Brian Phillip

    2017-03-17

    This study proposes a random effects spline regression model for analyzing degradation data. Spline regression avoids having to specify a parametric function for the true degradation of an item, and a distribution for the spline regression coefficients captures the variation of the true degradation curves from item to item. We illustrate the proposed methodology on a real example using a Bayesian approach, which allows prediction of the degradation of a population over time and makes estimation of reliability straightforward.
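
    To picture the model: a B-spline basis stands in for the parametric degradation function, and each item's coefficient vector is drawn around a common mean. The simulation sketch below uses per-item least squares in place of the paper's Bayesian fit; the knot placement and all values are our own.

    ```python
    import numpy as np
    from scipy.interpolate import BSpline

    def bspline_basis(x, knots, degree=3):
        """Evaluate a clamped B-spline basis at x (one column per basis function)."""
        t = np.concatenate(([knots[0]] * degree, knots, [knots[-1]] * degree))
        n_basis = len(t) - degree - 1
        return np.column_stack([BSpline(t, np.eye(n_basis)[j], degree)(x)
                                for j in range(n_basis)])

    rng = np.random.default_rng(3)
    time = np.linspace(0.0, 10.0, 25)
    B = bspline_basis(time, np.linspace(0.0, 10.0, 6))

    # Simulate 8 degradation paths: common mean coefficients + item-level deviations
    beta_mean = np.linspace(0.0, -3.0, B.shape[1])
    paths = np.array([B @ (beta_mean + rng.normal(0.0, 0.2, B.shape[1]))
                      + rng.normal(0.0, 0.05, time.size) for _ in range(8)])

    # Per-item spline coefficients; their spread is what a Bayesian fit would
    # model with a random-effects distribution
    coefs = np.linalg.lstsq(B, paths.T, rcond=None)[0].T
    print("item-to-item SD of spline coefficients:", coefs.std(axis=0).round(3))
    ```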

  16. Random effects in censored ordinal regression: latent structure and Bayesian approach.

    PubMed

    Xie, M; Simpson, D G; Carroll, R J

    2000-06-01

    This paper discusses random effects in censored ordinal regression and presents a Gibbs sampling approach to fit the regression model. A latent structure and its corresponding Bayesian formulation are introduced to effectively deal with heterogeneous and censored ordinal observations. This work is motivated by the need to analyze interval-censored ordinal data from multiple studies in toxicological risk assessment. Application of our methodology to the data offers further support to the conclusions developed earlier using GEE methods yet provides additional insight into the uncertainty levels of the risk estimates.

  17. Bayesian nonparametric regression analysis of data with random effects covariates from longitudinal measurements.

    PubMed

    Ryu, Duchwan; Li, Erning; Mallick, Bani K

    2011-06-01

    We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and, if violated, can cast doubt on inferences about the observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods, including cubic smoothing splines or P-splines, to accommodate the possible nonlinearity, and we use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and the within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and regression calibration, via simulations and an application that investigates the relationship between obesity in adulthood and childhood growth curves.

  18. Bayesian regression analysis of data with random effects covariates from nonlinear longitudinal measurements

    PubMed Central

    De la Cruz, Rolando; Meza, Cristian; Arribas-Gil, Ana; Carroll, Raymond J.

    2016-01-01

    Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories, whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess the association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals, and may help in understanding, for instance, the mechanisms of recovery from a certain disease or the efficacy of a given therapy. When a nonlinear mixed-effects model is used to fit the longitudinal trajectories, the existing estimation strategies based on likelihood approximations have been shown to exhibit computational efficiency problems (De la Cruz et al., 2011). In this article we consider a Bayesian estimation procedure for the joint model, with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. The proposed prior structure allows for the implementation of an MCMC sampler. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy, which can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification. PMID:27274601

  19. Modelling the random effects covariance matrix in longitudinal data.

    PubMed

    Daniels, Michael J; Zhao, Yan D

    2003-05-30

    A common class of models for longitudinal data are random effects (mixed) models. In these models, the random effects covariance matrix is typically assumed constant across subjects. However, in many situations this matrix may differ by measured covariates. In this paper, we propose an approach to model the random effects covariance matrix by using a special Cholesky decomposition of the matrix. In particular, we will allow the parameters that result from this decomposition to depend on subject-specific covariates and also explore ways to parsimoniously model these parameters. An advantage of this parameterization is that there is no concern about the positive definiteness of the resulting estimator of the covariance matrix. In addition, the parameters resulting from this decomposition have a sensible interpretation. We propose fully Bayesian modelling for which a simple Gibbs sampler can be implemented to sample from the posterior distribution of the parameters. We illustrate these models on data from depression studies and examine the impact of heterogeneity in the covariance matrix on estimation of both fixed and random effects.
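
    The payoff of the Cholesky-type parameterization is that unconstrained real parameters always map back to a positive-definite matrix. Below is a sketch of the reconstruction under our reading of the approach, with generalized autoregressive parameters and log innovation variances (which could each be regressed on subject covariates); the names and values are illustrative.

    ```python
    import numpy as np

    def cov_from_cholesky_params(phi, log_innov_var):
        """Rebuild a covariance matrix from modified-Cholesky parameters:
        T Sigma T' = D, with T unit lower-triangular holding -phi below the
        diagonal (row by row) and D = diag(exp(log_innov_var)). Any real phi
        and log variances yield a valid positive-definite Sigma."""
        q = len(log_innov_var)
        T = np.eye(q)
        idx = 0
        for j in range(1, q):
            for k in range(j):
                T[j, k] = -phi[idx]
                idx += 1
        T_inv = np.linalg.inv(T)
        return T_inv @ np.diag(np.exp(log_innov_var)) @ T_inv.T

    sigma = cov_from_cholesky_params(phi=[0.5, 0.2, 0.3], log_innov_var=[0.0, -0.1, -0.2])
    print(np.linalg.eigvalsh(sigma))   # all positive: positive definite by construction
    ```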

  20. Fixed and random effects selection in linear and logistic models.

    PubMed

    Kinney, Satkartar K; Dunson, David B

    2007-09-01

    We address the problem of selecting which variables should be included in the fixed and random components of logistic mixed effects models for correlated data. A fully Bayesian variable selection is implemented using a stochastic search Gibbs sampler to estimate the exact model-averaged posterior distribution. This approach automatically identifies subsets of predictors having nonzero fixed effect coefficients or nonzero random effects variance, while allowing uncertainty in the model selection process. Default priors are proposed for the variance components and an efficient parameter expansion Gibbs sampler is developed for posterior computation. The approach is illustrated using simulated data and an epidemiologic example.
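
    The flavor of a stochastic search Gibbs sampler is easiest to see in a linear-regression toy version with a point-mass spike and a normal slab, holding the error variance fixed. This is a simplification for illustration, not the paper's logistic mixed-model sampler with parameter expansion; the priors and data are invented.

    ```python
    import numpy as np

    def ssvs_gibbs(y, X, tau2=1.0, sigma2=1.0, prior_inc=0.5, n_iter=2000, seed=0):
        """Gibbs sampler for y = X beta + e, e ~ N(0, sigma2 I), with
        beta_j ~ (1 - gamma_j) * delta_0 + gamma_j * N(0, tau2)."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        beta = np.zeros(p)
        gamma = np.zeros(p, dtype=int)
        keep = np.zeros((n_iter, p))
        for it in range(n_iter):
            for j in range(p):
                r = y - X @ beta + X[:, j] * beta[j]   # residual excluding x_j
                s = X[:, j] @ X[:, j]
                m = X[:, j] @ r
                # Bayes factor for including x_j, with beta_j integrated out
                log_bf = (-0.5 * np.log1p(tau2 * s / sigma2)
                          + m ** 2 / (2.0 * sigma2 * (sigma2 / tau2 + s)))
                odds = prior_inc / (1.0 - prior_inc) * np.exp(log_bf)
                gamma[j] = rng.random() < odds / (1.0 + odds)
                if gamma[j]:
                    var = sigma2 / (s + sigma2 / tau2)
                    beta[j] = rng.normal(var * m / sigma2, np.sqrt(var))
                else:
                    beta[j] = 0.0
            keep[it] = gamma
        return keep.mean(axis=0)   # inclusion probabilities (no burn-in discarded)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 5))
    y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=100)
    print(np.round(ssvs_gibbs(y, X), 2))   # high inclusion for predictors 0 and 2
    ```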

  21. The Random-Effect DINA Model

    ERIC Educational Resources Information Center

    Huang, Hung-Yu; Wang, Wen-Chung

    2014-01-01

    The DINA (deterministic input, noisy, and gate) model has been widely used in cognitive diagnosis tests and in the process of test development. The outcomes known as slip and guess are included in the DINA model function representing the responses to the items. This study aimed to extend the DINA model by using the random-effect approach to allow…

  22. Random-effects models for longitudinal data

    SciTech Connect

    Laird, N.M.; Ware, J.H.

    1982-12-01

    Models for the analysis of longitudinal data must recognize the relationship between serial observations on the same unit. Multivariate models with general covariance structure are often difficult to apply to highly unbalanced data, whereas two-stage random-effects models can be used easily. In two-stage models, the probability distributions for the response vectors of different individuals belong to a single family, but some random-effects parameters vary across individuals, with a distribution specified at the second stage. A general family of models is discussed, which includes both growth models and repeated-measures models as special cases. A unified approach to fitting these models, based on a combination of empirical Bayes and maximum likelihood estimation of model parameters and using the EM algorithm, is discussed. Two examples are taken from a current epidemiological study of the health effects of air pollution.
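
    The EM iteration for the two-stage model alternates a GLS update of the fixed effects with posterior moments of the random effects. A compact numpy sketch of the maximum likelihood version (the paper also covers empirical Bayes and restricted variants, which this omits):

    ```python
    import numpy as np

    def laird_ware_em(y_list, X_list, Z_list, n_iter=100):
        """EM for y_i = X_i beta + Z_i b_i + e_i, b_i ~ N(0, D), e_i ~ N(0, s2 I)."""
        p, q = X_list[0].shape[1], Z_list[0].shape[1]
        D, s2 = np.eye(q), 1.0
        for _ in range(n_iter):
            # GLS update of beta given current (D, s2)
            A, c, W_list = np.zeros((p, p)), np.zeros(p), []
            for y, X, Z in zip(y_list, X_list, Z_list):
                W = np.linalg.inv(Z @ D @ Z.T + s2 * np.eye(len(y)))
                W_list.append(W)
                A += X.T @ W @ X
                c += X.T @ W @ y
            beta = np.linalg.solve(A, c)
            # E-step: posterior mean/variance of b_i; M-step: update D and s2
            D_new, rss, N = np.zeros((q, q)), 0.0, 0
            for (y, X, Z), W in zip(zip(y_list, X_list, Z_list), W_list):
                resid = y - X @ beta
                b_hat = D @ Z.T @ W @ resid
                D_new += np.outer(b_hat, b_hat) + D - D @ Z.T @ W @ Z @ D
                e_hat = resid - Z @ b_hat
                rss += e_hat @ e_hat + s2 * (len(y) - s2 * np.trace(W))
                N += len(y)
            D, s2 = D_new / len(y_list), rss / N
        return beta, D, s2
    ```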

  23. A Bayesian hierarchical method to account for random effects in cytogenetic dosimetry based on calibration curves.

    PubMed

    Mano, Shuhei; Suto, Yumiko

    2014-11-01

    The dicentric chromosome assay (DCA) is one of the most sensitive and reliable methods of inferring doses of radiation exposure in patients. In DCA, a calibration curve is prepared in advance by in vitro irradiation of blood samples from one, or sometimes multiple, healthy donors to account for possible inter-individual variability. Although the standard method has been demonstrated to be quite accurate for actual dose estimates, it cannot account for random effects, which arise from sources such as the blood donor used to prepare the calibration curve, the radiation-exposed patient, and the examiners. To date, it is unknown how these random effects impact the standard method of dose estimation. We propose a novel Bayesian hierarchical method that incorporates random effects into the dose estimation. To demonstrate dose estimation by the proposed method and to assess the impact of inter-individual variability in samples from multiple donors, peripheral blood samples from 13 occupationally non-exposed, non-smoking, healthy individuals were collected and irradiated with gamma rays. The results clearly showed significant inter-individual variability, and the standard method using a sample from a single donor gave an anti-conservative confidence interval for the irradiated dose. In contrast, the Bayesian credible interval calculated by the proposed method using samples from multiple donors properly covered the actual doses. Although the classical confidence interval of the calibration curve accounting for inter-individual variability in samples from multiple donors was roughly coincident with the Bayesian credible interval, the proposed method rests on sounder reasoning and has potential for extensions.

  24. An Evaluation of Information Criteria Use for Correct Cross-Classified Random Effects Model Selection

    ERIC Educational Resources Information Center

    Beretvas, S. Natasha; Murphy, Daniel L.

    2013-01-01

    The authors assessed correct model identification rates of Akaike's information criterion (AIC), corrected criterion (AICC), consistent AIC (CAIC), Hannon and Quinn's information criterion (HQIC), and Bayesian information criterion (BIC) for selecting among cross-classified random effects models. Performance of default values for the 5…

  25. Model Diagnostics for Bayesian Networks

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2006-01-01

    Bayesian networks are frequently used in educational assessments, primarily for learning about students' knowledge and skills. There is a lack of work on assessing the fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess the fit of simple Bayesian networks. A…

  26. A random effects epidemic-type aftershock sequence model.

    PubMed

    Lin, Feng-Chang

    2011-04-01

    We consider an extension of the temporal epidemic-type aftershock sequence (ETAS) model with random effects as a special case of a well-known doubly stochastic self-exciting point process. The new model arises from a deterministic function that is randomly scaled by a nonnegative random variable, which is unobservable but assumed to follow either positive stable or one-parameter gamma distribution with unit mean. Both random effects models are of interest although the one-parameter gamma random effects model is more popular when modeling associated survival times. Our estimation is based on the maximum likelihood approach with marginalized intensity. The methods are shown to perform well in simulation experiments. When applied to an earthquake sequence on the east coast of Taiwan, the extended model with positive stable random effects provides a better model fit, compared to the original ETAS model and the extended model with one-parameter gamma random effects.
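
    For reference, the temporal ETAS conditional intensity that the frailty scales can be written directly; a sketch with an invented mini-catalogue:

    ```python
    import numpy as np

    def etas_intensity(t, event_times, magnitudes, mu, K, alpha, c, p, m0=0.0):
        """lambda(t) = mu + sum over t_i < t of K*exp(alpha*(m_i - m0))*(t - t_i + c)^-p.
        The random effects extension multiplies this intensity by an unobserved
        nonnegative frailty (gamma or positive stable, unit mean)."""
        past = event_times < t
        gaps = t - event_times[past]
        return mu + np.sum(K * np.exp(alpha * (magnitudes[past] - m0)) * (gaps + c) ** (-p))

    times = np.array([0.5, 2.0, 2.1, 5.3])        # hypothetical event times (days)
    mags = np.array([4.5, 5.8, 4.2, 4.9])         # hypothetical magnitudes
    print(etas_intensity(6.0, times, mags, mu=0.1, K=0.05, alpha=1.2, c=0.01, p=1.3))
    ```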

  27. Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes

    PubMed Central

    2011-01-01

    Background: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods: We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized and ordinal, with center and/or trial as random effects, and with age, motor score, pupil reactivity or trial as covariates. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), the R package MCMCglmm and the SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using essentially two logistic random effects models, with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set, a proportional odds model with a random center effect was also fitted. Results: The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study, and when based on a relatively large number of level-1 (patient level) units compared to the number of level-2 (hospital level) units. However, when based on a relatively sparse data set, i.e., when the numbers of level-1 and level-2 units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability…

  28. Bayesian hierarchical modeling of drug stability data.

    PubMed

    Chen, Jie; Zhong, Jinglin; Nie, Lei

    2008-06-15

    Stability data are commonly analyzed using linear fixed or random effects models. The linear fixed effects model does not take into account batch-to-batch variation, whereas the random effects model may suffer from unreliable shelf-life estimates due to small sample sizes. Moreover, neither method utilizes any prior information that might be available. In this article, we propose a Bayesian hierarchical approach to modeling drug stability data. Under this hierarchical structure, we first use the Bayes factor to test the poolability of batches. Given the decision on poolability, we then estimate the shelf-life that applies to all batches. The approach is illustrated with two example data sets, and its performance is compared in simulation studies with that of the commonly used frequentist methods. (c) 2008 John Wiley & Sons, Ltd.
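
    To give a rough sense of the poolability test: compare a pooled-batches regression against a separate-batches one. The sketch below uses the familiar BIC approximation to the Bayes factor rather than the paper's hierarchical computation, and the inputs are hypothetical.

    ```python
    import numpy as np

    def bic_approx_bayes_factor(loglik_pooled, k_pooled, loglik_sep, k_sep, n):
        """Approximate BF(pooled vs separate) via 2*ln BF ~= BIC_sep - BIC_pooled."""
        bic_pooled = k_pooled * np.log(n) - 2.0 * loglik_pooled
        bic_sep = k_sep * np.log(n) - 2.0 * loglik_sep
        return np.exp((bic_sep - bic_pooled) / 2.0)

    # Hypothetical stability study: 3 batches, 36 assay results in total
    print(bic_approx_bayes_factor(-41.0, 2, -39.5, 6, n=36))  # > 1 favours pooling
    ```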

  29. A Gompertzian model with random effects to cervical cancer growth

    SciTech Connect

    Mazlan, Mazma Syahidatul Ayuni; Rosli, Norhayati

    2015-05-15

    In this paper, a Gompertzian model with random effects is introduced to describe cervical cancer growth. The parameter values of the mathematical model are estimated via maximum likelihood estimation. We apply the 4-stage stochastic Runge-Kutta (SRK4) scheme to solve the stochastic model numerically. The adequacy of the model is measured by comparing the simulated results with clinical data on cervical cancer growth. Low root mean-square error (RMSE) values for the Gompertzian model with random effects indicate a good fit.

  30. A Bayesian nonparametric meta-analysis model.

    PubMed

    Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G

    2015-03-01

    In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall effect size, such models may be adequate, but for prediction, they surely are not if the effect-size distribution exhibits non-normal behavior. To address this issue, we propose a Bayesian nonparametric meta-analysis model, which can describe a wider range of effect-size distributions, including unimodal symmetric distributions, as well as skewed and more multimodal distributions. We demonstrate our model through the analysis of real meta-analytic data arising from behavioral-genetic research. We compare the predictive performance of the Bayesian nonparametric model against various conventional and more modern normal fixed-effects and random-effects models. Copyright © 2014 John Wiley & Sons, Ltd.

  31. A random effects meta-analysis model with Box-Cox transformation.

    PubMed

    Yamaguchi, Yusuke; Maruo, Kazushi; Partlett, Christopher; Riley, Richard D

    2017-07-19

    In a random effects meta-analysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption, and misspecification of the random effects distribution may result in a misleading estimate of the overall mean treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval. We focus on problems caused by an inappropriate normality assumption on the random effects distribution, and propose a novel random effects meta-analysis model in which a Box-Cox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise the overall distribution of the observed treatment effect estimates, which combines the within-study sampling distributions and the random effects distribution. When sampling distributions are approximately normal, non-normality in the overall distribution will be mainly due to the random effects distribution, especially when the between-study variation is large relative to the within-study variation. The Box-Cox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the meta-analysis results by an overall median, an interquartile range and a prediction interval. The model can be applied to any kind of variable once the treatment effect estimate is defined from it. A simulation study suggested that when the overall distribution of treatment effect estimates is skewed, the overall mean and the conventional I² statistic from the normal random effects model can be inappropriate summaries, and the proposed model helps reduce this issue. We illustrated the proposed model using two examples, which revealed some important differences in summary results, heterogeneity measures and prediction intervals relative to the normal random effects model.
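
    The transformation step and the suggested summaries are straightforward with scipy. This toy ignores within-study sampling variances and the Bayesian estimation, and the effect estimates are invented:

    ```python
    import numpy as np
    from scipy import stats
    from scipy.special import inv_boxcox

    # Hypothetical right-skewed treatment effect estimates from 8 studies
    y = np.array([0.8, 1.1, 1.4, 2.0, 2.6, 3.9, 5.2, 8.7])

    # Box-Cox transform (positive data required); lambda chosen by maximum likelihood
    z, lam = stats.boxcox(y)

    # Summarise on the transformed scale, then back-transform the quantiles to get
    # an overall median and interquartile range on the original scale
    q25, q50, q75 = np.quantile(z, [0.25, 0.50, 0.75])
    print("lambda:", round(lam, 3))
    print("overall median:", round(inv_boxcox(q50, lam), 3))
    print("IQR:", round(inv_boxcox(q25, lam), 3), "-", round(inv_boxcox(q75, lam), 3))
    ```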

  32. Cure fraction model with random effects for regional variation in cancer survival.

    PubMed

    Seppä, Karri; Hakulinen, Timo; Kim, Hyon-Jung; Läärä, Esa

    2010-11-30

    Assessing regional differences in the survival of cancer patients is important but difficult when separate regions are small or sparsely populated. In this paper, we apply a mixture cure fraction model with random effects to cause-specific survival data of female breast cancer patients collected by the population-based Finnish Cancer Registry. Two sets of random effects were used to capture the regional variation in the cure fraction and in the survival of the non-cured patients, respectively. This hierarchical model was implemented in a Bayesian framework using a Metropolis-within-Gibbs algorithm. To avoid poor mixing of the Markov chain, when the variance of either set of random effects was close to zero, posterior simulations were based on a parameter-expanded model with tailor-made proposal distributions in Metropolis steps. The random effects allowed the fitting of the cure fraction model to the sparse regional data and the estimation of the regional variation in 10-year cause-specific breast cancer survival with a parsimonious number of parameters. Before 1986, the capital of Finland clearly stood out from the rest, but since then all the 21 hospital districts have achieved approximately the same level of survival.
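
    The mixture cure survival function itself is simple: a cured fraction that never experiences the event plus a non-cured fraction with an ordinary survival curve. A sketch assuming a Weibull form for the non-cured patients (in the paper, the regional random effects would shift the cure fraction and the non-cured survival):

    ```python
    import numpy as np

    def cure_mixture_survival(t, cure_frac, shape, scale):
        """S(t) = pi + (1 - pi) * S_u(t), with Weibull S_u for non-cured patients."""
        s_uncured = np.exp(-((t / scale) ** shape))
        return cure_frac + (1.0 - cure_frac) * s_uncured

    t = np.linspace(0.0, 10.0, 6)   # years since diagnosis (hypothetical)
    print(cure_mixture_survival(t, cure_frac=0.6, shape=1.2, scale=4.0))
    ```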

  33. Semiparametric Approach to a Random Effects Quantile Regression Model

    PubMed Central

    Kim, Mi-Ok; Yang, Yunwen

    2011-01-01

    We consider a random effects quantile regression analysis of clustered data and propose a semiparametric approach using empirical likelihood. The random regression coefficients are assumed independent with a common mean, following parametrically specified distributions. The common mean corresponds to the population-average effects of explanatory variables on the conditional quantile of interest, while the random coefficients represent cluster-specific deviations in the covariate effects. We formulate the estimation of the random coefficients as an estimating equations problem and use empirical likelihood to incorporate the parametric likelihood of the random coefficients. This yields a likelihood-like statistical criterion function, which we show is asymptotically concave in a neighborhood of the true parameter value, motivating its maximizer as a natural estimator. We use Markov chain Monte Carlo (MCMC) samplers in the Bayesian framework, and propose the resulting quasi-posterior mean as an estimator. We show that the proposed estimator of the population-level parameter is asymptotically normal and that the estimators of the random coefficients are shrunk toward the population-level parameter in the first-order asymptotic sense. These asymptotic results do not require Gaussian random effects, and the empirical-likelihood-based criterion function is free of parameters related to the error densities. This makes the proposed approach both flexible and computationally simple. We illustrate the methodology with two real data examples. PMID:22347760

  34. The Random-Effect Generalized Rating Scale Model

    ERIC Educational Resources Information Center

    Wang, Wen-Chung; Wu, Shiu-Lien

    2011-01-01

    Rating scale items have been widely used in educational and psychological tests. These items require people to make subjective judgments, and these subjective judgments usually involve randomness. To account for this randomness, Wang, Wilson, and Shih proposed the random-effect rating scale model in which the threshold parameters are treated as…

  35. Performance of Random Effects Model Estimators under Complex Sampling Designs

    ERIC Educational Resources Information Center

    Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan

    2011-01-01

    In this article, we consider estimation of parameters of random effects models from samples collected via complex multistage designs. Incorporation of sampling weights is one way to reduce estimation bias due to unequal probabilities of selection. Several weighting methods have been proposed in the literature for estimating the parameters of…

  36. Bayesian Model Selection for Group Studies

    PubMed Central

    Stephan, Klaas Enno; Penny, Will D.; Daunizeau, Jean; Moran, Rosalyn J.; Friston, Karl J.

    2009-01-01

    Bayesian model selection (BMS) is a powerful method for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. BMS has recently found widespread application in neuroimaging, particularly in the context of dynamic causal modelling (DCM). However, so far, combining BMS results from several subjects has relied on simple (fixed effects) metrics, e.g. the group Bayes factor (GBF), that do not account for group heterogeneity or outliers. In this paper, we compare the GBF with two random effects methods for BMS at the between-subject or group level. These methods provide inference on model space from a classical and a Bayesian perspective, respectively. First, a classical (frequentist) approach uses the log model evidence as a subject-specific summary statistic. This enables one to use analysis of variance to test for differences in log-evidences over models, relative to inter-subject differences. We then consider the same problem in Bayesian terms and describe a novel hierarchical model, which is optimised to furnish a probability density on the models themselves. This new variational Bayes method rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution which describes the probabilities for all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject, as well as the exceedance probability of one model being more likely than any other model. Using empirical and synthetic data, we show that optimising a conditional density of the model probabilities, given the log-evidences for each model over subjects, is more informative and appropriate than both the GBF and frequentist tests of the log-evidences. In particular, we found that the hierarchical Bayesian approach is considerably more robust than either of the other…
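
    Given the fitted Dirichlet parameters, the expected model frequencies and exceedance probabilities follow by simulation; a sketch with invented Dirichlet counts for three competing models:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    alpha = np.array([8.2, 3.1, 1.7])   # hypothetical posterior Dirichlet parameters

    draws = rng.dirichlet(alpha, size=100_000)   # model frequencies r
    expected_r = alpha / alpha.sum()             # posterior mean frequencies
    # Exceedance probability: chance model k is more frequent than all others
    exceedance = np.bincount(draws.argmax(axis=1), minlength=alpha.size) / len(draws)
    print("expected model frequencies:", np.round(expected_r, 3))
    print("exceedance probabilities:  ", np.round(exceedance, 3))
    ```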

  37. Bayesian analysis of CCDM models

    NASA Astrophysics Data System (ADS)

    Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.

    2017-09-01

    Creation of Cold Dark Matter (CCDM), in the context of the Einstein field equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Bayesian Evidence (BE). These criteria allow one to compare models considering goodness of fit and number of free parameters, penalizing excess complexity. We find that the JO model is slightly favoured over the LJO/ΛCDM model; however, neither of these, nor the Γ = 3αH0 model, can be discarded from the current analysis. Three other scenarios are discarded either because of poor fit or because of an excess of free parameters. A method of increasing the Bayesian evidence through reparameterization, in order to reduce parameter degeneracy, is also developed.
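
    The two information criteria have one-line definitions; a sketch with invented log-likelihoods and parameter counts (the Bayesian evidence, which requires integrating the likelihood over the prior, is omitted):

    ```python
    import numpy as np

    def aic(loglik, k):
        """Akaike information criterion: 2k - 2 ln L_max."""
        return 2.0 * k - 2.0 * loglik

    def bic(loglik, k, n):
        """Bayesian information criterion: k ln n - 2 ln L_max."""
        return k * np.log(n) - 2.0 * loglik

    # Hypothetical fits to n = 580 SNe Ia: (max log-likelihood, free parameters)
    fits = {"LCDM": (-560.2, 2), "JO": (-559.0, 3), "Gamma=3aH0": (-561.5, 2)}
    for name, (ll, k) in fits.items():
        print(f"{name:>10s}  AIC = {aic(ll, k):7.1f}  BIC = {bic(ll, k, 580):7.1f}")
    ```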

  38. Modeling of Academic Achievement of Primary School Students in Ethiopia Using Bayesian Multilevel Approach

    ERIC Educational Resources Information Center

    Sebro, Negusse Yohannes; Goshu, Ayele Taye

    2017-01-01

    This study aims to explore Bayesian multilevel modeling to investigate variations in the average academic achievement of grade eight school students. A sample of 636 students is randomly selected from 26 private and government schools by a two-stage stratified sampling design. A Bayesian method is used to estimate the fixed and random effects. Input and…

  39. Bayesian model reduction and empirical Bayes for group (DCM) studies.

    PubMed

    Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter

    2016-03-01

    This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Bayesian Model Averaging for Propensity Score Analysis

    ERIC Educational Resources Information Center

    Kaplan, David; Chen, Jianshen

    2013-01-01

    The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…

  8. Semiparametric Bayesian inference on skew-normal joint modeling of multivariate longitudinal and survival data.

    PubMed

    Tang, An-Min; Tang, Nian-Sheng

    2015-02-28

    We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies.

  9. Introduction to Bayesian modelling in dental research.

    PubMed

    Gilthorpe, M S; Maddick, I H; Petrie, A

    2000-12-01

    To explain the concepts and application of Bayesian modelling and how it can be applied to the analysis of dental research data. Methodological in nature, this article introduces Bayesian modelling through hypothetical dental examples. The synthesis of RCT results with previous evidence, including expert opinion, is used to illustrate full Bayesian modelling. Meta-analysis, in the form of empirical Bayesian modelling, is introduced. An example of full Bayesian modelling is described for the synthesis of evidence from several studies that investigate the success of root canal treatment. Hierarchical (Bayesian) modelling is demonstrated for a survey of childhood caries, where surface data is nested within subjects. Bayesian methods enhance interpretation of research evidence through the synthesis of information from multiple sources. Bayesian modelling is now readily accessible to clinical researchers and is able to augment the application of clinical decision making in the development of guidelines and clinical practice.

  10. Bayesian stable isotope mixing models

    EPA Science Inventory

    In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
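
    As a concrete illustration of the idea, the following minimal sketch fits a toy two-tracer, three-source mixing model by random-walk Metropolis. The source signatures, consumer data, priors, and tuning constants are invented and the model is far simpler than the SIMMs reviewed above.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Invented delta13C/delta15N signatures of three sources and a consumer
    sources = np.array([[-24.0, 4.0], [-18.0, 8.0], [-12.0, 12.0]])  # source means
    mix_obs = np.array([[-17.5, 8.4], [-18.2, 7.9], [-17.9, 8.1]])   # consumer tissues
    sigma = 0.5                                                      # residual sd, assumed known

    def log_post(z):
        p = np.exp(z - z.max()); p = p / p.sum()   # softmax: proportions on the simplex
        mu = p @ sources                           # predicted mixture signature
        loglik = -0.5 * np.sum((mix_obs - mu) ** 2) / sigma ** 2
        return loglik - 0.5 * np.sum(z ** 2)       # N(0,1) prior on unconstrained weights

    z, draws = np.zeros(3), []
    lp = log_post(z)
    for it in range(20000):                        # random-walk Metropolis
        z_prop = z + 0.3 * rng.standard_normal(3)
        lp_prop = log_post(z_prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            z, lp = z_prop, lp_prop
        if it >= 5000:
            p = np.exp(z - z.max()); draws.append(p / p.sum())

    print(np.mean(draws, axis=0))                  # posterior mean source proportions
    ```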

  11. Bayesian kinematic earthquake source models

    NASA Astrophysics Data System (ADS)

    Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.

    2009-12-01

    Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield a single solution that satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high…

  12. Random-effects models for meta-analytic structural equation modeling: review, issues, and illustrations.

    PubMed

    Cheung, Mike W-L; Cheung, Shu Fai

    2016-06-01

    Meta-analytic structural equation modeling (MASEM) combines the techniques of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Both fixed-effects and random-effects models can be defined in MASEM. Random-effects models are well known in conventional meta-analysis but are less studied in MASEM. The primary objective of this paper was to address issues related to random-effects models in MASEM. Specifically, we compared two different random-effects models in MASEM-correlation-based MASEM and parameter-based MASEM-and explored their strengths and limitations. Two examples were used to illustrate the similarities and differences between these models. We offered some practical guidelines for choosing between these two models. Future directions for research on random-effects models in MASEM were also discussed. Copyright © 2016 John Wiley & Sons, Ltd.
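
    The first stage of correlation-based MASEM pools correlations across studies under a random-effects model. The sketch below shows that pooling for a single correlation coefficient (Fisher z-transform plus DerSimonian-Laird heterogeneity), a deliberate element-wise simplification of the multivariate pooling the paper discusses; the input correlations and sample sizes are invented.

    ```python
    import numpy as np

    def pool_correlation(r, n):
        """Random-effects pooling of one correlation across studies
        (Fisher z + DerSimonian-Laird); an element-wise simplification
        of the first, pooling stage of correlation-based MASEM."""
        r, n = np.asarray(r, float), np.asarray(n, float)
        z = np.arctanh(r)                 # Fisher z-transform
        v = 1.0 / (n - 3.0)               # large-sample variance of z
        w = 1.0 / v
        z_fixed = np.sum(w * z) / np.sum(w)
        Q = np.sum(w * (z - z_fixed) ** 2)
        df = len(r) - 1
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (Q - df) / c)     # between-study heterogeneity
        w_re = 1.0 / (v + tau2)
        z_re = np.sum(w_re * z) / np.sum(w_re)
        se = np.sqrt(1.0 / np.sum(w_re))
        return np.tanh(z_re), tau2, se

    print(pool_correlation([0.30, 0.45, 0.25, 0.52], [120, 85, 200, 60]))
    ```

    In full MASEM, the second stage then fits the structural equation model to the pooled matrix; the pooling above only illustrates where the random effects enter.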

  13. Bayesian multi-scale modeling for aggregated disease mapping data.

    PubMed

    Aregay, Mehreteab; Lawson, Andrew B; Faes, Christel; Kirby, Russell S

    2015-09-29

    In disease mapping, a scale effect due to an aggregation of data from a finer resolution level to a coarser level is a common phenomenon. This article addresses this issue using a hierarchical Bayesian modeling framework. We propose four different multiscale models. The first two models use a shared random effect that the finer level inherits from the coarser level. The third model assumes two independent convolution models at the finer and coarser levels. The fourth model applies a convolution model at the finer level, but the relative risk at the coarser level is obtained by aggregating the estimates at the finer level. We compare the models using the deviance information criterion (DIC) and the Watanabe-Akaike information criterion (WAIC), applied to real and simulated data. The results indicate that the models with shared random effects outperform the other models on a range of criteria.

  14. Frequentist tests for Bayesian models

    NASA Astrophysics Data System (ADS)

    Lucy, L. B.

    2016-04-01

    Analogues of the frequentist chi-square and F tests are proposed for testing goodness-of-fit and consistency for Bayesian models. Simple examples exhibit these tests' detection of inconsistency between consecutive experiments with identical parameters, when the first experiment provides the prior for the second. In a related analysis, a quantitative measure is derived for judging the degree of tension between two different experiments with partially overlapping parameter vectors.

  15. Bayesian Model Averaging for Propensity Score Analysis.

    PubMed

    Kaplan, David; Chen, Jianshen

    2014-01-01

    This article considers Bayesian model averaging as a means of addressing uncertainty in the selection of variables in the propensity score equation. We investigate an approximate Bayesian model averaging approach based on the model-averaged propensity score estimates produced by the R package BMA, an approach that ignores uncertainty in the propensity score itself. We also provide a fully Bayesian model averaging approach via Markov chain Monte Carlo (MCMC) sampling to account for uncertainty in both parameters and models. A detailed study of our approach examines the differences in the causal estimate when incorporating noninformative versus informative priors in the model averaging stage. We examine these approaches under common methods of propensity score implementation. In addition, we evaluate the impact of changing the size of Occam's window used to narrow down the range of possible models. We also assess the predictive performance of both Bayesian model averaging propensity score approaches and compare it with the case without Bayesian model averaging. Overall, results show that both Bayesian model averaging propensity score approaches recover the treatment effect estimates well and generally provide larger uncertainty estimates, as expected. Both Bayesian model averaging approaches offer slightly better prediction of the propensity score compared with the Bayesian approach with a single propensity score equation. Covariate balance checks for the case study show that both Bayesian model averaging approaches offer good balance. The fully Bayesian model averaging approach also provides posterior probability intervals of the balance indices.

  16. Bayesian population receptive field modelling.

    PubMed

    Zeidman, Peter; Silson, Edward Harry; Schwarzkopf, Dietrich Samuel; Baker, Chris Ian; Penny, Will

    2017-09-08

    We introduce a probabilistic (Bayesian) framework and associated software toolbox for mapping population receptive fields (pRFs) based on fMRI data. This generic approach is intended to work with stimuli of any dimension and is demonstrated and validated in the context of 2D retinotopic mapping. The framework enables the experimenter to specify generative (encoding) models of fMRI timeseries, in which experimental stimuli enter a pRF model of neural activity, which in turn drives a nonlinear model of neurovascular coupling and Blood Oxygenation Level Dependent (BOLD) response. The neuronal and haemodynamic parameters are estimated together on a voxel-by-voxel or region-of-interest basis using a Bayesian estimation algorithm (variational Laplace). This offers several novel contributions to receptive field modelling. The variance/covariance of parameters is estimated, enabling receptive fields to be plotted while properly representing uncertainty about pRF size and location. Variability in the haemodynamic response across the brain is accounted for. Furthermore, the framework introduces formal hypothesis testing to pRF analysis, enabling competing models to be evaluated based on their log model evidence (approximated by the variational free energy), which represents the optimal tradeoff between accuracy and complexity. Using simulations and empirical data, we found that parameters typically used to represent pRF size and neuronal scaling are strongly correlated, which is taken into account by the Bayesian methods we describe when making inferences. We used the framework to compare the evidence for six variants of pRF model using 7 T functional MRI data and we found a circular Difference of Gaussians (DoG) model to be the best explanation for our data overall. We hope this framework will prove useful for mapping stimulus spaces with any number of dimensions onto the anatomy of the brain. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  17. Flexible Bayesian Human Fecundity Models.

    PubMed

    Kim, Sungduk; Sundaram, Rajeshwari; Buck Louis, Germaine M; Pyper, Cecilia

    2012-12-01

    Human fecundity is an issue of considerable interest for both epidemiological and clinical audiences, and is dependent upon a couple's biologic capacity for reproduction coupled with behaviors that place a couple at risk for pregnancy. Bayesian hierarchical models have been proposed to better model the conception probabilities by accounting for the acts of intercourse around the day of ovulation, i.e., during the fertile window. These models can be viewed in the framework of a generalized nonlinear model with an exponential link. However, a fixed choice of link function may not always provide the best fit, leading to potentially biased estimates for probability of conception. Motivated by this, we propose a general class of models for fecundity by relaxing the choice of the link function under the generalized nonlinear model framework. We use a sample from the Oxford Conception Study (OCS) to illustrate the utility and fit of this general class of models for estimating human conception. Our findings reinforce the need for attention to be paid to the choice of link function in modeling conception, as it may bias the estimation of conception probabilities. Various properties of the proposed models are examined and a Markov chain Monte Carlo sampling algorithm was developed for implementing the Bayesian computations. The deviance information criterion measure and logarithm of pseudo marginal likelihood are used for guiding the choice of links. The supplemental material section contains technical details of the proof of the theorem stated in the paper, and contains further simulation results and analysis.

  18. Bayesian Networks for Social Modeling

    SciTech Connect

    Whitney, Paul D.; White, Amanda M.; Walsh, Stephen J.; Dalton, Angela C.; Brothers, Alan J.

    2011-03-28

    This paper describes a body of work developed over the past five years. The work addresses the use of Bayesian network (BN) models for representing and predicting social/organizational behaviors. The topics covered include model construction, validation, and use. These topics span most of the lifetime of such a model, beginning with construction, moving to validation and other aspects of model ‘critiquing’, and finally demonstrating how the modeling approach might be used to inform policy analysis. To conclude, we discuss limitations of using BNs for this activity and suggest remedies to address those limitations. The primary benefits of using a well-developed computational, mathematical, and statistical modeling structure, such as BNs, are 1) that there are significant computational, theoretical, and capability bases on which to build, and 2) the ability to empirically critique the model, and potentially to evaluate competing models for a social/behavioral phenomenon.

  19. Random-effects models for serial observations with binary response

    SciTech Connect

    Stiratelli, R.; Laird, N.; Ware, J.H.

    1984-12-01

    This paper presents a general mixed model for the analysis of serial dichotomous responses provided by a panel of study participants. Each subject's serial responses are assumed to arise from a logistic model, but with regression coefficients that vary between subjects. The logistic regression parameters are assumed to be normally distributed in the population. Inference is based upon maximum likelihood estimation of fixed effects and variance components, and empirical Bayes estimation of random effects. Exact solutions are analytically and computationally infeasible, but an approximation based on the mode of the posterior distribution of the random parameters is proposed, and is implemented by means of the EM algorithm. This approximate method is compared with a simpler two-step method proposed by Korn and Whittemore, using data from a panel study of asthmatics originally described in that paper. One advantage of the estimation strategy described here is the ability to use all of the data, including that from subjects with insufficient data to permit fitting of a separate logistic regression model, as required by the Korn and Whittemore method. However, the new method is computationally intensive.
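
    A building block of the posterior-mode strategy described above is, for each subject, the mode of the random effect's posterior given that subject's serial binary responses. A minimal sketch, assuming a single random intercept with a normal prior; the data and parameter values below are illustrative, not from the asthma panel study.

    ```python
    import numpy as np
    from scipy.optimize import minimize_scalar

    def posterior_mode_intercept(y, x, beta, sigma_b):
        """Mode of p(b | y) for one subject under logit(P(y=1)) = x @ beta + b,
        with b ~ N(0, sigma_b^2): one subject-level step of a posterior-mode
        (EM-type) estimation scheme. Illustrative only."""
        eta_fixed = x @ beta
        def neg_log_post(b):
            eta = eta_fixed + b
            loglik = np.sum(y * eta - np.log1p(np.exp(eta)))   # Bernoulli log-lik
            return -(loglik - 0.5 * b ** 2 / sigma_b ** 2)     # plus normal prior
        return minimize_scalar(neg_log_post).x

    y = np.array([1, 0, 1, 1, 0, 1])                  # one subject's serial responses
    x = np.column_stack([np.ones(6), np.arange(6)])   # intercept and time
    print(posterior_mode_intercept(y, x, beta=np.array([-0.2, 0.1]), sigma_b=1.0))
    ```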

  1. A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects

    PubMed Central

    Li, Gang; Elashoff, Robert M.; Pan, Jianxin

    2011-01-01

    This article studies a general joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the competing risks survival data, and a regression sub-model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. The model provides a useful approach to adjust for non-ignorable missing data due to dropout for the longitudinal outcome, enables analysis of the survival outcome with informative censoring and intermittently measured time-dependent covariates, as well as joint analysis of the longitudinal and survival outcomes. Unlike previously studied joint models, our model allows for heterogeneous random covariance matrices. It also offers a framework to assess the homogeneous covariance assumption of existing joint models. A Bayesian MCMC procedure is developed for parameter estimation and inference. Its performances and frequentist properties are investigated using simulations. A real data example is used to illustrate the usefulness of the approach. PMID:20549344
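
    The modified Cholesky decomposition used above turns unconstrained parameters into a guaranteed positive-definite covariance matrix, which is what makes regressing covariance parameters on covariates workable. A minimal sketch of one common variant (Sigma = L D L'); names and values are illustrative.

    ```python
    import numpy as np

    def cov_from_modified_cholesky(phi, log_d):
        """Build a covariance matrix Sigma = L D L' from unconstrained parameters:
        phi fills the strict lower triangle of the unit lower-triangular L, and
        log_d gives the log of D's diagonal. Regressing (phi, log_d) on covariates
        yields heterogeneous covariance matrices, in the spirit described above."""
        q = len(log_d)
        L = np.eye(q)
        L[np.tril_indices(q, -1)] = phi
        D = np.diag(np.exp(log_d))
        return L @ D @ L.T

    Sigma = cov_from_modified_cholesky(phi=[0.5, -0.3, 0.2], log_d=[0.0, -0.5, -1.0])
    print(np.linalg.eigvalsh(Sigma))  # all positive: any phi, log_d give a valid Sigma
    ```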

  2. Modeling Diagnostic Assessments with Bayesian Networks

    ERIC Educational Resources Information Center

    Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego

    2007-01-01

    This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…

  3. Mixture Random-Effect IRT Models for Controlling Extreme Response Style on Rating Scales

    PubMed Central

    Huang, Hung-Yu

    2016-01-01

    Respondents are often requested to provide a response to Likert-type or rating-scale items during the assessment of attitude, interest, and personality to measure a variety of latent traits. Extreme response style (ERS), which is defined as a consistent and systematic tendency of a person to locate on a limited number of available rating-scale options, may distort the test validity. Several latent trait models have been proposed to address ERS, but all these models have limitations. Mixture random-effect item response theory (IRT) models for ERS are developed in this study to simultaneously identify the mixtures of latent classes from different ERS levels and detect the possible differential functioning items that result from different latent mixtures. The model parameters can be recovered fairly well in a series of simulations that use Bayesian estimation with the WinBUGS program. In addition, the model parameters in the developed models can be used to identify items that are likely to elicit ERS. The results show that a long test and large sample can improve the parameter estimation process; the precision of the parameter estimates increases with the number of response options, and the model parameter estimation outperforms the person parameter estimation. Ignoring the mixtures and ERS results in substantial rank-order changes in the target latent trait and a reduced classification accuracy of the response styles. An empirical survey of emotional intelligence in college students is presented to demonstrate the applications and implications of the new models. PMID:27853444

  4. A Bayesian nonlinear mixed-effects disease progression model.

    PubMed

    Kim, Seongho; Jang, Hyejeong; Wu, Dongfeng; Abrams, Judith

    2015-12-01

    A nonlinear mixed-effects approach is developed for disease progression models that incorporate variation in age in a Bayesian framework. We further generalize the probability model for sensitivity to depend on age at diagnosis, time spent in the preclinical state and sojourn time. The developed models are then applied to the Johns Hopkins Lung Project data and the Health Insurance Plan for Greater New York data using Bayesian Markov chain Monte Carlo and are compared with the estimation method that does not consider random-effects from age. Using the developed models, we obtain not only age-specific individual-level distributions, but also population-level distributions of sensitivity, sojourn time and transition probability.

  5. A Bayesian nonlinear mixed-effects disease progression model

    PubMed Central

    Kim, Seongho; Jang, Hyejeong; Wu, Dongfeng; Abrams, Judith

    2016-01-01

    A nonlinear mixed-effects approach is developed for disease progression models that incorporate variation in age in a Bayesian framework. We further generalize the probability model for sensitivity to depend on age at diagnosis, time spent in the preclinical state and sojourn time. The developed models are then applied to the Johns Hopkins Lung Project data and the Health Insurance Plan for Greater New York data using Bayesian Markov chain Monte Carlo and are compared with the estimation method that does not consider random-effects from age. Using the developed models, we obtain not only age-specific individual-level distributions, but also population-level distributions of sensitivity, sojourn time and transition probability. PMID:26798562

  6. Bayesian non parametric modelling of Higgs pair production

    NASA Astrophysics Data System (ADS)

    Scarpa, Bruno; Dorigo, Tommaso

    2017-03-01

    Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non-parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include inserting P-splines into the simple model to relate explanatory variables to the response, and using Bayesian additive regression trees (BART) to describe the atoms in the Dirichlet process.

  7. Bayesian Models of Individual Differences

    PubMed Central

    Powell, Georgie; Meredith, Zoe; McMillin, Rebecca; Freeman, Tom C. A.

    2016-01-01

    According to Bayesian models, perception and cognition depend on the optimal combination of noisy incoming evidence with prior knowledge of the world. Individual differences in perception should therefore be jointly determined by a person’s sensitivity to incoming evidence and his or her prior expectations. It has been proposed that individuals with autism have flatter prior distributions than do nonautistic individuals, which suggests that prior variance is linked to the degree of autistic traits in the general population. We tested this idea by studying how perceived speed changes during pursuit eye movement and at low contrast. We found that individual differences in these two motion phenomena were predicted by differences in thresholds and autistic traits when combined in a quantitative Bayesian model. Our findings therefore support the flatter-prior hypothesis and suggest that individual differences in prior expectations are more systematic than previously thought. In order to be revealed, however, individual differences in sensitivity must also be taken into account. PMID:27770059
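
    The core computation in such models is the precision-weighted combination of noisy evidence with a prior. A minimal sketch for perceived speed, assuming Gaussian likelihood and prior and a slow-speed prior centered at zero; all numbers are invented.

    ```python
    import numpy as np

    def posterior_speed(likelihood_mean, likelihood_sd, prior_mean, prior_sd):
        """Optimal (precision-weighted) combination of noisy sensory evidence
        with a prior over speed; noisier evidence pulls the percept toward
        the prior mean. Values are illustrative."""
        w = prior_sd ** 2 / (prior_sd ** 2 + likelihood_sd ** 2)
        post_mean = w * likelihood_mean + (1 - w) * prior_mean
        post_sd = np.sqrt(1.0 / (1.0 / prior_sd ** 2 + 1.0 / likelihood_sd ** 2))
        return post_mean, post_sd

    # Low contrast = noisier evidence = percept biased toward the slow prior
    print(posterior_speed(10.0, 1.0, 0.0, 5.0))   # high contrast: ~9.6
    print(posterior_speed(10.0, 4.0, 0.0, 5.0))   # low contrast:  ~6.1
    ```

    A flatter prior (larger prior_sd) leaves the percept closer to the sensory evidence, which is the pattern the flatter-prior hypothesis appeals to.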

  8. A Monte Carlo method for Bayesian inference in frailty models.

    PubMed

    Clayton, D G

    1991-06-01

    Many analyses in epidemiological and prognostic studies and in studies of event history data require methods that allow for unobserved covariates or "frailties." Clayton and Cuzick (1985, Journal of the Royal Statistical Society, Series A 148, 82-117) proposed a generalization of the proportional hazards model that implemented such random effects, but the proof of the asymptotic properties of the method remains elusive, and practical experience suggests that the likelihoods may be markedly nonquadratic. This paper sets out a Bayesian representation of the model in the spirit of Kalbfleisch (1978, Journal of the Royal Statistical Society, Series B 40, 214-221) and discusses inference using Monte Carlo methods.

  9. Modeling multivariate survival data by a semiparametric random effects proportional odds model.

    PubMed

    Lam, K F; Lee, Y W; Leung, T L

    2002-06-01

    In this article, the focus is on the analysis of multivariate survival time data with various types of dependence structures. Examples of multivariate survival data include clustered data and repeated measurements from the same subject, such as the interrecurrence times of cancer tumors. A random effect semiparametric proportional odds model is proposed as an alternative to the proportional hazards model. The distribution of the random effects is assumed to be multivariate normal and the random effect is assumed to act additively to the baseline log-odds function. This class of models, which includes the usual shared random effects model, the additive variance components model, and the dynamic random effects model as special cases, is highly flexible and is capable of modeling a wide range of multivariate survival data. A unified estimation procedure is proposed to estimate the regression and dependence parameters simultaneously by means of a marginal-likelihood approach. Unlike the fully parametric case, the regression parameter estimate is not sensitive to the choice of correlation structure of the random effects. The marginal likelihood is approximated by the Monte Carlo method. Simulation studies are carried out to investigate the performance of the proposed method. The proposed method is applied to two well-known data sets, including clustered data and recurrent event times data.
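
    The marginal-likelihood approach above integrates the conditional likelihood over the random-effects distribution by Monte Carlo. The sketch below shows the device for a simpler random-intercept logit rather than the paper's proportional odds model; everything named here is an illustrative assumption.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def mc_marginal_loglik(y, eta_fixed, sigma_b, n_draws=2000):
        """Monte Carlo approximation of one cluster's marginal log-likelihood,
        integrating a conditional (here: logistic) likelihood over normal
        random effects; the same device as the marginal-likelihood approach
        described above, shown for a random-intercept logit."""
        b = sigma_b * rng.standard_normal(n_draws)     # draws from N(0, sigma_b^2)
        eta = eta_fixed[None, :] + b[:, None]          # n_draws x n_obs
        loglik = np.sum(y * eta - np.log1p(np.exp(eta)), axis=1)
        m = loglik.max()                               # log-sum-exp for stability
        return m + np.log(np.mean(np.exp(loglik - m)))

    y = np.array([1, 1, 0, 1])
    print(mc_marginal_loglik(y, eta_fixed=np.array([0.2, 0.5, -0.1, 0.3]), sigma_b=1.5))
    ```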

  10. Properties of the Bayesian Knowledge Tracing Model

    ERIC Educational Resources Information Center

    van de Sande, Brett

    2013-01-01

    Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…

  11. Nonlinear mixed-effects models for pharmacokinetic data analysis: assessment of the random-effects distribution.

    PubMed

    Drikvandi, Reza

    2017-02-13

    Nonlinear mixed-effects models are frequently used for pharmacokinetic data analysis, and they account for inter-subject variability in pharmacokinetic parameters by incorporating subject-specific random effects into the model. The random effects are often assumed to follow a (multivariate) normal distribution. However, many articles have shown that misspecifying the random-effects distribution can introduce bias in the estimates of parameters and affect inferences about the random effects themselves, such as estimation of the inter-subject variability. Because random effects are unobservable latent variables, it is difficult to assess their distribution. In a recent paper we developed a diagnostic tool based on the so-called gradient function to assess the random-effects distribution in mixed models. There we evaluated the gradient function for generalized linear mixed models and in the presence of a single random effect. However, assessing the random-effects distribution in nonlinear mixed-effects models is more challenging, especially when multiple random effects are present, and therefore the results from linear and generalized linear mixed models may not be valid for such nonlinear models. In this paper, we further investigate the gradient function and evaluate its performance for such nonlinear mixed-effects models which are common in pharmacokinetics and pharmacodynamics. We use simulations as well as real data from an intensive pharmacokinetic study to illustrate the proposed diagnostic tool.

  12. Bayesian inference for OPC modeling

    NASA Astrophysics Data System (ADS)

    Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.

    2016-03-01

    The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
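
    The affine-invariant ensemble sampler has an open-source implementation in the emcee package, which can be used to reproduce the general workflow: walkers explore the posterior of model parameters under a Student's t likelihood. The abstract does not name the software, so treat that pairing as an assumption; the "lithographic model" below is a deliberately trivial linear stand-in, and all priors and constants are invented.

    ```python
    import numpy as np
    import emcee

    rng = np.random.default_rng(2)

    # Hypothetical stand-in for a lithographic model: CD as a function of dose
    def model(dose, theta):
        a, b = theta
        return a + b * dose

    dose = np.linspace(0.8, 1.2, 25)
    cd_obs = model(dose, (50.0, 20.0)) + 2.0 * rng.standard_t(df=4, size=dose.size)

    def log_prob(theta):
        a, b = theta
        if not (0 < a < 100 and 0 < b < 100):   # flat prior over a plausible box
            return -np.inf
        resid = cd_obs - model(dose, theta)
        nu, scale = 4.0, 2.0                    # Student's t noise, as in the text
        return np.sum(-0.5 * (nu + 1) * np.log1p((resid / scale) ** 2 / nu))

    ndim, nwalkers = 2, 16
    p0 = np.array([50.0, 20.0]) + 0.1 * rng.standard_normal((nwalkers, ndim))
    sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)  # affine-invariant ensemble
    sampler.run_mcmc(p0, 3000, progress=False)
    chain = sampler.get_chain(discard=1000, flat=True)
    print(chain.mean(axis=0), np.percentile(chain, [2.5, 97.5], axis=0))
    ```

    The percentile call at the end gives the kind of highest-density-style interval summary the paper uses to compare champion models, albeit computed here as simple quantiles.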

  13. Bayesian Calibration of Microsimulation Models.

    PubMed

    Rutter, Carolyn M; Miglioretti, Diana L; Savarino, James E

    2009-12-01

    Microsimulation models that describe disease processes synthesize information from multiple sources and can be used to estimate the effects of screening and treatment on cancer incidence and mortality at a population level. These models are characterized by simulation of individual event histories for an idealized population of interest. Microsimulation models are complex and invariably include parameters that are not well informed by existing data. Therefore, a key component of model development is the choice of parameter values. Microsimulation model parameter values are selected to reproduce expected or known results through the process of model calibration. Calibration may be done by perturbing model parameters one at a time or by using a search algorithm. As an alternative, we propose a Bayesian method to calibrate microsimulation models that uses Markov chain Monte Carlo. We show that this approach converges to the target distribution and use a simulation study to demonstrate its finite-sample performance. Although computationally intensive, this approach has several advantages over previously proposed methods, including the use of statistical criteria to select parameter values, simultaneous calibration of multiple parameters to multiple data sources, incorporation of information via prior distributions, description of parameter identifiability, and the ability to obtain interval estimates of model parameters. We develop a microsimulation model for colorectal cancer and use our proposed method to calibrate model parameters. The microsimulation model provides a good fit to the calibration data. We find evidence that some parameters are identified primarily through prior distributions. Our results underscore the need to incorporate multiple sources of variability (i.e., due to calibration data, unknown parameters, and estimated parameters and predicted values) when calibrating and applying microsimulation models.
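
    In outline, Bayesian calibration treats the microsimulation's unknown inputs as parameters with priors and samples their posterior given calibration targets. The sketch below uses a Metropolis random walk with a closed-form binomial likelihood standing in for running the simulator at each proposal; the targets, prior, and tuning constants are invented.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    # Hypothetical calibration target: 310 events observed among 5,000 people
    target_events, target_n = 310, 5000

    def log_post(logit_risk):
        # In a real microsimulation, the predicted rate would come from running
        # the simulator at the proposed parameter value; a closed form stands in
        # here so the sketch is self-contained.
        risk = 1.0 / (1.0 + np.exp(-logit_risk))
        loglik = (target_events * np.log(risk)
                  + (target_n - target_events) * np.log1p(-risk))
        return loglik - 0.5 * (logit_risk + 2.5) ** 2   # prior on the logit scale

    theta, keep = -2.5, []
    lp = log_post(theta)
    for it in range(20000):                             # Metropolis random walk
        prop = theta + 0.1 * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if it >= 4000:
            keep.append(1.0 / (1.0 + np.exp(-theta)))

    print(np.mean(keep), np.percentile(keep, [2.5, 97.5]))   # interval estimate
    ```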

  14. Bayesian model selection and isocurvature perturbations

    NASA Astrophysics Data System (ADS)

    Beltrán, María; García-Bellido, Juan; Lesgourgues, Julien; Liddle, Andrew R.; Slosar, Anže

    2005-03-01

    Present cosmological data are well explained assuming purely adiabatic perturbations, but an admixture of isocurvature perturbations is also permitted. We use a Bayesian framework to compare the performance of cosmological models including isocurvature modes with the purely adiabatic case; this framework automatically and consistently penalizes models which use more parameters to fit the data. We compute the Bayesian evidence for fits to a data set comprised of WMAP and other microwave anisotropy data, the galaxy power spectrum from 2dFGRS and SDSS, and Type Ia supernovae luminosity distances. We find that Bayesian model selection favors the purely adiabatic models, but so far only at low significance.

  15. Random-Effects Models for Meta-Analytic Structural Equation Modeling: Review, Issues, and Illustrations

    ERIC Educational Resources Information Center

    Cheung, Mike W.-L.; Cheung, Shu Fai

    2016-01-01

    Meta-analytic structural equation modeling (MASEM) combines the techniques of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Both fixed-effects and random-effects models can be defined in MASEM.…

  16. Sparse Bayesian infinite factor models

    PubMed Central

    Bhattacharya, A.; Dunson, D. B.

    2011-01-01

    We focus on sparse modelling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases. We use our prior on a parameter-expanded loading matrix to avoid the order dependence typical in factor analysis models and develop an efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loading matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival times from gene expression data. PMID:23049129
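
    The multiplicative gamma process prior is easy to simulate, which makes its column-wise shrinkage visible directly. A minimal sketch; the hyperparameter values are illustrative, and the local, element-wise scales of the full prior are omitted.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    def sample_loadings(p, H, a1=2.0, a2=3.0):
        """Draw a p x H loadings matrix from a simplified multiplicative gamma
        process prior: column precisions tau_h are cumulative products of gamma
        variables, so later columns are increasingly shrunk toward zero."""
        delta = np.concatenate([rng.gamma(a1, 1.0, 1), rng.gamma(a2, 1.0, H - 1)])
        tau = np.cumprod(delta)                        # stochastically increasing precision
        Lambda = rng.standard_normal((p, H)) / np.sqrt(tau)[None, :]
        return Lambda, tau

    Lambda, tau = sample_loadings(p=1000, H=15)
    print(np.round(Lambda.std(axis=0), 3))   # column scales decay with the index
    ```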

  17. Random effects coefficient of determination for mixed and meta-analysis models.

    PubMed

    Demidenko, Eugene; Sargent, James; Onega, Tracy

    2012-01-01

    The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If the random effects coefficient of determination is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. A value apart from 0 indicates evidence of the variance reduction in support of the mixed model. If the random effects coefficient of determination is close to 1, the variance of the random effects is very large and the random effects turn into free fixed effects; the model can then be estimated using the dummy variable approach. We derive explicit formulas for the coefficient in three special cases: the random intercept model, the growth curve model, and the meta-analysis model. Theoretical results are illustrated with three mixed model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol-related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for the combination of 13 studies on tuberculosis vaccine.

  1. Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models.

    PubMed

    Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A; Burgueño, Juan; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo

    2017-01-05

    The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance-covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have higher prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. Copyright © 2017 Cuevas et al.

  2. Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models

    PubMed Central

    Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo

    2016-01-01

    The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have higher prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
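
    The two kernels compared above are straightforward to build from a marker matrix, and the G × E covariance arises as a Kronecker product with the between-environment genetic correlation matrix. A simplified sketch with toy marker codes; it uses a common median bandwidth heuristic rather than the paper's tuning and omits allele-frequency scaling.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    X = rng.integers(0, 3, size=(100, 500)).astype(float)   # toy 0/1/2 marker codes
    Xc = X - X.mean(axis=0)                                 # column-centered markers

    # Linear (GBLUP-style) genomic relationship kernel
    G = Xc @ Xc.T / Xc.shape[1]

    # Gaussian kernel: GK[i, j] = exp(-theta * d_ij^2), d = Euclidean marker distance
    sq = np.sum(Xc ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * Xc @ Xc.T, 0.0)
    theta = 1.0 / np.median(D2[np.triu_indices_from(D2, 1)])  # median bandwidth heuristic
    GK = np.exp(-theta * D2)

    # Genetic covariance across two environments via a Kronecker product
    E = np.array([[1.0, 0.5], [0.5, 1.0]])   # between-environment genetic correlation
    K_GxE = np.kron(E, G)
    print(G.shape, GK.shape, K_GxE.shape)
    ```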

  3. Bayesian Methods for High Dimensional Linear Models

    PubMed Central

    Mallick, Himel; Yi, Nengjun

    2013-01-01

    In this article, we present a selective overview of some recent developments in Bayesian model and variable selection methods for high dimensional linear models. While most of the reviews in literature are based on conventional methods, we focus on recently developed methods, which have proven to be successful in dealing with high dimensional variable selection. First, we give a brief overview of the traditional model selection methods (viz. Mallows' Cp, AIC, BIC, DIC), followed by a discussion on some recently developed methods (viz. EBIC, regularization), which have occupied the minds of many statisticians. Then, we review high dimensional Bayesian methods with a particular emphasis on Bayesian regularization methods, which have been used extensively in recent years. We conclude by briefly addressing the asymptotic behaviors of Bayesian variable selection methods for high dimensional linear models under different regularity conditions. PMID:24511433

  4. Bayesian Modeling of a Human MMORPG Player

    NASA Astrophysics Data System (ADS)

    Synnaeve, Gabriel; Bessière, Pierre

    2011-03-01

    This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and which target to select in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.

  5. A Bayesian, generalized frailty model for comet assays.

    PubMed

    Ghebretinsae, Aklilu Habteab; Faes, Christel; Molenberghs, Geert; De Boeck, Marlies; Geys, Helena

    2013-05-01

    This paper proposes a flexible modeling approach for so-called comet assay data regularly encountered in preclinical research. While such data consist of non-Gaussian outcomes in a multilevel hierarchical structure, traditional analyses typically completely or partly ignore this hierarchical nature by summarizing measurements within a cluster. Non-Gaussian outcomes are often modeled using exponential family models. This is true not only for binary and count data, but also, for example, for time-to-event outcomes. Two important reasons for extending this family are (1) the possible occurrence of overdispersion, meaning that the variability in the data may not be adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of a hierarchical structure in the data, owing to clustering in the data. The first issue is dealt with through so-called overdispersion models. Clustering is often accommodated through the inclusion of random subject-specific effects. One conventionally, though not always, assumes such random effects to be normally distributed. In the case of time-to-event data, one encounters, for example, the gamma frailty model (Duchateau and Janssen, 2007). While both of these issues may occur simultaneously, models combining both are uncommon. Molenberghs et al. (2010) proposed a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. Here, we use this method to model data from a comet assay with a three-level hierarchical structure. Although a conjugate gamma random effect is used for the overdispersion random effect, both gamma and normal random effects are considered for the hierarchical random effect. Apart from model formulation, we place emphasis on Bayesian estimation. Our proposed method has an upper hand over the traditional analysis in that it (1) uses the appropriate distribution stipulated in the literature; (2) deals…

  6. A likelihood reformulation method in non-normal random effects models.

    PubMed

    Liu, Lei; Yu, Zhangsheng

    2008-07-20

    In this paper, we propose a practical computational method to obtain the maximum likelihood estimates (MLE) for mixed models with non-normal random effects. By simply multiplying and dividing a standard normal density, we reformulate the likelihood conditional on the non-normal random effects to that conditional on the normal random effects. Gaussian quadrature technique, conveniently implemented in SAS Proc NLMIXED, can then be used to carry out the estimation process. Our method substantially reduces computational time, while yielding similar estimates to the probability integral transformation method (J. Comput. Graphical Stat. 2006; 15:39-57). Furthermore, our method can be applied to more general situations, e.g. finite mixture random effects or correlated random effects from Clayton copula. Simulations and applications are presented to illustrate our method.
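
    The payoff of the reformulation is that ordinary Gauss-Hermite quadrature, as implemented in tools like SAS Proc NLMIXED, applies unchanged. The sketch below shows the quadrature approximation of one cluster's marginal log-likelihood for a random-intercept logit with normal random effects; data and parameter values are invented.

    ```python
    import numpy as np

    def gh_marginal_loglik(y, eta_fixed, sigma_b, n_nodes=20):
        """Gauss-Hermite approximation to a cluster's marginal log-likelihood
        under a random-intercept logit with b ~ N(0, sigma_b^2): the quadrature
        machinery that the reformulation above makes available even when the
        random effects are not normal."""
        nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
        b = np.sqrt(2.0) * sigma_b * nodes               # change of variables
        eta = eta_fixed[None, :] + b[:, None]
        loglik = np.sum(y * eta - np.log1p(np.exp(eta)), axis=1)
        return np.log(np.sum(weights * np.exp(loglik)) / np.sqrt(np.pi))

    y = np.array([1, 0, 1, 1])
    print(gh_marginal_loglik(y, eta_fixed=np.array([0.2, -0.1, 0.4, 0.0]), sigma_b=1.2))
    ```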

  7. Modeling zero-inflated count data using a covariate-dependent random effect model.

    PubMed

    Wong, Kin-Yau; Lam, K F

    2013-04-15

    In various medical studies, excessive zeros, which make the standard Poisson regression model inadequate, often exist in count data. We propose a covariate-dependent random effect model to accommodate the excess zeros and the heterogeneity in the population simultaneously. This work is motivated by a data set from a survey on the dental health status of Hong Kong preschool children where the response variable is the number of decayed, missing, or filled teeth. The random effect has a sound biological interpretation as the overall oral health status or other personal qualities of an individual child that are unobserved and cannot easily be quantified. The overall measure of oral health status, responsible for accommodating the excessive zeros and also the heterogeneity among the children, is covariate dependent. This covariate-dependent random effect model allows one to distinguish whether a potential covariate has an effect on the conceived overall oral health condition of the children, that is, the random effect, or has a direct effect on the magnitude of the counts, or both. We propose a multiple imputation approach for estimation of the parameters and discuss the choice of the imputation size. We evaluate the performance of the proposed estimation method through simulation studies, and we apply the model and method to the dental data.

  8. Random effects logistic models for analysing efficacy of a longitudinal randomized treatment with non-adherence.

    PubMed

    Small, Dylan S; Ten Have, Thomas R; Joffe, Marshall M; Cheng, Jing

    2006-06-30

    We present a random effects logistic approach for estimating the efficacy of treatment for compliers in a randomized trial with treatment non-adherence and longitudinal binary outcomes. We use our approach to analyse a primary care depression intervention trial. The use of a random effects model to estimate efficacy supplements intent-to-treat longitudinal analyses based on random effects logistic models that are commonly used in primary care depression research. Our estimation approach is an extension of Nagelkerke et al.'s instrumental variables approximation for cross-sectional binary outcomes. Our approach is easily implementable with standard random effects logistic regression software. We show through a simulation study that our approach provides reasonably accurate inferences for the setting of the depression trial under model assumptions. We also evaluate the sensitivity of our approach to model assumptions for the depression trial. Copyright (c) 2005 John Wiley & Sons, Ltd.

  9. Bayesian Data-Model Fit Assessment for Structural Equation Modeling

    ERIC Educational Resources Information Center

    Levy, Roy

    2011-01-01

    Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…

  10. Bayesian modeling of unknown diseases for biosurveillance.

    PubMed

    Shen, Yanna; Cooper, Gregory F

    2009-11-14

    This paper investigates Bayesian modeling of unknown causes of events in the context of disease-outbreak detection. We introduce a Bayesian approach that models and detects both (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A key contribution of this paper is that it introduces a Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has broad applicability in medical informatics, where the space of known causes of outcomes of interest is seldom complete.

  11. Current Challenges in Bayesian Model Choice

    NASA Astrophysics Data System (ADS)

    Clyde, M. A.; Berger, J. O.; Bullard, F.; Ford, E. B.; Jefferys, W. H.; Luo, R.; Paulo, R.; Loredo, T.

    2007-11-01

    Model selection (and the related issue of model uncertainty) arises in many astronomical problems, and, in particular, has been one of the focal areas of the Exoplanet working group under the SAMSI (Statistics and Applied Mathematical Sciences Institute) Astrostatistics Exoplanet program. We provide an overview of the Bayesian approach to model selection and highlight the challenges involved in implementing Bayesian model choice in four stylized problems. We review some of the current methods used by statisticians and astronomers and present recent developments in the area. We discuss the applicability, computational challenges, and performance of suggested methods and conclude with recommendations and open questions.

  12. An Integrated Bayesian Model for DIF Analysis

    ERIC Educational Resources Information Center

    Soares, Tufi M.; Goncalves, Flavio B.; Gamerman, Dani

    2009-01-01

    In this article, an integrated Bayesian model for differential item functioning (DIF) analysis is proposed. The model is integrated in the sense of modeling the responses along with the DIF analysis. This approach allows DIF detection and explanation in a simultaneous setup. Previous empirical studies and/or subjective beliefs about the item…

  13. Posterior Predictive Model Checking in Bayesian Networks

    ERIC Educational Resources Information Center

    Crawford, Aaron

    2014-01-01

    This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
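
    In outline, PPMC draws replicated data sets from the posterior predictive distribution, computes a discrepancy measure on each, and compares that distribution with the observed discrepancy via a posterior predictive p-value. A minimal sketch with a made-up discrepancy and stand-in posterior draws; a real application would use MCMC output from the fitted Bayesian network model.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    # Observed binary item responses (toy) and stand-in posterior draws of item probabilities
    y_obs = rng.integers(0, 2, size=(200, 10))
    post_p = rng.beta(2, 2, size=(500, 10))        # placeholder for real MCMC draws

    def discrepancy(y):
        return y.mean(axis=0).var()                # spread of item proportions correct

    ppp_parts = []
    for p in post_p:                               # one replicated data set per draw
        y_rep = (rng.uniform(size=y_obs.shape) < p).astype(int)
        ppp_parts.append(discrepancy(y_rep) >= discrepancy(y_obs))
    print("posterior predictive p-value:", np.mean(ppp_parts))
    ```

    Values near 0 or 1 flag data-model misfit in the aspect captured by the chosen discrepancy; the study above compares how well different discrepancies detect different kinds of misfit.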

  14. Estimating anatomical trajectories with Bayesian mixed-effects modeling.

    PubMed

    Ziegler, G; Penny, W D; Ridgway, G R; Ourselin, S; Friston, K J

    2015-11-01

    We introduce a mass-univariate framework for the analysis of whole-brain structural trajectories using longitudinal Voxel-Based Morphometry data and Bayesian inference. Our approach to developmental and aging longitudinal studies characterizes heterogeneous structural growth/decline between and within groups. In particular, we propose a probabilistic generative model that parameterizes individual and ensemble average changes in brain structure using linear mixed-effects models of age and subject-specific covariates. Model inversion uses Expectation Maximization (EM), while voxelwise (empirical) priors on the size of individual differences are estimated from the data. Bayesian inference on individual and group trajectories is realized using Posterior Probability Maps (PPM). In addition to parameter inference, the framework affords comparisons of models with varying combinations of model order for fixed and random effects using model evidence. We validate the model in simulations and real MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) project. We further demonstrate how subject specific characteristics contribute to individual differences in longitudinal volume changes in healthy subjects, Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). Copyright © 2015. Published by Elsevier Inc.

  18. Bayesian modeling of flexible cognitive control

    PubMed Central

    Jiang, Jiefeng; Heller, Katherine; Egner, Tobias

    2014-01-01

    “Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218

  20. A Bayesian Nonparametric Meta-Analysis Model

    ERIC Educational Resources Information Center

    Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.

    2015-01-01

    In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
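
    The conventional normal random-effects model that this article takes as its point of departure can be written as

      \[
        y_i \mid \theta_i \sim N(\theta_i, \sigma_i^2), \qquad \theta_i \sim N(\mu, \tau^2),
      \]

    where y_i is the observed effect size of study i with known sampling variance σ_i², μ is the overall mean effect, and τ² is the between-study variance; the fixed-effect model is the special case τ² = 0.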

  2. Cross-Classified Random Effects Models in Institutional Research

    ERIC Educational Resources Information Center

    Meyers, Laura E.

    2012-01-01

    Multilevel modeling offers researchers a rich array of tools that can be used for a variety of purposes, such as analyzing specific institutional issues, looking for macro-level trends, and helping to shape and inform educational policy. One of the more complex multilevel modeling tools available to institutional researchers is cross-classified…
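
    A minimal two-way cross-classified random effects model (a generic sketch for orientation, not the article's specific application) is

      \[
        y_{i(jk)} = x_{i(jk)}^{\top}\beta + u_j + v_k + e_{i(jk)},
        \qquad u_j \sim N(0, \sigma_u^2), \quad v_k \sim N(0, \sigma_v^2),
      \]

    where observation i is nested within the crossing of two non-nested grouping factors j and k, each contributing its own random effect.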

  4. Heterogeneous Factor Analysis Models: A Bayesian Approach.

    ERIC Educational Resources Information Center

    Ansari, Asim; Jedidi, Kamel; Dube, Laurette

    2002-01-01

    Developed Markov Chain Monte Carlo procedures to perform Bayesian inference, model checking, and model comparison in heterogeneous factor analysis. Tested the approach with synthetic data and data from a consumption emotion study involving 54 consumers. Results show that traditional psychometric methods cannot fully capture the heterogeneity in…

  5. A spatial error model with continuous random effects and an application to growth convergence

    NASA Astrophysics Data System (ADS)

    Laurini, Márcio Poletti

    2017-10-01

    We propose a spatial error model with continuous random effects based on Matérn covariance functions and apply this model for the analysis of income convergence processes (β-convergence). The use of a model with continuous random effects permits a clearer visualization and interpretation of the spatial dependency patterns, avoids the problems of defining neighborhoods in spatial econometrics models, and allows projecting the spatial effects for every possible location in the continuous space, circumventing the existing aggregations in discrete lattice representations. We apply this model approach to analyze the economic growth of Brazilian municipalities between 1991 and 2010 using unconditional and conditional formulations and a spatiotemporal model of convergence. The results indicate that the estimated spatial random effects are consistent with the existence of income convergence clubs for Brazilian municipalities in this period.
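
    The Matérn covariance function on which the continuous random effects are based has the standard form

      \[
        C(d) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)}
        \left( \frac{\sqrt{2\nu}\, d}{\rho} \right)^{\!\nu}
        K_{\nu}\!\left( \frac{\sqrt{2\nu}\, d}{\rho} \right),
      \]

    where d is the distance between two locations, ρ is a range parameter, ν controls smoothness, and K_ν is the modified Bessel function of the second kind.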

  7. A Mixture Proportional Hazards Model with Random Effects for Response Times in Tests

    ERIC Educational Resources Information Center

    Ranger, Jochen; Kuhn, Jörg-Tobias

    2016-01-01

    In this article, a new model for test response times is proposed that combines latent class analysis and the proportional hazards model with random effects in a similar vein as the mixture factor model. The model assumes the existence of different latent classes. In each latent class, the response times are distributed according to a…
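
    Although the abstract is truncated, the model class it names can be sketched as a latent-class proportional hazards model with a person-level random effect (schematic notation, not necessarily the authors' exact parameterization):

      \[
        h(t \mid c_i = c,\, u_i) = h_{0c}(t)\, \exp\!\left( x_i^{\top}\beta_c + u_i \right),
      \]

    where c_i is the latent class of test taker i, h_{0c} is a class-specific baseline hazard for the response times, and the random effect u_i induces dependence across items answered by the same person.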

  9. Survey of Bayesian Models for Modelling of Stochastic Temporal Processes

    SciTech Connect

    Ng, B

    2006-10-12

    This survey gives an overview of popular generative models used in the modeling of stochastic temporal systems. In particular, this survey is organized into two parts. The first part discusses the discrete-time representations of dynamic Bayesian networks and dynamic relational probabilistic models, while the second part discusses the continuous-time representation of continuous-time Bayesian networks.

  10. SEMIPARAMETRIC TRANSFORMATION MODELS WITH RANDOM EFFECTS FOR CLUSTERED FAILURE TIME DATA

    PubMed Central

    Zeng, Donglin; Lin, D. Y.; Lin, Xihong

    2009-01-01

    We propose a general class of semiparametric transformation models with random effects to formulate the effects of possibly time-dependent covariates on clustered or correlated failure times. This class encompasses all commonly used transformation models, including proportional hazards and proportional odds models, and it accommodates a variety of random-effects distributions, particularly Gaussian distributions. We show that the nonparametric maximum likelihood estimators of the model parameters are consistent, asymptotically normal and asymptotically efficient. We develop the corresponding likelihood-based inference procedures. Simulation studies demonstrate that the proposed methods perform well in practical situations. An illustration with a well-known diabetic retinopathy study is provided. PMID:19809573

  11. Bayesian modeling in virtual high throughput screening.

    PubMed

    Klon, Anthony E

    2009-06-01

    Naïve Bayesian classifiers are a relatively recent addition to the arsenal of tools available to computational chemists. These classifiers fall into a class of algorithms referred to broadly as machine learning algorithms. Bayesian classifiers may be used in conjunction with classical modeling techniques to assist in the rapid virtual screening of large compound libraries in a systematic manner with a minimum of human intervention. This approach allows computational scientists to concentrate their efforts on their core strengths of model building. Bayesian classifiers have an added advantage of being able to handle a variety of numerical or binary data such as physicochemical properties or molecular fingerprints, making the addition of new parameters to existing models a relatively straightforward process. As a result, during a drug discovery project these classifiers can better evolve with the needs of the projects from general models in the lead finding stages to increasingly precise models in the lead optimization stages that are of particular interest to a specific medicinal chemistry team. Although other machine learning algorithms abound, Bayesian classifiers have been shown to compare favorably under most working conditions and have been shown to be tolerant of noisy experimental data.
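
    A minimal sketch of the screening workflow described above, using scikit-learn's Bernoulli naive Bayes on binary fingerprints; the fingerprint data and activity labels below are random stand-ins for illustration, not a real compound library:

      # Sketch: rank a screening library by naive Bayes activity score.
      import numpy as np
      from sklearn.naive_bayes import BernoulliNB

      rng = np.random.default_rng(0)
      X = rng.integers(0, 2, size=(1000, 256))   # 1000 training compounds x 256 fingerprint bits
      y = rng.integers(0, 2, size=1000)          # hypothetical active/inactive labels

      clf = BernoulliNB(alpha=1.0)               # Laplace smoothing handles unseen bits
      clf.fit(X, y)

      # Score an (illustrative) screening library by predicted probability of activity.
      library = rng.integers(0, 2, size=(5000, 256))
      scores = clf.predict_proba(library)[:, 1]
      top_hits = np.argsort(scores)[::-1][:100]  # indices of the 100 top-ranked compounds

    New descriptors (physicochemical properties, additional fingerprint bits) can be appended as extra columns, which is the ease-of-extension property the abstract highlights.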

  12. Crash risk analysis for Shanghai urban expressways: A Bayesian semi-parametric modeling approach.

    PubMed

    Yu, Rongjie; Wang, Xuesong; Yang, Kui; Abdel-Aty, Mohamed

    2016-10-01

    Urban expressway systems have developed rapidly in recent years in China and have become a key part of city roadway networks, carrying large traffic volumes at high travel speeds. Along with the increase in traffic volume, traffic safety has become a major issue for Chinese urban expressways due to frequent crashes and the non-recurrent congestion they cause. To unveil crash occurrence mechanisms and to develop Active Traffic Management (ATM) control strategies that improve traffic safety, this study developed disaggregate crash risk analysis models with loop detector traffic data and historical crash data. Bayesian random effects logistic regression models were utilized as they can account for the unobserved heterogeneity among crashes. However, previous crash risk analysis studies formulated the random effects distributions in a parametric approach, assigning them to follow normal distributions. Because limited information is known about the random effects distributions, a subjective parametric choice may be incorrect. In order to construct more flexible and robust random effects that capture the unobserved heterogeneity, the Bayesian semi-parametric inference technique was introduced to crash risk analysis in this study. Models with both inference techniques were developed for total crashes; the semi-parametric models provided substantially better goodness-of-fit, while the two models shared consistent coefficient estimates. Bayesian semi-parametric random effects logistic regression models were then developed for weekday peak hour crashes, weekday non-peak hour crashes, and weekend non-peak hour crashes to investigate different crash occurrence scenarios. Significant factors that affect crash risk were revealed and crash occurrence mechanisms were characterized.
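
    Schematically, the parametric baseline is a random-intercept logistic regression, and the semi-parametric variant relaxes the normality assumption on the random effects (the abstract does not name the specific nonparametric prior; a Dirichlet process is a common choice):

      \[
        \operatorname{logit} P(\text{crash}_i = 1) = x_i^{\top}\beta + u_{j(i)},
        \qquad
        u_j \sim N(0, \sigma_u^2) \ \ \text{(parametric)}
        \quad \text{vs.} \quad
        u_j \stackrel{iid}{\sim} G, \ \ G \sim \mathrm{DP}(\alpha, G_0) \ \ \text{(semi-parametric)} .
      \]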

  13. Revisiting Fixed- and Random-Effects Models: Some Considerations for Policy-Relevant Education Research

    ERIC Educational Resources Information Center

    Clarke, Paul; Crawford, Claire; Steele, Fiona; Vignoles, Anna

    2015-01-01

    The use of fixed (FE) and random effects (RE) in two-level hierarchical linear regression is discussed in the context of education research. We compare the robustness of FE models with the modelling flexibility and potential efficiency of RE models. We argue that the two should be seen as complementary approaches. We then compare both…

  15. Hierarchical Bayesian Models of Subtask Learning

    ERIC Educational Resources Information Center

    Anglim, Jeromy; Wynton, Sarah K. A.

    2015-01-01

    The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
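
    The two candidate learning functions mentioned, power and exponential, are conventionally written as

      \[
        E[y_n] = \alpha + \beta\, n^{-\gamma}
        \qquad \text{versus} \qquad
        E[y_n] = \alpha + \beta\, e^{-\gamma n},
      \]

    where y_n is performance (for example, completion time) on trial n; in a hierarchical Bayesian treatment the parameters α, β, and γ vary across individuals and subtasks, with hyperpriors tying them together.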

  17. Objective Bayesian model selection for Cox regression.

    PubMed

    Held, Leonhard; Gravestock, Isaac; Sabanés Bové, Daniel

    2016-12-20

    There is now a large literature on objective Bayesian model selection in the linear model based on the g-prior. The methodology has been recently extended to generalized linear models using test-based Bayes factors. In this paper, we show that test-based Bayes factors can also be applied to the Cox proportional hazards model. If the goal is to select a single model, then both the maximum a posteriori and the median probability model can be calculated. For clinical prediction of survival, we shrink the model-specific log hazard ratio estimates with subsequent calculation of the Breslow estimate of the cumulative baseline hazard function. A Bayesian model average can also be employed. We illustrate the proposed methodology with the analysis of survival data on primary biliary cirrhosis patients and the development of a clinical prediction model for future cardiovascular events based on data from the Second Manifestations of ARTerial disease (SMART) cohort study. Cross-validation is applied to compare the predictive performance with alternative model selection approaches based on Harrell's c-Index, the calibration slope and the integrated Brier score. Finally, a novel application of Bayesian variable selection to optimal conditional prediction via landmarking is described. Copyright © 2016 John Wiley & Sons, Ltd.

  18. Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items

    ERIC Educational Resources Information Center

    Baayen, R. H.; Davidson, D. J.; Bates, D. M.

    2008-01-01

    This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to…
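
    The crossed random effects structure for subjects and items has the generic form

      \[
        y_{si} = x_{si}^{\top}\beta + b_s + w_i + \varepsilon_{si},
        \qquad b_s \sim N(0, \sigma_s^2), \quad w_i \sim N(0, \sigma_w^2),
      \]

    where b_s is a random effect for subject s and w_i a random effect for item i; neither factor is nested within the other, which is what distinguishes this design from ordinary multilevel nesting.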

  19. Normativity, interpretation, and Bayesian models

    PubMed Central

    Oaksford, Mike

    2014-01-01

    It has been suggested that evaluative normativity should be expunged from the psychology of reasoning. A broadly Davidsonian response to these arguments is presented. It is suggested that two distinctions, between different types of rationality, are more permeable than this argument requires and that the fundamental objection is to selecting theories that make the most rational sense of the data. It is argued that this is inevitable consequence of radical interpretation where understanding others requires assuming they share our own norms of reasoning. This requires evaluative normativity and it is shown that when asked to evaluate others’ arguments participants conform to rational Bayesian norms. It is suggested that logic and probability are not in competition and that the variety of norms is more limited than the arguments against evaluative normativity suppose. Moreover, the universality of belief ascription suggests that many of our norms are universal and hence evaluative. It is concluded that the union of evaluative normativity and descriptive psychology implicit in Davidson and apparent in the psychology of reasoning is a good thing. PMID:24860519

  20. Hierarchical Bayesian model updating for structural identification

    NASA Astrophysics Data System (ADS)

    Behmanesh, Iman; Moaveni, Babak; Lombaert, Geert; Papadimitriou, Costas

    2015-12-01

    A new probabilistic finite element (FE) model updating technique based on Hierarchical Bayesian modeling is proposed for identification of civil structural systems under changing ambient/environmental conditions. The performance of the proposed technique is investigated for (1) uncertainty quantification of model updating parameters, and (2) probabilistic damage identification of the structural systems. Accurate estimation of the uncertainty in modeling parameters such as mass or stiffness is a challenging task. Several Bayesian model updating frameworks have been proposed in the literature that can successfully provide the "parameter estimation uncertainty" of model parameters with the assumption that there is no underlying inherent variability in the updating parameters. However, this assumption may not be valid for civil structures where structural mass and stiffness have inherent variability due to different sources of uncertainty such as changing ambient temperature, temperature gradient, wind speed, and traffic loads. Hierarchical Bayesian model updating is capable of predicting the overall uncertainty/variability of updating parameters by assuming time-variability of the underlying linear system. A general solution based on Gibbs Sampler is proposed to estimate the joint probability distributions of the updating parameters. The performance of the proposed Hierarchical approach is evaluated numerically for uncertainty quantification and damage identification of a 3-story shear building model. Effects of modeling errors and incomplete modal data are considered in the numerical study.
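
    To make the Gibbs sampling strategy concrete, here is a toy sampler for a generic two-level normal hierarchy with conjugate updates (simulated data; an illustration of the general approach, not the authors' FE model updating scheme):

      # Toy Gibbs sampler: y_ij ~ N(theta_i, sigma2), theta_i ~ N(mu, tau2),
      # flat prior on mu, inverse-gamma(a0, b0) prior on tau2, sigma2 known.
      import numpy as np

      rng = np.random.default_rng(1)
      m, n, sigma2 = 8, 20, 1.0
      true_theta = rng.normal(0.0, 2.0, size=m)
      y = true_theta[:, None] + rng.normal(0.0, np.sqrt(sigma2), size=(m, n))
      ybar = y.mean(axis=1)

      a0, b0 = 2.0, 2.0
      theta, mu, tau2 = ybar.copy(), float(ybar.mean()), 1.0
      draws = []
      for it in range(2000):
          # theta_i | rest: normal, combining the group mean and the prior mean mu.
          prec = n / sigma2 + 1.0 / tau2
          mean = (n * ybar / sigma2 + mu / tau2) / prec
          theta = rng.normal(mean, np.sqrt(1.0 / prec))
          # mu | rest: normal posterior under a flat prior.
          mu = rng.normal(theta.mean(), np.sqrt(tau2 / m))
          # tau2 | rest: inverse-gamma, sampled as the reciprocal of a gamma draw.
          a = a0 + m / 2.0
          b = b0 + 0.5 * np.sum((theta - mu) ** 2)
          tau2 = 1.0 / rng.gamma(a, 1.0 / b)
          draws.append((mu, tau2))

      post = np.array(draws[500:])  # discard burn-in
      print("posterior mean of mu:", post[:, 0].mean())
      print("posterior mean of tau2:", post[:, 1].mean())

    The hierarchical variance tau2 here plays the same conceptual role as the overall variability of the structural updating parameters in the abstract: it captures inherent variability across conditions rather than mere estimation uncertainty.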

  1. Joint modeling of survival time and longitudinal outcomes with flexible random effects.

    PubMed

    Choi, Jaeun; Zeng, Donglin; Olshan, Andrew F; Cai, Jianwen

    2017-08-30

    Joint models with shared Gaussian random effects have been conventionally used in analysis of longitudinal outcome and survival endpoint in biomedical or public health research. However, misspecifying the normality assumption of random effects can lead to serious bias in parameter estimation and future prediction. In this paper, we study joint models of general longitudinal outcomes and survival endpoint but allow the underlying distribution of the shared random effect to be completely unknown. For inference, we propose to use a mixture of Gaussian distributions as an approximation to this unknown distribution and adopt an Expectation-Maximization (EM) algorithm for computation. Either the AIC or the BIC criterion is adopted for selecting the number of mixtures. We demonstrate the proposed method via a number of simulation studies. We illustrate our approach with the data from the Carolina Head and Neck Cancer Study (CHANCE).

  2. Calibration of crash risk models on freeways with limited real-time traffic data using Bayesian meta-analysis and Bayesian inference approach.

    PubMed

    Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin

    2015-12-01

    This study aimed to develop a real-time crash risk model with limited data in China by using a Bayesian meta-analysis and Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses, including the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced a more conservative estimate for the effects of traffic variables compared with the fixed effect meta-analysis. Then, the meta-analysis results were used as informative priors for developing crash risk models with limited data. The three different meta-analyses significantly affected model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, the Bayesian predictive densities analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.

  3. Road network safety evaluation using Bayesian hierarchical joint model.

    PubMed

    Wang, Jie; Huang, Helai

    2016-05-01

    Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well.

  4. Semi-parametric estimation of random effects in a logistic regression model using conditional inference.

    PubMed

    Petersen, Jørgen Holm

    2016-01-15

    This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study.

  5. Bayesian network modelling of upper gastrointestinal bleeding

    NASA Astrophysics Data System (ADS)

    Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri

    2013-09-01

    Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding as upper or lower is obtained. The TAN achieves a high classification accuracy of 86% and an area under the curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency, and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.

  6. Estimation of the Nonlinear Random Coefficient Model when Some Random Effects Are Separable

    ERIC Educational Resources Information Center

    du Toit, Stephen H. C.; Cudeck, Robert

    2009-01-01

    A method is presented for marginal maximum likelihood estimation of the nonlinear random coefficient model when the response function has some linear parameters. This is done by writing the marginal distribution of the repeated measures as a conditional distribution of the response given the nonlinear random effects. The resulting distribution…

  7. Random-Effects Models for Analyzing Clustered Data from a Nutrition Education Intervention.

    ERIC Educational Resources Information Center

    Woodruff, Susan I.

    1997-01-01

    Analyses of data from a nutrition education program were compared for multiple regression analysis using individual subject data, multiple regression using classroom data, two-level random effects model (REM) with subjects clustered within classrooms, and two-level REM with subjects clustered within sites. Advantages of REM are discussed. (SLD)

  8. The Impact of Five Missing Data Treatments on a Cross-Classified Random Effects Model

    ERIC Educational Resources Information Center

    Hoelzle, Braden R.

    2012-01-01

    The present study compared the performance of five missing data treatment methods within a Cross-Classified Random Effects Model environment under various levels and patterns of missing data given a specified sample size. Prior research has shown the varying effect of missing data treatment options within the context of numerous statistical…

  11. Fixed- and random-effects meta-analytic structural equation modeling: examples and analyses in R.

    PubMed

    Cheung, Mike W-L

    2014-03-01

    Meta-analytic structural equation modeling (MASEM) combines the ideas of meta-analysis and structural equation modeling for the purpose of synthesizing correlation or covariance matrices and fitting structural equation models on the pooled correlation or covariance matrix. Cheung and Chan (Psychological Methods 10:40-64, 2005b, Structural Equation Modeling 16:28-53, 2009) proposed a two-stage structural equation modeling (TSSEM) approach to conducting MASEM that was based on a fixed-effects model by assuming that all studies have the same population correlation or covariance matrices. The main objective of this article is to extend the TSSEM approach to a random-effects model by the inclusion of study-specific random effects. Another objective is to demonstrate the procedures with two examples using the metaSEM package implemented in the R statistical environment. Issues related to and future directions for MASEM are discussed.

  12. Baseline and treatment effect heterogeneity for survival times between centers using a random effects accelerated failure time model with flexible error distribution.

    PubMed

    Komárek, Arnost; Lesaffre, Emmanuel; Legrand, Catherine

    2007-12-30

    Nowadays, most clinical trials are conducted in different centers and even in different countries. In most multi-center studies, the primary analysis assumes that the treatment effect is constant over centers. However, it is also recommended to perform an exploratory analysis to highlight possible center by treatment interaction, especially when several countries are involved. We propose in this paper an exploratory Bayesian approach to quantify this interaction in the context of survival data. To this end we used and generalized a random effects accelerated failure time model. The generalization consists in using a penalized Gaussian mixture as an error distribution on top of multivariate random effects that are assumed to follow a normal distribution. For computational convenience, the computations are based on Markov chain Monte Carlo techniques. The proposed method is illustrated on the disease-free survival times of early breast cancer patients collected in the EORTC trial 10854. Copyright (c) 2007 John Wiley & Sons, Ltd.

  13. Bayesian Recurrent Neural Network for Language Modeling.

    PubMed

    Chien, Jen-Tzung; Ku, Yuan-Chu

    2016-02-01

    A language model (LM) gives the probability of a word sequence and thus provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful for learning the large-span dynamics of a word sequence in the continuous space. However, training an RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and applies it to continuous speech recognition. We aim to penalize an overly complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to the Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.

  14. Full Bayes Poisson gamma, Poisson lognormal, and zero inflated random effects models: Comparing the precision of crash frequency estimates.

    PubMed

    Aguero-Valverde, Jonathan

    2013-01-01

    In recent years, complex statistical modeling approaches have been proposed to handle the unobserved heterogeneity and the excess of zeros frequently found in crash data, including random effects and zero inflated models. This research compares random effects, zero inflated, and zero inflated random effects models using a full Bayes hierarchical approach. The models are compared not just in terms of goodness-of-fit measures but also in terms of the precision of posterior crash frequency estimates, since the precision of these estimates is vital for the ranking of sites for engineering improvement. Fixed-over-time random effects models are also compared to independent-over-time random effects models. For the crash dataset being analyzed, it was found that once the random effects are included in the zero inflated models, the probability of being in the zero state is drastically reduced, and the zero inflated models degenerate to their non-zero-inflated counterparts. Also, by fixing the random effects over time, the fit of the models and the precision of the crash frequency estimates are significantly increased. It was found that the rankings of the fixed-over-time random effects models are very consistent among them. In addition, the results show that by fixing the random effects over time, the standard errors of the crash frequency estimates are significantly reduced for the majority of the segments at the top of the ranking. Copyright © 2012 Elsevier Ltd. All rights reserved.
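
    A representative member of the model family being compared is the Poisson-lognormal model with site-level random effects (schematic form):

      \[
        y_{it} \sim \mathrm{Poisson}(\lambda_{it}),
        \qquad
        \log \lambda_{it} = x_{it}^{\top}\beta + u_i + \varepsilon_{it},
      \]

    where u_i is a random effect for segment i (held constant across periods in the fixed-over-time variants) and ε_{it} is a lognormal error term; zero-inflated versions mix this count distribution with a point mass at zero.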

  15. Bayesian model selection analysis of WMAP3

    SciTech Connect

    Parkinson, David; Mukherjee, Pia; Liddle, Andrew R.

    2006-06-15

    We present a Bayesian model selection analysis of WMAP3 data using our code CosmoNest. We focus on the density perturbation spectral index n_S and the tensor-to-scalar ratio r, which define the plane of slow-roll inflationary models. We find that while the Bayesian evidence supports the conclusion that n_S ≠ 1, the data are not yet powerful enough to do so at a strong or decisive level. If tensors are assumed absent, the current odds are approximately 8 to 1 in favor of n_S ≠ 1 under our assumptions, when WMAP3 data is used together with external data sets. WMAP3 data on its own is unable to distinguish between the two models. Further, inclusion of r as a parameter weakens the conclusion against the Harrison-Zel'dovich case (n_S = 1, r = 0), albeit in a prior-dependent way. In appendices we describe the CosmoNest code in detail, noting its ability to supply posterior samples as well as to accurately compute the Bayesian evidence. We make a first public release of CosmoNest, now available at www.cosmonest.org.

  16. A random effects variance shift model for detecting and accommodating outliers in meta-analysis

    PubMed Central

    2011-01-01

    Background: Meta-analysis typically involves combining the estimates from independent studies in order to estimate a parameter of interest across a population of studies. However, outliers often occur even under the random effects model. The presence of such outliers could substantially alter the conclusions in a meta-analysis. This paper proposes a methodology for identifying and, if desired, downweighting studies that do not appear representative of the population they are thought to represent under the random effects model. Methods: An outlier is taken as an observation (study result) with an inflated random effect variance. We used the likelihood ratio test statistic as an objective measure for determining whether observations have inflated variance and are therefore considered outliers. A parametric bootstrap procedure was used to obtain the sampling distribution of the likelihood ratio test statistics and to account for multiple testing. Our methods were applied to three illustrative and contrasting meta-analytic data sets. Results: For the three meta-analytic data sets our methods gave robust inferences when the identified outliers were downweighted. Conclusions: The proposed methodology provides a means to identify and, if desired, downweight outliers in meta-analysis. It does not eliminate them from the analysis, however, and we consider the proposed approach preferable to simply removing any or all apparently outlying results. We do not propose that our methods in any way replace or diminish the standard random effects methodology that has proved so useful; rather, they are helpful when used in conjunction with the random effects model. PMID:21324180
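
    The variance-shift device can be written down directly. Under the random effects model, y_i ~ N(θ_i, σ_i²) with θ_i ~ N(μ, τ²); a suspected outlier i instead receives an inflated random-effect variance,

      \[
        \theta_i \sim N(\mu, \, \omega_i \tau^2), \qquad \omega_i \ge 1,
      \]

    and the likelihood ratio test compares ω_i = 1 against ω_i > 1 for each study (notation ours; the paper's parameterization may differ in detail). Downweighting follows by keeping the fitted ω_i > 1 for flagged studies.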

  17. Analysis of an incomplete longitudinal composite variable using a marginalized random effects model and multiple imputation.

    PubMed

    Gosho, Masahiko; Maruo, Kazushi; Ishii, Ryota; Hirakawa, Akihiro

    2016-11-16

    The total score, which is calculated as the sum of scores in multiple items or questions, is repeatedly measured in longitudinal clinical studies. A mixed effects model for repeated measures method is often used to analyze these data; however, if one or more individual items are not measured, the method cannot be directly applied to the total score. We develop two simple and interpretable procedures that infer fixed effects for a longitudinal continuous composite variable. These procedures consider that the items that compose the total score are multivariate longitudinal continuous data and, simultaneously, handle subject-level and item-level missing data. One procedure is based on a multivariate marginalized random effects model with a multiple of Kronecker product covariance matrices for serial time dependence and correlation among items. The other procedure is based on a multiple imputation approach with a multivariate normal model. In terms of the type-1 error rate and the bias of treatment effect in total score, the marginalized random effects model and multiple imputation procedures performed better than the standard mixed effects model for repeated measures analysis with listwise deletion and single imputations for handling item-level missing data. In particular, the mixed effects model for repeated measures with listwise deletion resulted in substantial inflation of the type-1 error rate. The marginalized random effects model and multiple imputation methods provide for a more efficient analysis by fully utilizing the partially available data, compared to the mixed effects model for repeated measures method with listwise deletion.

  18. Bayesian structural equation modeling in sport and exercise psychology.

    PubMed

    Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus

    2015-08-01

    Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.

  19. Three case studies in the Bayesian analysis of cognitive models.

    PubMed

    Lee, Michael D

    2008-02-01

    Bayesian statistical inference offers a principled and comprehensive approach for relating psychological models to data. This article presents Bayesian analyses of three influential psychological models: multidimensional scaling models of stimulus representation, the generalized context model of category learning, and a signal detection theory model of decision making. In each case, the model is recast as a probabilistic graphical model and is evaluated in relation to a previously considered data set. In each case, it is shown that Bayesian inference is able to provide answers to important theoretical and empirical questions easily and coherently. The generality of the Bayesian approach and its potential for the understanding of models and data in psychology are discussed.

  20. Bayesian Nonparametric Models for Multiway Data Analysis.

    PubMed

    Xu, Zenglin; Yan, Feng; Qi, Yuan

    2015-02-01

    Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition methods and blockmodels.

  1. The impact of omitting the interaction between crossed factors in cross-classified random effects modelling.

    PubMed

    Shi, Yuying; Leite, Walter; Algina, James

    2010-02-01

    Cross-classified random effects modelling (CCREM) is a special case of multi-level modelling where the units of one level are nested within two cross-classified factors. Typically, CCREM analyses omit the random interaction effect of the cross-classified factors. We investigate the impact of the omission of the interaction effect on parameter estimates and standard errors. Results from a Monte Carlo simulation study indicate that, for fixed effects, both the coefficient estimates and the accompanying standard error estimates are unbiased. For random effects, results are affected at level 2, but not at level 1, by the presence of an interaction variance and/or a correlation between the residuals of the level-2 factors. Results from the analysis of the Early Childhood Longitudinal Study and the National Educational Longitudinal Study agree with those obtained from simulated data. We recommend that researchers attempt to include interaction effects of cross-classified factors in their models.

  2. Bayesian variable selection for latent class models.

    PubMed

    Ghosh, Joyee; Herring, Amy H; Siega-Riz, Anna Maria

    2011-09-01

    In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.

  3. A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.

    PubMed

    Houseman, E Andres; Virji, M Abbas

    2017-08-01

    Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulation studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed the lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates…
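
    The "integrating over the left tail" device corresponds to the usual censored-likelihood contribution (standard form, in our notation): an observed log-scale measurement y_t contributes a normal density term, while a non-detect contributes the probability of falling below the detection limit,

      \[
        L_t = \frac{1}{\sigma}\,\phi\!\left( \frac{y_t - \mu_t}{\sigma} \right)
        \ \ \text{if observed},
        \qquad
        L_t = \Phi\!\left( \frac{\log(\mathrm{LOD}) - \mu_t}{\sigma} \right)
        \ \ \text{if } y_t < \mathrm{LOD},
      \]

    where μ_t is the model mean at time t (including the autocorrelated structure) and Φ is the standard normal CDF.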

  4. Bayesian Approach for Flexible Modeling of Semicompeting Risks Data

    PubMed Central

    Han, Baoguang; Yu, Menggang; Dignam, James J.; Rathouz, Paul J.

    2016-01-01

    Semicompeting risks data arise when two types of events, non-terminal and terminal, are observed. When the terminal event occurs first, it censors the non-terminal event, but not vice versa. To account for possible dependent censoring of the non-terminal event by the terminal event and to improve prediction of the terminal event using the non-terminal event information, it is crucial to model their association properly. Motivated by a breast cancer clinical trial data analysis, we extend the well-known illness-death models to allow flexible random effects to capture heterogeneous association structures in the data. Our extension also represents a generalization of the popular shared frailty models that usually assume that the non-terminal event does not affect the hazards of the terminal event beyond a frailty term. We propose a unified Bayesian modeling approach that can utilize existing software packages for both model fitting and individual specific event prediction. The approach is demonstrated via both simulation studies and a breast cancer data set analysis. PMID:25274445

  5. Bayesian Analysis of Mass Spectrometry Proteomics Data using Wavelet Based Functional Mixed Models

    PubMed Central

    Morris, Jeffrey S.; Brown, Philip J.; Herrick, Richard C.; Baggerly, Keith A.; Coombes, Kevin R.

    2008-01-01

    In this paper, we analyze MALDI-TOF mass spectrometry proteomic data using Bayesian wavelet-based functional mixed models. By modeling mass spectra as functions, this approach avoids reliance on peak detection methods. The flexibility of this framework in modeling non-parametric fixed and random effect functions enables it to model the effects of multiple factors simultaneously, allowing one to perform inference on multiple factors of interest using the same model fit, while adjusting for clinical or experimental covariates that may affect both the intensities and locations of peaks in the spectra. From the model output, we identify spectral regions that are differentially expressed across experimental conditions, while controlling the Bayesian FDR, in a way that takes both statistical and clinical significance into account. We apply this method to two cancer studies. PMID:17888041

  6. Moving beyond qualitative evaluations of Bayesian models of cognition.

    PubMed

    Hemmer, Pernille; Tauber, Sean; Steyvers, Mark

    2015-06-01

    Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.

  7. A Bayesian Shrinkage Approach for AMMI Models.

    PubMed

    da Silva, Carlos Pereira; de Oliveira, Luciano Antonio; Nuvunga, Joel Jorge; Pamplona, Andrezza Kéllen Alves; Balestre, Marcio

    2015-01-01

    Linear-bilinear models, especially the additive main effects and multiplicative interaction (AMMI) model, are widely applicable to genotype-by-environment interaction (GEI) studies in plant breeding programs. These models allow a parsimonious modeling of GE interactions, retaining a small number of principal components in the analysis. However, one aspect of the AMMI model that is still debated is the selection criteria for determining the number of multiplicative terms required to describe the GE interaction pattern. Shrinkage estimators have been proposed as selection criteria for the GE interaction components. In this study, a Bayesian approach was combined with the AMMI model with shrinkage estimators for the principal components. A total of 55 maize genotypes were evaluated in nine different environments using a complete blocks design with three replicates. The results show that the traditional Bayesian AMMI model produces low shrinkage of singular values but avoids the usual pitfalls in determining the credible intervals in the biplot. On the other hand, Bayesian shrinkage AMMI models have difficulty with the credible interval for model parameters, but produce stronger shrinkage of the principal components, converging to GE matrices that have more shrinkage than those obtained using mixed models. This characteristic allowed more parsimonious models to be chosen, with more of the GEI pattern retained in the first two components, and resulted in selected models similar to those obtained by the Cornelius F-test (α = 0.05) in traditional AMMI models and by leave-one-out cross-validation. The model chosen by the posterior distribution of the singular values was also similar to those produced by the cross-validation approach in traditional AMMI models. Our method enables the estimation of credible intervals for the AMMI biplot plus the choice of AMMI model based on direct posterior…
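
    The AMMI model at the center of this work has the standard form

      \[
        y_{ij} = \mu + g_i + e_j + \sum_{k=1}^{K} \lambda_k\, \alpha_{ik}\, \gamma_{jk} + \varepsilon_{ij},
      \]

    where g_i and e_j are genotype and environment main effects, λ_k are the singular values of the interaction matrix with singular vectors α_{ik} and γ_{jk}, and K is the number of retained multiplicative terms; the shrinkage priors discussed above act on the λ_k.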

  9. Model Comparison of Bayesian Semiparametric and Parametric Structural Equation Models

    ERIC Educational Resources Information Center

    Song, Xin-Yuan; Xia, Ye-Mao; Pan, Jun-Hao; Lee, Sik-Yum

    2011-01-01

    Structural equation models have wide applications. One of the most important issues in analyzing structural equation models is model comparison. This article proposes a Bayesian model comparison statistic, namely the "L[subscript nu]"-measure for both semiparametric and parametric structural equation models. For illustration purposes, we consider…

  11. Bayesian Model Selection with Network Based Diffusion Analysis

    PubMed Central

    Whalen, Andrew; Hoppitt, William J. E.

    2016-01-01

    A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe-Akaike Information Criterion (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed. PMID:27092089
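
    The WAIC computation itself is compact; a generic sketch (not tied to the authors' code) that assumes an array of pointwise posterior log-likelihoods with one row per posterior draw and one column per observation:

        import numpy as np
        from scipy.special import logsumexp

        def waic(log_lik):
            """WAIC on the deviance scale: the log pointwise predictive
            density minus the variance-based effective number of parameters."""
            n_draws = log_lik.shape[0]
            lppd = (logsumexp(log_lik, axis=0) - np.log(n_draws)).sum()
            p_waic = log_lik.var(axis=0, ddof=1).sum()
            return -2.0 * (lppd - p_waic)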

  12. A new threshold dose-response model including random effects for data from developmental toxicity studies.

    PubMed

    Hunt, Daniel L; Rai, Shesh N

    2005-01-01

    Usually, in teratological dose-finding studies, there are not only threshold effects but also extra variation that cannot be accounted for by the beta-binomial model alone. The beta-binomial model assumes correlation between fetuses in the same litter. The general random-effects (RE) threshold model combines a threshold with random effects, allowing the additional variability that arises from this correlation and from between-litter variability to be modeled. The goal of this research was to investigate such a threshold dose-response model with random effects for the variability that exists between litters of animals in studies of toxic agents. Data from a developmental toxicity study of a toxic agent were analysed using the proposed RE threshold dose-response model, which is an extension of the logit model in form. An approximate likelihood function was used to derive parameter estimates, and tests were performed to determine the significance of the model parameters, in particular the RE parameter. A simulation study was conducted to assess the performance of the RE threshold model in estimating the model parameters. 2005 John Wiley & Sons, Ltd.
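
    A hedged sketch of the threshold logit dose-response form described above, with a per-litter random intercept standing in for the RE term (parameter names are illustrative, not the authors' notation):

        import numpy as np

        def threshold_logit(dose, tau, alpha, beta, litter_effect=0.0):
            """P(adverse outcome): background response below the threshold tau,
            logistic increase in the excess dose above it; litter_effect is a
            per-litter random intercept on the logit scale."""
            excess = np.maximum(dose - tau, 0.0)
            return 1.0 / (1.0 + np.exp(-(alpha + beta * excess + litter_effect)))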

  13. Model feedback in Bayesian propensity score estimation.

    PubMed

    Zigler, Corwin M; Watts, Krista; Yeh, Robert W; Wang, Yun; Coull, Brent A; Dominici, Francesca

    2013-03-01

    Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: (1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and (2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is "feedback" between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting versus carotid endarterectomy among 123,286 Medicare beneficiaries hospitalized for stroke in 2006 and 2007.
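
    For contrast with joint Bayesian estimation, the traditional two-stage procedure can be sketched as follows: the estimated score is treated as fixed in the second stage, which is precisely the cut that joint estimation, and hence feedback, removes. All data and the quintile stratification are illustrative:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(1)
        X = rng.normal(size=(5000, 3))                                   # covariates
        t = rng.binomial(1, 1 / (1 + np.exp(-X @ np.array([0.8, -0.5, 0.3]))))
        y = 0.5 * t + X @ np.array([1.0, 1.0, -1.0]) + rng.normal(size=5000)

        # Stage 1: fit the propensity score model on its own.
        ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

        # Stage 2: compare outcomes within quintiles of the now-fixed score.
        strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
        effects = [y[(strata == s) & (t == 1)].mean() - y[(strata == s) & (t == 0)].mean()
                   for s in range(5)]
        print("stratified treatment-effect estimate:", np.mean(effects))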

  14. Residual spatial correlation between geographically referenced observations: a Bayesian hierarchical modeling approach.

    PubMed

    Boyd, Heather A; Flanders, W Dana; Addiss, David G; Waller, Lance A

    2005-07-01

    Analytic methods commonly used in epidemiology do not account for spatial correlation between observations. In regression analyses, this omission can bias parameter estimates and yield incorrect standard error estimates. We present a Bayesian hierarchical model (BHM) approach that accounts for spatial correlation, and illustrate its strengths and weaknesses by applying this modeling approach to data on Wuchereria bancrofti infection in Haiti. A program to eliminate lymphatic filariasis in Haiti assessed prevalence of W. bancrofti infection in 57 schools across Leogane Commune. We analyzed the spatial pattern in the prevalence data using semi-variograms and correlograms. We then modeled the data using (1) standard logistic regression (GLM); (2) non-Bayesian logistic generalized linear mixed models (GLMMs) with school-specific nonspatial random effects; (3) BHMs with school-specific nonspatial random effects; and (4) BHMs with spatial random effects. An exponential semi-variogram with an effective range of 2.15 km best fit the data. GLMM and nonspatial BHM point estimates were comparable and also generally similar to the marginal GLM point estimates. In contrast, compared with the nonspatial mixed model results, spatial BHM point estimates were markedly attenuated. The clear spatial pattern evident in the Haitian W. bancrofti prevalence data and the observation that point estimates and standard errors differed depending on the modeling approach indicate that it is important to account for residual spatial correlation in analyses of W. bancrofti infection data. Bayesian hierarchical models provide a flexible, readily implementable approach to modeling spatially correlated data. However, our results also illustrate that spatial smoothing must be applied with care.
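
    The exponential semi-variogram used above has a simple closed form; a sketch, noting that for this model the effective range (where roughly 95% of the sill is reached) is about three times the range parameter, so a 2.15 km effective range corresponds to a range parameter near 0.72 km:

        import numpy as np

        def exponential_semivariogram(h, nugget, sill, range_param):
            """Semivariance at lag distance h (same units as range_param)."""
            return nugget + (sill - nugget) * (1.0 - np.exp(-h / range_param))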

  15. Experience With Bayesian Image Based Surface Modeling

    NASA Technical Reports Server (NTRS)

    Stutz, John C.

    2005-01-01

    Bayesian surface modeling from images requires modeling both the surface and the image generation process, in order to optimize the models by comparing actual and generated images. Thus it differs greatly, both conceptually and in computational difficulty, from conventional stereo surface recovery techniques. But it offers the possibility of using any number of images, taken under quite different conditions, and by different instruments that provide independent and often complementary information, to generate a single surface model that fuses all available information. I describe an implemented system, with a brief introduction to the underlying mathematical models and the compromises made for computational efficiency. I describe successes and failures achieved on actual imagery, where we went wrong and what we did right, and how our approach could be improved. Lastly I discuss how the same approach can be extended to distinct types of instruments, to achieve true sensor fusion.

  16. A Hierarchical Bayesian Model for Crowd Emotions

    PubMed Central

    Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias

    2016-01-01

    Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model that learns, in an unsupervised manner, the behavior of individuals and of the crowd as a single entity, and explores the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals is described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states, yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance to existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366

  17. Crash Frequency Analysis Using Hurdle Models with Random Effects Considering Short-Term Panel Data.

    PubMed

    Chen, Feng; Ma, Xiaoxiang; Chen, Suren; Yang, Lin

    2016-10-26

    Random effect panel data hurdle models are established to study the daily crash frequency on a mountainous section of highway I-70 in Colorado. Real-time traffic, weather, and road surface condition data from the Road Weather Information System (RWIS) are merged into models that also incorporate road characteristics. The random effect hurdle negative binomial (REHNB) model is developed to study the daily crash frequency along with three other competing models. The proposed model considers the serial correlation of observations, the unbalanced panel-data structure, and dominating zeroes. Based on several statistical tests, the REHNB model is identified as the most appropriate one among the four candidate models for a typical mountainous highway. The results show that: (1) the presence of over-dispersion in the short-term crash frequency data is due to both excess zeros and unobserved heterogeneity in the crash data; and (2) the REHNB model is suitable for this type of data. Moreover, time-varying variables including weather conditions, road surface conditions and traffic conditions are found to play important roles in crash frequency. Besides the methodological advancements, the proposed approach bears great potential for engineering applications in developing short-term crash frequency models by utilizing detailed field-monitoring data, such as RWIS data, which are becoming more accessible around the world.
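
    A hedged sketch of the hurdle negative binomial likelihood at the core of the REHNB model: a Bernoulli hurdle separates zero from positive crash days, and a zero-truncated negative binomial handles the positive counts; in the full model, random effects would enter through the linear predictors for p_zero and mu (parameterization illustrative):

        import numpy as np
        from scipy import stats

        def hurdle_nb_loglik(y, p_zero, mu, alpha):
            """Total log-likelihood; alpha is the NB dispersion parameter."""
            n = 1.0 / alpha                  # NB "size"
            p = n / (n + mu)                 # NB success probability (mean mu)
            ll = np.where(
                y == 0,
                np.log(p_zero),
                np.log1p(-p_zero)
                + stats.nbinom.logpmf(y, n, p)
                - np.log1p(-stats.nbinom.pmf(0, n, p)),   # zero truncation
            )
            return ll.sum()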

  19. Spatially-dependent Bayesian model selection for disease mapping.

    PubMed

    Carroll, Rachel; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Aregay, Mehreteab; Watjou, Kevin

    2016-01-01

    In disease mapping where predictor effects are to be modeled, it is often the case that sets of predictors are fixed, and the aim is to choose between fixed model sets. Model selection methods, both Bayesian model selection (BMS) and Bayesian model averaging (BMA), are approaches within the Bayesian paradigm for achieving this aim. In the spatial context, model selection could have a spatial component in the sense that some models may be more appropriate for certain areas of a study region than others. In this work, we examine the use of spatially referenced BMA and BMS via a large-scale simulation study accompanied by a small-scale case study. Our results suggest that BMS performs well when a strong regression signature is found.

  20. Bayesian Hierarchical Models to Augment the Mediterranean Forecast System

    DTIC Science & Technology

    2016-06-07

    Bayesian Hierarchical Models to Augment the Mediterranean Forecast System Ralph F. Milliff Colorado Research Associates Division NorthWest...last year. Our goal is to develop an ensemble ocean forecast methodology, using Bayesian Hierarchical Modelling (BHM) tools. The ocean ensemble...geostrophy model introduced by Royle et al. (1998). The second objective involves the accurate representation of forecast error covariance evolution in

  1. Hopes and Cautions in Implementing Bayesian Structural Equation Modeling

    ERIC Educational Resources Information Center

    MacCallum, Robert C.; Edwards, Michael C.; Cai, Li

    2012-01-01

    Muthen and Asparouhov (2012) have proposed and demonstrated an approach to model specification and estimation in structural equation modeling (SEM) using Bayesian methods. Their contribution builds on previous work in this area by (a) focusing on the translation of conventional SEM models into a Bayesian framework wherein parameters fixed at zero…

  2. Bayesian Fundamentalism or Enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition.

    PubMed

    Jones, Matt; Love, Bradley C

    2011-08-01

    The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls

  3. Blast-related mild traumatic brain injury: a Bayesian random-effects meta-analysis on the cognitive outcomes of concussion among military personnel.

    PubMed

    Karr, Justin E; Areshenkoff, Corson N; Duggan, Emily C; Garcia-Barrera, Mauricio A

    2014-12-01

    Throughout their careers, many soldiers experience repeated blast exposures from improvised explosive devices, which often involve head injury. Consequently, blast-related mild traumatic brain injury (mTBI) has become prevalent in modern conflicts, often occurring co-morbidly with psychiatric illness (e.g., post-traumatic stress disorder [PTSD]). In turn, a growing body of research has begun to explore the cognitive and psychiatric sequelae of blast-related mTBI. The current meta-analysis aimed to evaluate the chronic effects of blast-related mTBI on cognitive performance. A systematic review identified 9 studies reporting 12 samples meeting eligibility criteria. A Bayesian random-effects meta-analysis was conducted with cognitive construct and PTSD symptoms explored as moderators. The overall posterior mean effect size and Highest Density Interval (HDI) came to d = -0.12 [-0.21, -0.04], with executive function (-0.16 [-0.31, 0.00]), verbal delayed memory (-0.19 [-0.44, 0.06]) and processing speed (-0.11 [-0.26, 0.01]) presenting as the cognitive domains most sensitive to blast-related mTBI. When dividing executive function into diverse sub-constructs (i.e., working memory, inhibition, set-shifting), set-shifting presented the largest effect size (-0.33 [-0.55, -0.05]). PTSD symptoms did not predict cognitive effect sizes, β_PTSD = -0.02 [-0.23, 0.20]. The results indicate a subtle but chronic cognitive impairment following mTBI, especially in set-shifting, a relevant aspect of executive attention. These findings are consistent with past meta-analyses on multiple mTBI and correspond with past neuroimaging research on the cognitive correlates of white matter damage common in mTBI. However, all studies had cross-sectional designs, which resulted in universally low quality ratings and limited the conclusions inferable from this meta-analysis.
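
    The normal-normal random-effects model underlying such a meta-analysis is small enough to fit on a grid; a self-contained sketch with made-up effect sizes and standard errors (the studies above are not reproduced), flat priors, and the study-level effects integrated out analytically:

        import numpy as np

        d = np.array([-0.25, -0.10, 0.02, -0.30, -0.05, -0.15])   # made-up d values
        se = np.array([0.12, 0.10, 0.15, 0.20, 0.08, 0.11])       # made-up SEs

        # Marginally, d_i ~ N(mu, se_i^2 + tau^2); evaluate the posterior of
        # (mu, tau) on a grid and summarise the overall effect mu.
        mu_g = np.linspace(-1.0, 1.0, 401)
        tau_g = np.linspace(1e-3, 1.0, 200)
        M, T = np.meshgrid(mu_g, tau_g, indexing="ij")
        v = se[None, None, :] ** 2 + T[..., None] ** 2
        logpost = -0.5 * (np.log(v) + (d - M[..., None]) ** 2 / v).sum(axis=-1)
        post = np.exp(logpost - logpost.max())
        post /= post.sum()
        print("posterior mean of mu:", (mu_g * post.sum(axis=1)).sum())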

  4. Improving randomness characterization through Bayesian model selection.

    PubMed

    Díaz Hernández Rojas, Rafael; Solís, Aldo; Angulo Martínez, Alí M; U'Ren, Alfred B; Hirsch, Jorge G; Marsili, Matteo; Pérez Castillo, Isaac

    2017-06-08

    Random number generation plays an essential role in technology, with important applications in areas ranging from cryptography to Monte Carlo methods and other probabilistic algorithms. All such applications require high-quality sources of random numbers, yet effective methods for assessing whether a source produces truly random sequences are still missing. Current methods either do not rely on a formal description of randomness (the NIST test suite), or are inapplicable in principle (the characterization derived from the Algorithmic Theory of Information), since the latter requires testing all the possible computer programs that could produce the sequence to be analysed. Here we present a rigorous method that overcomes these problems, based on Bayesian model selection. We derive analytic expressions for a model's likelihood, which is then used to compute its posterior distribution. Our method proves to be more rigorous than NIST's suite and the Borel normality criterion, and its implementation is straightforward. We applied our method to an experimental device based on the process of spontaneous parametric downconversion to confirm that it behaves as a genuine quantum random number generator. As our approach relies on Bayesian inference, our scheme transcends individual sequence analysis, leading to a characterization of the source itself.
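
    The flavour of the approach can be illustrated with analytic marginal likelihoods for two simple models of a binary sequence, an exchangeable Bernoulli source and a first-order Markov chain, each with uniform priors; this mirrors the model-selection logic described above rather than the authors' exact model class:

        import numpy as np
        from scipy.special import gammaln

        def log_ev_bernoulli(x):
            """Beta-binomial marginal likelihood with a Beta(1,1) prior."""
            n1 = int(x.sum()); n0 = len(x) - n1
            return gammaln(n0 + 1) + gammaln(n1 + 1) - gammaln(n0 + n1 + 2)

        def log_ev_markov(x):
            """Same, per transition row, conditioning on the first symbol."""
            le = 0.0
            for prev in (0, 1):
                nxt = x[1:][x[:-1] == prev]
                n1 = int(nxt.sum()); n0 = len(nxt) - n1
                le += gammaln(n0 + 1) + gammaln(n1 + 1) - gammaln(n0 + n1 + 2)
            return le

        x = np.random.default_rng(2).integers(0, 2, 10_000)
        # Positive values favour the simpler exchangeable model, as expected
        # for a genuinely random sequence.
        print(log_ev_bernoulli(x) - log_ev_markov(x))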

  5. Bayesian Inference for Nonnegative Matrix Factorisation Models

    PubMed Central

    Cemgil, Ali Taylan

    2009-01-01

    We describe nonnegative matrix factorisation (NMF) with a Kullback-Leibler (KL) error measure in a statistical framework, with a hierarchical generative model consisting of an observation and a prior component. Omitting the prior leads to the standard KL-NMF algorithms as special cases, where maximum likelihood parameter estimation is carried out via the Expectation-Maximisation (EM) algorithm. Starting from this view, we develop full Bayesian inference via variational Bayes or Monte Carlo. Our construction retains conjugacy and enables us to develop more powerful models while retaining attractive features of standard NMF such as monotonic convergence and easy implementation. We illustrate our approach on model order selection and image reconstruction. PMID:19536273
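
    Omitting the prior recovers classical KL-NMF fitted by multiplicative (EM-type) updates, as the abstract notes; a compact sketch of that special case:

        import numpy as np

        def kl_nmf(V, rank, iters=200, seed=0):
            """Lee-Seung multiplicative updates minimising KL(V || WH)."""
            rng = np.random.default_rng(seed)
            W = rng.random((V.shape[0], rank)) + 1e-6
            H = rng.random((rank, V.shape[1])) + 1e-6
            for _ in range(iters):
                W *= ((V / (W @ H)) @ H.T) / H.sum(axis=1)
                H *= (W.T @ (V / (W @ H))) / W.sum(axis=0)[:, None]
            return W, H

        # Example on a nonnegative toy matrix:
        W, H = kl_nmf(np.abs(np.random.default_rng(1).normal(size=(20, 30))), rank=4)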

  6. Bayesian model of human color constancy

    PubMed Central

    Brainard, David H.; Longère, Philippe; Delahunt, Peter B.; Freeman, William T.; Kraft, James M.; Xiao, Bei

    2008-01-01

    Vision is difficult because images are ambiguous about the structure of the world. For object color, the ambiguity arises because the same object reflects a different spectrum to the eye under different illuminations. Human vision typically does a good job of resolving this ambiguity—an ability known as color constancy. The past 20 years have seen an explosion of work on color constancy, with advances in both experimental methods and computational algorithms. Here, we connect these two lines of research by developing a quantitative model of human color constancy. The model includes an explicit link between psychophysical data and illuminant estimates obtained via a Bayesian algorithm. The model is fit to the data through a parameterization of the prior distribution of illuminant spectral properties. The fit to the data is good, and the derived prior provides a succinct description of human performance. PMID:17209734

  7. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    PubMed

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference while making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the practical consequences of how that dependence is introduced. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models in one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in substantial differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.

  8. Merging Digital Surface Models Implementing Bayesian Approaches

    NASA Astrophysics Data System (ADS)

    Sadeq, H.; Drummond, J.; Li, Z.

    2016-06-01

    In this research, DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian approach. The implemented data were sourced from very high resolution satellite imagery sensors (e.g., WorldView-1 and Pleiades). A Bayesian approach is preferable when the data obtained from the sensors are limited and additional measurements would be difficult or costly to obtain; the lack of data can then be mitigated by introducing a priori estimates. To infer the prior data, the roofs of the buildings are assumed to be smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimates, GNSS RTK measurements were collected in the field and used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West End of Glasgow, which contains different kinds of buildings, such as flat-roofed and hipped-roofed buildings. Both quantitative and qualitative methods were employed to validate the merged DSM. The validation results show that the model successfully improved the quality of the DSMs and some characteristics such as the roof surfaces, which consequently led to better representations. In addition, the developed model has been compared with the well-established maximum likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied to DSMs derived from satellite imagery, it can be applied to DSMs from any other source.
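
    Under Gaussian assumptions, per-cell Bayesian fusion of two DSMs with a prior height reduces to precision weighting; a sketch of that idea (not the authors' exact formulation, which derives the prior from local entropy over roof surfaces):

        import numpy as np

        def fuse_dsm_cell(z1, var1, z2, var2, z_prior, var_prior):
            """Posterior mean and variance of the surface height in one cell,
            combining two height measurements with a prior estimate; works
            elementwise on arrays as well as scalars."""
            precision = 1.0 / var1 + 1.0 / var2 + 1.0 / var_prior
            mean = (z1 / var1 + z2 / var2 + z_prior / var_prior) / precision
            return mean, 1.0 / precision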

  9. Spatial panel data models of aquaculture production in West Sumatra province with random-effects

    NASA Astrophysics Data System (ADS)

    Sartika, Wimi; Susetyo, Budi; Syafitri, Utami Dyah

    2017-03-01

    Spatial panel regression is a statistical model used to analyze the effect of several independent variables on a dependent variable using panel data while taking spatial effects into account. There are two approaches to modeling spatial panel data: the Fixed Effect Spatial Autoregressive model (SAR-FE) and the Random Effect Spatial Autoregressive model (SAR-RE). SAR-FE assumes that the intercept varies across spatial units, while SAR-RE places the unit-specific variation in the residual and retains only a common intercept. The purpose of this study is to model aquaculture production in West Sumatra using spatial panel regression. The model uses secondary data published by "Badan Pusat Statistik" on aquaculture production in West Sumatra. The test results showed that aquaculture production in West Sumatra over 2004-2012 was best modeled by the Spatial Autoregressive Random Effect approach. According to the SAR-RE model, the most influential factors on aquaculture production in West Sumatra province in 2004-2012 were the number of motor boats, the area used for fish seed, fish seed production, and the number of fishermen in public waters.

  10. A Bayesian model of psychosis symptom trajectory in Alzheimer's disease

    PubMed Central

    Seltman, Howard J.; Mitchell, Shaina; Sweet, Robert A.

    2015-01-01

    Objective: Psychosis, like other neuropsychiatric symptoms of dementia, has many features that make predictive modeling of its onset difficult. For example, psychosis onset is associated with both the absolute degree of cognitive impairment and the rate of cognitive decline. Moreover, psychotic symptoms, while more likely than not to persist over time within individuals, may remit and recur. To facilitate predictive modeling of psychosis for personalized clinical decision making, including evaluating the role of risk genes in its onset, we have developed a novel Bayesian model of the dual trajectories of cognition and psychosis symptoms. Methods: Cognition was modeled as a four-parameter logistic curve with random effects for all four parameters and possible covariates for the rate and time of fall. Psychosis was modeled as a continuous-time hidden Markov model with a latent never-psychotic class and states for pre-psychotic, actively psychotic and remitted psychosis. Covariates can affect the probability of being in the never-psychotic class. Covariates and the level of cognition can affect the transition rates for the hidden Markov model. Results: The model characteristics were confirmed using simulated data. Results from 434 AD patients show that a decline in cognition is associated with an increased rate of transition to the psychotic state. Conclusions: The model allows declining cognition as an input for psychosis prediction, while incorporating the full uncertainty of the interpolated cognition values. The techniques used can be used in future genetic studies of AD and are generalizable to the study of other neuropsychiatric symptoms in dementia. PMID:26216660
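
    The cognition component above is a four-parameter logistic curve; a sketch of that building block with illustrative parameter names (in the model, each parameter receives a subject-level random effect):

        import numpy as np

        def cognition_curve(t, lower, upper, t_mid, rate):
            """Cognition declining from `upper` to `lower`, centred at time
            t_mid with steepness `rate`."""
            return lower + (upper - lower) / (1.0 + np.exp(rate * (t - t_mid)))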

  12. Bayesian model of Snellen visual acuity.

    PubMed

    Nestares, Oscar; Navarro, Rafael; Antona, Beatriz

    2003-07-01

    A Bayesian model of Snellen visual acuity (VA) has been developed that, as far as we know, is the first one that includes the three main stages of VA: (1) optical degradations, (2) neural image representation and contrast thresholding, and (3) character recognition. The retinal image of a Snellen test chart is obtained from experimental wave-aberration data. Then a subband image decomposition with a set of visual channels tuned to different spatial frequencies and orientations is applied to the retinal image, as in standard computational models of early cortical image representation. A neural threshold is applied to the contrast responses to include the effect of the neural contrast sensitivity. The resulting image representation is the base of a Bayesian pattern-recognition method robust to the presence of optical aberrations. The model is applied to images containing sets of letter optotypes at different scales, and the number of correct answers is obtained at each scale; the final output is the decimal Snellen VA. The model has no free parameters to adjust. The main input data are the eye's optical aberrations, and standard values are used for all other parameters, including the Stiles-Crawford effect, visual channels, and neural contrast threshold, when no subject specific values are available. When aberrations are large, Snellen VA involving pattern recognition differs from grating acuity, which is based on a simpler detection (or orientation-discrimination) task and hence is basically unaffected by phase distortions introduced by the optical transfer function. A preliminary test of the model in one subject produced close agreement between actual measurements and predicted VA values. Two examples are also included: (1) application of the method to the prediction of the VA in refractive-surgery patients and (2) simulation of the VA attainable by correcting ocular aberrations.

  14. Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures.

    PubMed

    Orbanz, Peter; Roy, Daniel M

    2015-02-01

    The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays.

  15. A Bayesian Analysis of Finite Mixtures in the LISREL Model.

    ERIC Educational Resources Information Center

    Zhu, Hong-Tu; Lee, Sik-Yum

    2001-01-01

    Proposes a Bayesian framework for estimating finite mixtures of the LISREL model. The model augments the observed data of the manifest variables with the latent variables and allocation variables and uses the Gibbs sampler to obtain the Bayesian solution. Discusses other associated statistical inferences. (SLD)

  16. Modeling Error Distributions of Growth Curve Models through Bayesian Methods

    ERIC Educational Resources Information Center

    Zhang, Zhiyong

    2016-01-01

    Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is…

  17. NIMROD: a program for inference via a normal approximation of the posterior in models with random effects based on ordinary differential equations.

    PubMed

    Prague, Mélanie; Commenges, Daniel; Guedj, Jérémie; Drylewicz, Julia; Thiébaut, Rodolphe

    2013-08-01

    Models based on ordinary differential equations (ODE) are widespread tools for describing dynamical systems. In biomedical sciences, data from each subject can be sparse, making it difficult to estimate individual parameters precisely by standard non-linear regression, but information can often be gained from between-subject variability. This makes mixed-effects models a natural choice for estimating population parameters. Although the maximum likelihood approach is a valuable option, identifiability issues favour Bayesian approaches, which can incorporate prior knowledge in a flexible way. However, the combination of difficulties arising from the ODE system and from the presence of random effects raises a major numerical challenge. Computations can be simplified by making a normal approximation of the posterior to find the maximum of the posterior distribution (MAP). Here we present the NIMROD program (normal approximation inference in models with random effects based on ordinary differential equations), devoted to MAP estimation in ODE models. We describe the specific implemented features, such as convergence criteria and an approximation of leave-one-out cross-validation to assess the model's quality of fit. First, we evaluate the properties of this algorithm in pharmacokinetic models and compare it with the FOCE and MCMC algorithms in simulations. Then, we illustrate the use of NIMROD on Amprenavir pharmacokinetics data from the PUZZLE clinical trial in HIV-infected patients. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  18. A multivariate Bayesian model for embryonic growth.

    PubMed

    Willemsen, Sten P; Eilers, Paul H C; Steegers-Theunissen, Régine P M; Lesaffre, Emmanuel

    2015-04-15

    Most longitudinal growth curve models evaluate the evolution of each of the anthropometric measurements separately. When applied to a 'reference population', this exercise leads to univariate reference curves against which new individuals can be evaluated. However, growth should be evaluated in totality, that is, by evaluating all body characteristics jointly. Recently, Cole et al. suggested the Superimposition by Translation and Rotation (SITAR) model, which expresses individual growth curves by three subject-specific parameters indicating their deviation from a flexible overall growth curve. This model allows the characterization of normal growth in a flexible though compact manner. In this paper, we generalize the SITAR model in a Bayesian way to multiple dimensions. The multivariate SITAR model allows us to create multivariate reference regions, which is advantageous for prediction. The usefulness of the model is illustrated on longitudinal measurements of embryonic growth obtained in the first trimester of pregnancy, collected in the ongoing Rotterdam Predict study. Further, we demonstrate how the model can be used to find determinants of embryonic growth.

  19. Bayesian model selection for LISA pathfinder

    NASA Astrophysics Data System (ADS)

    Karnesis, Nikolaos; Nofrarias, Miquel; Sopuerta, Carlos F.; Gibert, Ferran; Armano, Michele; Audley, Heather; Congedo, Giuseppe; Diepholz, Ingo; Ferraioli, Luigi; Hewitson, Martin; Hueller, Mauro; Korsakova, Natalia; McNamara, Paul W.; Plagnol, Eric; Vitale, Stefano

    2014-03-01

    The main goal of the LISA Pathfinder (LPF) mission is to fully characterize the acceleration noise models and to test key technologies for future space-based gravitational-wave observatories similar to the eLISA concept. The data analysis team has developed complex three-dimensional models of the LISA Technology Package (LTP) experiment onboard the LPF. These models are used for simulations, but, more importantly, they will be used for parameter estimation purposes during flight operations. One of the tasks of the data analysis team is to identify the physical effects that contribute significantly to the properties of the instrument noise. A way of approaching this problem is to recover the essential parameters of a LTP model fitting the data. Thus, we want to define the simplest model that efficiently explains the observations. To do so, adopting a Bayesian framework, one has to estimate the so-called Bayes factor between two competing models. In our analysis, we use three main different methods to estimate it: the reversible jump Markov chain Monte Carlo method, the Schwarz criterion, and the Laplace approximation. They are applied to simulated LPF experiments in which the most probable LTP model that explains the observations is recovered. The same type of analysis presented in this paper is expected to be followed during flight operations. Moreover, the correlation of the output of the aforementioned methods with the design of the experiment is explored.
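
    Of the three evidence estimators named above, the Laplace approximation is the simplest to sketch: locate the posterior mode and use the local curvature as a Gaussian approximation. A generic version, assuming an unconstrained, smooth negative log-posterior (BFGS supplies an approximate inverse Hessian):

        import numpy as np
        from scipy import optimize

        def laplace_log_evidence(neg_log_post, theta0):
            """log Z ~ -neg_log_post(mode) + (d/2) log(2 pi)
                       + (1/2) log det(inverse Hessian at the mode)."""
            res = optimize.minimize(neg_log_post, theta0, method="BFGS")
            d = np.atleast_1d(res.x).size
            sign, logdet_inv = np.linalg.slogdet(res.hess_inv)
            return -res.fun + 0.5 * d * np.log(2.0 * np.pi) + 0.5 * logdet_inv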

  20. Note on an Identity Between Two Unbiased Variance Estimators for the Grand Mean in a Simple Random Effects Model.

    PubMed

    Levin, Bruce; Leu, Cheng-Shiun

    2013-01-01

    We demonstrate the algebraic equivalence of two unbiased variance estimators for the sample grand mean in a random sample of subjects from an infinite population where subjects provide repeated observations following a homoscedastic random effects model.

  1. Understanding Random Effects in Group-Based Trajectory Modeling: An Application of Moffitt's Developmental Taxonomy.

    PubMed

    Saunders, Jessica M

    2010-01-01

    The group-based trajectory modeling approach is a systematic way of categorizing subjects into different groups based on their developmental trajectories using formal and objective statistical criteria. With the recent advancement in methods and statistical software, modeling possibilities are almost limitless; however, parallel advances in theory development have not kept pace. This paper examines some of the modeling options that are becoming more widespread and how they impact both empirical and theoretical findings. The key issue that is explored is the impact of adding random effects to the latent growth factors and how this alters the meaning of a group. The paper argues that technical specification should be guided by theory, and Moffitt's developmental taxonomy is used as an illustration of how modeling decisions can be matched to theory.

  2. A Tutorial Introduction to Bayesian Models of Cognitive Development

    ERIC Educational Resources Information Center

    Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei

    2011-01-01

    We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…

  3. Implementing Relevance Feedback in the Bayesian Network Retrieval Model.

    ERIC Educational Resources Information Center

    de Campos, Luis M.; Fernandez-Luna, Juan M.; Huete, Juan F.

    2003-01-01

    Discussion of relevance feedback in information retrieval focuses on a proposal for the Bayesian Network Retrieval Model. Bases the proposal on the propagation of partial evidences in the Bayesian network, representing new information obtained from the user's relevance judgments to compute the posterior relevance probabilities of the documents…

  4. Bayesian Student Modeling and the Problem of Parameter Specification.

    ERIC Educational Resources Information Center

    Millan, Eva; Agosta, John Mark; Perez de la Cruz, Jose Luis

    2001-01-01

    Discusses intelligent tutoring systems and the application of Bayesian networks to student modeling. Considers reasons for not using Bayesian networks, including the computational complexity of the algorithms and the difficulty of knowledge acquisition, and proposes an approach to simplify knowledge acquisition that applies causal independence to…

  6. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  8. Maternity length of stay modelling by gamma mixture regression with random effects.

    PubMed

    Lee, Andy H; Wang, Kui; Yau, Kelvin K W; McLachlan, Geoffrey J; Ng, Shu Kay

    2007-08-01

    Maternity length of stay (LOS) is an important measure of hospital activity, but its empirical distribution is often positively skewed. A two-component gamma mixture regression model has been proposed to analyze the heterogeneous maternity LOS. The problem is that observations collected from the same hospital are often correlated, which can lead to spurious associations and misleading inferences. To account for the inherent correlation, random effects are incorporated within the linear predictors of the two-component gamma mixture regression model. An EM algorithm is developed for the residual maximum quasi-likelihood estimation of the regression coefficients and variance component parameters. The approach enables the correct identification and assessment of risk factors affecting the short-stay and long-stay patient subgroups. In addition, the predicted random effects can provide information on the inter-hospital variations after adjustment for patient characteristics and health provision factors. A simulation study shows that the estimators obtained via the EM algorithm perform well in all the settings considered. Application to a set of maternity LOS data for women having obstetrical delivery with multiple complicating diagnoses is illustrated. (c) 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Technology diffusion in hospitals: a log odds random effects regression model.

    PubMed

    Blank, Jos L T; Valdmanis, Vivian G

    2015-01-01

    This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model to hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the diffusion of technologies. We distinguish a number of determinants, such as service, physician, and environmental, financial and organizational characteristics of the 60 Dutch hospitals in our sample. On the basis of this data set on Dutch general hospitals over the period 1995-2002, we conclude that there is a relation between a number of determinants and the diffusion of innovations, underscoring conclusions from earlier research. Positive effects were found for hospital size, competition, and a hospital's commitment to innovation. It appears that if a policy is developed to further diffuse innovations, the external effects of demand and market competition need to be examined, which would de facto lead to an efficient use of technology. For the individual hospital, instituting an innovations office appears to be the most prudent course of action. © 2013 The Authors. International Journal of Health Planning and Management published by John Wiley & Sons, Ltd.

  10. Advances in Bayesian Modeling in Educational Research

    ERIC Educational Resources Information Center

    Levy, Roy

    2016-01-01

    In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…

  12. Bayesian analysis. II. Signal detection and model selection

    NASA Astrophysics Data System (ADS)

    Bretthorst, G. Larry

    In the preceding paper, Bayesian analysis was applied to the parameter estimation problem, given quadrature NMR data. Here Bayesian analysis is extended to the problem of selecting the model which is most probable in view of the data and all the prior information. In addition to the analytic calculation, two examples are given. The first example demonstrates how to use Bayesian probability theory to detect small signals in noise. The second example uses Bayesian probability theory to compute the probability of the number of decaying exponentials in simulated T1 data. The Bayesian answer to this question is essentially a microcosm of the scientific method and a quantitative statement of Ockham's razor: theorize about possible models, compare these to experiment, and select the simplest model that "best" fits the data.

  13. Analysis of household data on influenza epidemic with Bayesian hierarchical model.

    PubMed

    Hsu, C Y; Yen, A M F; Chen, L S; Chen, H H

    2015-03-01

    Data used for modelling the household transmission of infectious diseases, such as influenza, have an inherent multilevel structure and correlated observations, which make the widely used conventional infectious disease transmission models (including the Greenwood model and the Reed-Frost model) not directly applicable within the context of a household (due to crowded domestic conditions or the socioeconomic status of the household). Thus, at the household level, the effects resulting from individual-level factors, such as vaccination, may be confounded or modified in some way. We proposed a Bayesian hierarchical random-effects (random intercepts and random slopes) model, within the context of the generalised linear model, to capture heterogeneity and variation on the individual, generation, and household levels. It was applied to empirical surveillance data on the influenza epidemic in Taiwan. The parameters of interest were estimated by using the Markov chain Monte Carlo method in conjunction with Bayesian directed acyclic graphical models. Comparisons between models were made using the deviance information criterion. Based on the results of the random-slope Bayesian hierarchical model within the context of the Reed-Frost transmission model, the regression coefficient for the protective effect of vaccination varied statistically significantly from household to household. This heterogeneity was robust to the use of different prior distributions (including non-informative, sceptical, and enthusiastic ones). By integrating out the uncertainty in the posterior distribution of the parameters, the predictive distribution was computed to forecast the number of influenza cases, allowing for random household effects.
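
    For concreteness, the Reed-Frost chain that the hierarchical model wraps can be simulated in a few lines; household-to-household variation in the per-contact infection probability (the heterogeneity reported above) is sketched here by drawing that probability from a Beta distribution, with all numbers illustrative:

        import numpy as np

        def reed_frost(n_susceptible, n_infected, p, rng):
            """Final outbreak size: each susceptible escapes all current
            infectives with probability (1 - p)**I per generation."""
            s, i, total = n_susceptible, n_infected, n_infected
            while i > 0 and s > 0:
                new_i = rng.binomial(s, 1.0 - (1.0 - p) ** i)
                s, i, total = s - new_i, new_i, total + new_i
            return total

        rng = np.random.default_rng(3)
        sizes = [reed_frost(4, 1, rng.beta(2, 8), rng) for _ in range(1000)]
        print("mean final size in households of five:", np.mean(sizes))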

  14. A mixture model with random-effects components for clustering correlated gene-expression profiles.

    PubMed

    Ng, S K; McLachlan, G J; Wang, K; Ben-Tovim Jones, L; Ng, S-W

    2006-07-15

    The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are

  15. Bayesian analysis of the backreaction models

    SciTech Connect

    Kurek, Aleksandra; Bolejko, Krzysztof; Szydlowski, Marek

    2010-03-15

    We present a Bayesian analysis of four different types of backreaction models, which are based on the Buchert equations. In this approach, one considers a solution to the Einstein equations for a general matter distribution and then an average of various observable quantities is taken. Such an approach became of considerable interest when it was shown that it could lead to agreement with observations without resorting to dark energy. In this paper we compare the ΛCDM model and the backreaction models with type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, and find that the former is favored. However, the tested models were based on some particular assumptions about the relation between the average spatial curvature and the backreaction, as well as the relation between the curvature and the curvature index. In this paper we modified the latter assumption, leaving the former unchanged. We find that, by varying the relation between the curvature and the curvature index, we can obtain a better fit. Therefore, some further work is still needed; in particular, the relation between the backreaction and the curvature should be revisited in order to fully determine the feasibility of the backreaction models to mimic dark energy.

  16. Bayesian inverse modeling for quantitative precipitation estimation

    NASA Astrophysics Data System (ADS)

    Schinagl, Katharina; Rieger, Christian; Simmer, Clemens; Xie, Xinxin; Friederichs, Petra

    2017-04-01

    Polarimetric radars provide us with a richness of precipitation-related measurements. In particular, the high spatial and temporal resolution makes the data an important source of information, e.g., for hydrological modeling. However, uncertainties in the precipitation estimates are large. Their systematic assessment and quantification is thus of great importance. Polarimetric radar observables like horizontal and vertical reflectivity Z_H and Z_V, cross-correlation coefficient ρ_HV and specific differential phase K_DP are related to the drop size distribution (DSD) in the scan. This relation is described by forward operators which are integrals over the DSD and scattering terms. Given the polarimetric observables, the respective forward operators and assumptions about the measurement errors, we investigate the uncertainty in the DSD parameter estimation and, based on it, the uncertainty of precipitation estimates. We assume that the DSD follows a Gamma model, N(D) = N_0 D^μ exp(-ΛD), where all three parameters are variable. This model allows us to account for the high variability of the DSD. We employ the framework of Bayesian inverse methods to derive the posterior distribution of the DSD parameters. The inverse problem is investigated in a simulated environment (SE) using the COSMO-DE numerical weather prediction model. The advantage of the SE is that - unlike in a real-world application - we know the parameters we want to estimate. Thus, building the inverse model into the SE gives us the opportunity to verify our results against the COSMO-simulated DSD values.
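
    For reference, the gamma drop size distribution named in the abstract, together with the generic shape of a polarimetric forward operator (an integral of a scattering term over the DSD); the backscattering cross-section σ_H(D) and the upper limit D_max below are schematic placeholders rather than details taken from the paper:

```latex
N(D) = N_0 \, D^{\mu} \exp(-\Lambda D),
\qquad
Z_H \;\propto\; \int_{0}^{D_{\max}} \sigma_H(D)\, N(D)\, \mathrm{d}D .
```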

  17. Bayesian modeling of differential gene expression.

    PubMed

    Lewin, Alex; Richardson, Sylvia; Marshall, Clare; Glazier, Anne; Aitman, Tim

    2006-03-01

    We present a Bayesian hierarchical model for detecting differentially expressed genes that includes simultaneous estimation of array effects, and show how to use the output for choosing lists of genes for further investigation. We give empirical evidence that expression-level dependent array effects are needed, and explore different nonlinear functions as part of our model-based approach to normalization. The model includes gene-specific variances but imposes some necessary shrinkage through a hierarchical structure. Model criticism via posterior predictive checks is discussed. Modeling the array effects (normalization) simultaneously with differential expression gives fewer false positive results. To choose a list of genes, we propose to combine various criteria (for instance, fold change and overall expression) into a single indicator variable for each gene. The posterior distribution of these variables is used to pick the list of genes, thereby taking into account uncertainty in parameter estimates. In an application to mouse knockout data, Gene Ontology annotations over- and underrepresented among the genes on the chosen list are consistent with biological expectations.

  18. Hierarchical Bayesian models of subtask learning.

    PubMed

    Anglim, Jeromy; Wynton, Sarah K A

    2015-07-01

    The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed, focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking task, which logged participant actions, enabling measurement of strategy use and subtask performance. Model comparison was performed using the deviance information criterion (DIC), posterior predictive checks, plots of model fits, and model recovery simulations. Results showed that although learning tended to be monotonically decreasing and decelerating, and approaching an asymptote for all subtasks, there was substantial inconsistency in learning curves at both the group and individual levels. This inconsistency was most apparent when constraining both the rate and the ratio of learning to asymptote to be equal across subtasks, thereby giving learning curves only 1 parameter for scaling. The inclusion of 6 strategy covariates provided improved prediction of subtask performance, capturing different subtask learning processes and subtask trade-offs. In addition, strategy use partially explained the inconsistency in subtask learning. Overall, the model provided a more nuanced representation of how complex tasks can be decomposed in terms of simpler learning mechanisms.
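
    To make the two functional families concrete, here is a small sketch fitting the standard three-parameter exponential and power learning functions to a simulated learning curve with scipy; the functional forms are the textbook ones, and the crude AIC stands in for the hierarchical DIC comparison actually used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

# Standard 3-parameter learning functions: asymptote + amplitude * decay.
def exponential(t, a, b, r):   # a = asymptote, b = amount learnable, r = rate
    return a + b * np.exp(-r * t)

def power(t, a, b, r):
    return a + b * t ** (-r)

t = np.arange(1, 51)
rng = np.random.default_rng(0)
y = 4.0 + 6.0 * np.exp(-0.12 * t) + rng.normal(0, 0.3, t.size)  # simulated times

for f in (exponential, power):
    theta, _ = curve_fit(f, t, y, p0=(3.0, 5.0, 0.1), maxfev=10000)
    rss = np.sum((y - f(t, *theta)) ** 2)
    aic = t.size * np.log(rss / t.size) + 2 * 3  # crude Gaussian-error AIC
    print(f.__name__, np.round(theta, 3), round(aic, 2))
```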

  19. BAYESIAN ANALYSIS OF REPEATED EVENTS USING EVENT-DEPENDENT FRAILTY MODELS: AN APPLICATION TO BEHAVIORAL OBSERVATION DATA

    PubMed Central

    Snyder, James

    2009-01-01

    In social interaction studies, one commonly encounters repeated displays of behaviors along with their duration data. Statistical methods for the analysis of such data use either parametric (e.g., Weibull) or semiparametric (e.g., Cox) proportional hazard models, modified to include random effects (frailty) which account for the correlation of repeated occurrences of behaviors within a unit (dyad). However, dyad-specific random effects by themselves are not able to account for the ordering of event occurrences within dyads. The occurrence of an event (behavior) can make further occurrences of the same behavior more or less likely during an interaction. This paper develops event-dependent random effects models for analyzing repeated behavior data using a Bayesian approach. The models are illustrated by a dataset relating to emotion regulation in families with children who have behavioral or emotional problems. PMID:20161593

  20. Hierarchical Bayesian modeling of random and residual variance-covariance matrices in bivariate mixed effects models.

    PubMed

    Bello, Nora M; Steibel, Juan P; Tempelman, Robert J

    2010-06-01

    Bivariate mixed effects models are often used to jointly infer upon covariance matrices for both random effects (u) and residuals (e) between two different phenotypes in order to investigate the architecture of their relationship. However, these (co)variances themselves may additionally depend upon covariates as well as additional sets of exchangeable random effects that facilitate borrowing of strength across a large number of clusters. We propose a hierarchical Bayesian extension of the classical bivariate mixed effects model by embedding additional levels of mixed effects modeling of reparameterizations of u-level and e-level (co)variances between two traits. These parameters are based upon a recently popularized square-root-free Cholesky decomposition and are readily interpretable, each conveniently facilitating a generalized linear model characterization. Using Markov Chain Monte Carlo methods, we validate our model based on a simulation study and apply it to a joint analysis of milk yield and calving interval phenotypes in Michigan dairy cows. This analysis indicates that the e-level relationship between the two traits is highly heterogeneous across herds and depends upon systematic herd management factors.
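
    The square-root-free Cholesky reparameterization mentioned in the abstract can be demonstrated in a few lines: for a 2 x 2 trait covariance it separates a regression-like off-diagonal coefficient from positive innovation variances, each convenient for GLM-style modelling of heterogeneity. The covariance matrix below is hypothetical.

```python
import numpy as np
from scipy.linalg import ldl

# Hypothetical 2x2 covariance between two traits (e.g., a u-level covariance).
Sigma = np.array([[4.0, 1.2],
                  [1.2, 2.5]])

# Square-root-free Cholesky: Sigma = L D L', with unit lower-triangular L.
L, D, perm = ldl(Sigma, lower=True)
print("regression-like coefficient L[1,0]:", L[1, 0])   # = Sigma12 / Sigma11
print("innovation variances diag(D):", np.diag(D))

# L[1,0] is unconstrained and diag(D) > 0, so each block can be given its
# own (generalized) linear predictor in a hierarchical model.
assert np.allclose(L @ D @ L.T, Sigma)
```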

  1. Stochastic model updating utilizing Bayesian approach and Gaussian process model

    NASA Astrophysics Data System (ADS)

    Wan, Hua-Ping; Ren, Wei-Xin

    2016-03-01

    Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from response variability. SMU for parameter uncertainty quantification refers to the problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Solving the inverse problem by optimization usually brings about issues of gradient computation, ill-conditioning, and non-uniqueness. Moreover, the uncertainty present in the response makes the inverse problem more complicated. In this study, a Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of the Bayesian approach for the IUQ problem is that it solves the problem in a straightforward manner, which enables it to avoid the aforementioned issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), the Bayesian approach is still computationally expensive, since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee convergence. Herein we reduce the computational cost in two ways. On the one hand, the fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. On the other hand, the advanced MCMC method using the delayed rejection adaptive Metropolis (DRAM) algorithm, which combines a local adaptive strategy with a global adaptive strategy, is employed for Bayesian inference. In addition, we propose the use of the powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from the calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.

  2. Scale Mixture Models with Applications to Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Qin, Zhaohui S.; Damien, Paul; Walker, Stephen

    2003-11-01

    Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixtures of uniform distributions.
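
    A quick Monte Carlo check of the representation these models build on: a normal distribution arises as a scale mixture of uniforms when the mixing variable is Gamma with shape 3/2 and rate 1/2 (a standard result in this literature); the parameter values below are otherwise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n = 0.0, 1.5, 1_000_000

# If U ~ Gamma(shape=3/2, rate=1/2) and X | U ~ Uniform(mu - sigma*sqrt(U),
# mu + sigma*sqrt(U)), then marginally X ~ N(mu, sigma^2).
u = rng.gamma(shape=1.5, scale=2.0, size=n)          # scale = 1 / rate = 2
x = rng.uniform(mu - sigma * np.sqrt(u), mu + sigma * np.sqrt(u))

print("mean ~", x.mean())                                    # ~ 0
print("var  ~", x.var())                                     # ~ sigma^2 = 2.25
print("kurt ~", ((x - x.mean()) ** 4).mean() / x.var() ** 2) # ~ 3, as for a normal
```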

  3. An introduction and integration of cross-classified, multiple membership, and dynamic group random-effects models.

    PubMed

    Cafri, Guy; Hedeker, Donald; Aarons, Gregory A

    2015-12-01

    In longitudinal studies, time-varying group membership and group effects are important issues that need to be addressed. In this article we describe use of cross-classified and multiple membership random-effects models to address time-varying group membership, and dynamic group random-effects models to address time-varying group effects. We propose new models that integrate features of existing models, evaluate these models through simulation, provide guidance on how to fit these models, and apply the models in 2 real data examples. The discussion focuses on challenges in the application of these models.

  4. An Introduction and Integration of Cross-Classified, Multiple Membership, and Dynamic Group Random-Effects Models

    PubMed Central

    Cafri, Guy; Hedeker, Donald; Aarons, Gregory A.

    2016-01-01

    In longitudinal studies, time-varying group membership and group effects are important issues that need to be addressed. In this article we describe use of cross-classified and multiple membership random-effects models to address time-varying group membership, and dynamic group random-effects models to address time-varying group effects. We propose new models that integrate features of existing models, evaluate these models through simulation, provide guidance on how to fit these models, and apply the models in two real data examples. The discussion focuses on challenges in the application of these models. PMID:26237504

  5. SNP_NLMM: A SAS Macro to Implement a Flexible Random Effects Density for Generalized Linear and Nonlinear Mixed Models

    PubMed Central

    Vock, David M.; Davidian, Marie; Tsiatis, Anastasios A.

    2014-01-01

    Generalized linear and nonlinear mixed models (GLMMs and NLMMs) are commonly used to represent non-Gaussian or nonlinear longitudinal or clustered data. A common assumption is that the random effects are Gaussian. However, this assumption may be unrealistic in some applications, and misspecification of the random effects density may lead to maximum likelihood parameter estimators that are inconsistent, biased, and inefficient. Because testing if the random effects are Gaussian is difficult, previous research has recommended using a flexible random effects density. However, computational limitations have precluded widespread use of flexible random effects densities for GLMMs and NLMMs. We develop a SAS macro, SNP_NLMM, that overcomes the computational challenges to fit GLMMs and NLMMs where the random effects are assumed to follow a smooth density that can be represented by the seminonparametric formulation proposed by Gallant and Nychka (1987). The macro is flexible enough to allow for any density of the response conditional on the random effects and any nonlinear mean trajectory. We demonstrate the SNP_NLMM macro on a GLMM of the disease progression of toenail infection and on an NLMM of intravenous drug concentration over time. PMID:24688453

  6. A guide to Bayesian model selection for ecologists

    USGS Publications Warehouse

    Hooten, Mevin B.; Hobbs, N.T.

    2015-01-01

    The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.

  7. Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method

    NASA Astrophysics Data System (ADS)

    Tsai, F. T. C.; Elshall, A. S.

    2014-12-01

    Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particular understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi:10.1016/j.jhydrol.2014.05.027.
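
    The segregation of within-model and between-model variance that the BMA tree generalizes is the law of total variance applied over the model index, as this toy calculation with hypothetical posterior model probabilities shows.

```python
import numpy as np

# Hypothetical posterior model probabilities and per-model predictive moments.
p    = np.array([0.5, 0.3, 0.2])    # P(M_k | data)
mean = np.array([10.0, 12.0, 9.0])  # E[y | M_k, data]
var  = np.array([1.0, 1.5, 0.8])    # Var[y | M_k, data]

bma_mean    = np.sum(p * mean)
within_var  = np.sum(p * var)                      # average within-model variance
between_var = np.sum(p * (mean - bma_mean) ** 2)   # spread of the model means
total_var   = within_var + between_var             # law of total variance

print(bma_mean, within_var, between_var, total_var)
```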

  8. Bayesian Case-deletion Model Complexity and Information Criterion

    PubMed Central

    Zhu, Hongtu; Ibrahim, Joseph G.; Chen, Qingxia

    2015-01-01

    We establish a connection between Bayesian case influence measures for assessing the influence of individual observations and Bayesian predictive methods for evaluating the predictive performance of a model and comparing different models fitted to the same dataset. Based on such a connection, we formally propose a new set of Bayesian case-deletion model complexity (BCMC) measures for quantifying the effective number of parameters in a given statistical model. Its properties in linear models are explored. Adding some functions of BCMC to a conditional deviance function leads to a Bayesian case-deletion information criterion (BCIC) for comparing models. We systematically investigate some properties of BCIC and its connection with other information criteria, such as the Deviance Information Criterion (DIC). We illustrate the proposed methodology on linear mixed models with simulations and a real data example. PMID:26180578

  9. Bayesian Case-deletion Model Complexity and Information Criterion.

    PubMed

    Zhu, Hongtu; Ibrahim, Joseph G; Chen, Qingxia

    2014-10-01

    We establish a connection between Bayesian case influence measures for assessing the influence of individual observations and Bayesian predictive methods for evaluating the predictive performance of a model and comparing different models fitted to the same dataset. Based on such a connection, we formally propose a new set of Bayesian case-deletion model complexity (BCMC) measures for quantifying the effective number of parameters in a given statistical model. Its properties in linear models are explored. Adding some functions of BCMC to a conditional deviance function leads to a Bayesian case-deletion information criterion (BCIC) for comparing models. We systematically investigate some properties of BCIC and its connection with other information criteria, such as the Deviance Information Criterion (DIC). We illustrate the proposed methodology on linear mixed models with simulations and a real data example.

  10. Bayesian information criterion for censored survival models.

    PubMed

    Volinsky, C T; Raftery, A E

    2000-03-01

    We investigate the Bayesian Information Criterion (BIC) for variable selection in models for censored survival data. Kass and Wasserman (1995, Journal of the American Statistical Association 90, 928-934) showed that BIC provides a close approximation to the Bayes factor when a unit-information prior on the parameter space is used. We propose a revision of the penalty term in BIC so that it is defined in terms of the number of uncensored events instead of the number of observations. For a simple censored data model, this revision results in a better approximation to the exact Bayes factor based on a conjugate unit-information prior. In the Cox proportional hazards regression model, we propose defining BIC in terms of the maximized partial likelihood. Using the number of deaths rather than the number of individuals in the BIC penalty term corresponds to a more realistic prior on the parameter space and is shown to improve predictive performance for assessing stroke risk in the Cardiovascular Health Study.
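
    The proposed revision is easy to state in code: swap the number of uncensored events for the sample size in the BIC penalty. The function below follows the definition in the abstract (with the log-likelihood being the maximized partial likelihood in the Cox case); the numbers plugged in are hypothetical.

```python
import numpy as np

def bic_censored(loglik, n_params, n_events):
    # BIC for censored survival models: penalize by log(number of
    # uncensored events) rather than log(sample size).
    return -2.0 * loglik + n_params * np.log(n_events)

# E.g., a Cox model with maximized partial log-likelihood -480.2,
# 5 covariates, 1000 subjects of whom 143 had events (made-up numbers).
print(bic_censored(-480.2, 5, 143))     # event-based penalty
print(-2 * -480.2 + 5 * np.log(1000))   # conventional n-based BIC, for contrast
```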

  11. A shared random effects model for censored medical costs and mortality.

    PubMed

    Liu, Lei; Wolfe, Robert A; Kalbfleisch, John D

    2007-01-15

    In this paper, we propose a model for medical costs recorded at regular time intervals, e.g. every month, as repeated measures in the presence of a terminating event, such as death. Prior models have related monthly medical costs to time since entry, with extra costs at the final observations at the time of death. Our joint model for monthly medical costs and survival time incorporates two important new features. First, medical cost and survival may be correlated because more 'frail' patients tend to accumulate medical costs faster and die earlier. A joint random effects model is proposed to account for the correlation between medical costs and survival by a shared random effect. Second, monthly medical costs usually increase during the time period prior to death because of the intensive care for dying patients. We present a method for estimating the pattern of cost prior to death, which is applicable if the pattern can be characterized as an additive effect that is limited to a fixed time interval, say b units of time before death. This 'turn back time' method for censored observations censors cost data b units of time before the actual censoring time, while keeping the actual censoring time for the survival data. Time-dependent covariates can be included. Maximum likelihood estimation and inference are carried out through a Monte Carlo EM algorithm with a Metropolis-Hastings sampler in the E-step. An analysis of monthly outpatient EPO medical cost data for dialysis patients is presented to illustrate the proposed methods.

  12. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression.

    PubMed

    Jackson, Dan; White, Ian R; Riley, Richard D

    2013-03-01

    Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example.
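
    For orientation, here is the univariate DerSimonian and Laird moment estimator that the multivariate method reduces to in one dimension; the study estimates and variances below are made up.

```python
import numpy as np

def dersimonian_laird(y, v):
    # Univariate DerSimonian-Laird moment estimator of the between-study
    # variance tau^2, given study effects y and within-study variances v.
    w = 1.0 / v
    ybar = np.sum(w * y) / np.sum(w)     # fixed-effect pooled mean
    Q = np.sum(w * (y - ybar) ** 2)      # Cochran's Q
    k = len(y)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (Q - (k - 1)) / denom)

y = np.array([0.12, 0.30, 0.25, 0.05, 0.41])   # hypothetical study estimates
v = np.array([0.01, 0.02, 0.015, 0.01, 0.03])  # their squared standard errors
tau2 = dersimonian_laird(y, v)
w_re = 1.0 / (v + tau2)                        # random-effects weights
print(tau2, np.sum(w_re * y) / np.sum(w_re))   # tau^2 and RE pooled estimate
```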

  13. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2013-01-01

    Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213

  14. Bayesian Modeling for Genetic Anticipation in Presence of Mutational Heterogeneity: A Case-Study in Lynch Syndrome

    PubMed Central

    Boonstra, Philip S.; Mukherjee, Bhramar; Taylor, Jeremy M. G.; Nilbert, Mef; Moreno, Victor M.; Gruber, Stephen B.

    2011-01-01

    Genetic anticipation, described by earlier age of onset (AOO) and more aggressive symptoms in successive generations, is a phenomenon noted in certain hereditary diseases. Its extent may vary between families and/or between mutation sub-types known to be associated with the disease phenotype. In this paper, we posit a Bayesian approach to infer genetic anticipation under flexible random effects models for censored data that capture the effect of successive generations on AOO. Primary interest lies in the random effects. Misspecifying the distribution of random effects may result in incorrect inferential conclusions. We compare the fit of four candidate random effects distributions via Bayesian model fit diagnostics. A related statistical issue here is isolating the confounding effect of changes in secular trends, screening and medical practices that may affect time to disease detection across birth cohorts. Using historic cancer registry data, we borrow from relative survival analysis methods to adjust for changes in age-specific incidence across birth cohorts. Our motivating case-study comes from a Danish cancer register of 124 families with mutations in mismatch repair genes known to cause hereditary non-polyposis colorectal cancer, also called Lynch syndrome. We find evidence for a decrease in AOO between generations in this study. Our model predicts family level anticipation effects which are potentially useful in genetic counseling clinics for high risk families. PMID:21627626

  15. Fitting parametric random effects models in very large data sets with application to VHA national data.

    PubMed

    Gebregziabher, Mulugeta; Egede, Leonard; Gilbert, Gregory E; Hunt, Kelly; Nietert, Paul J; Mauldin, Patrick

    2012-10-24

    With the current focus on personalized medicine, patient/subject level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient level inference. However, for very large data sets that are characterized by large sample size, it can be difficult to fit REM using commonly available statistical software such as SAS, since they require inordinate amounts of computer time and memory allocations beyond what is available, preventing model convergence. For example, in a retrospective cohort study of over 800,000 Veterans with type 2 diabetes with longitudinal data over 5 years, fitting REM via generalized linear mixed modeling using currently available standard procedures in SAS (e.g. PROC GLIMMIX) was very difficult, and the same problems exist in Stata's gllamm and R's lme packages. Thus, this study proposes and assesses the performance of a meta regression approach and makes comparison with methods based on sampling of the full data. We use both simulated and real data from a national cohort of Veterans with type 2 diabetes (n=890,394) which was created by linking multiple patient and administrative files, resulting in a cohort with longitudinal data collected over 5 years. The outcome of interest was mean annual HbA1c measured over a 5-year period. Using this outcome, we compared parameter estimates from the proposed random effects meta regression (REMR) with estimates based on simple random sampling and VISN (Veterans Integrated Service Networks) based stratified sampling of the full data. Our results indicate that REMR provides parameter estimates that are less likely to be biased, with tighter confidence intervals, when the VISN level estimates are homogenous. When the interest is to fit REM in repeated measures data with very large sample size, REMR can be used as a good alternative. It leads to reasonable inference for both Gaussian and non-Gaussian responses if parameter estimates are

  16. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    SciTech Connect

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.

  17. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
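
    The process model at the heart of this line of work is the ecological diffusion PDE, in which the motility coefficient μ(s) sits inside the Laplacian; the form below is the standard one from this literature, with the hierarchical data and parameter models wrapped around it omitted:

```latex
\frac{\partial u(\mathbf{s},t)}{\partial t}
  = \left(\frac{\partial^{2}}{\partial s_{1}^{2}}
        + \frac{\partial^{2}}{\partial s_{2}^{2}}\right)
    \bigl[\mu(\mathbf{s})\, u(\mathbf{s},t)\bigr],
```

    where u(s, t) is the prevalence intensity and μ(s) is linked to spatial covariates on an appropriate scale.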

  18. Gene analysis for longitudinal family data using random-effects models.

    PubMed

    Houwing-Duistermaat, Jeanine J; Helmer, Quinta; Balliu, Bruna; van den Akker, Erik; Tsonaka, Roula; Uh, Hae-Won

    2014-01-01

    We have extended our recently developed 2-step approach for gene-based analysis to the family design and to the analysis of rare variants. The goal of this approach is to study the joint effect of multiple single-nucleotide polymorphisms that belong to a gene. First, the information in a gene is summarized by 2 variables, namely the empirical Bayes estimate capturing common variation and the number of rare variants. By using random effects for the common variants, our approach acknowledges the within-gene correlations. In the second step, the 2 summaries were included as covariates in linear mixed models. To test the null hypothesis of no association, a multivariate Wald test was applied. We analyzed the simulated data sets to assess the performance of the method. Then we applied the method to the real data set and identified a significant association between FRMD4B and diastolic blood pressure (p-value = 8.3 × 10^(-12)).

  19. Calibrating Bayesian Network Representations of Social-Behavioral Models

    SciTech Connect

    Whitney, Paul D.; Walsh, Stephen J.

    2010-04-08

    While human behavior has long been studied, recent and ongoing advances in computational modeling present opportunities for recasting research outcomes in human behavior. In this paper we describe how Bayesian networks can represent outcomes of human behavior research. We demonstrate a Bayesian network that represents political radicalization research, and show a corresponding visual representation of aspects of this research outcome. Since Bayesian networks can be quantitatively compared with external observations, the representation can also be used for empirical assessments of the research which the network summarizes. For a political radicalization model based on published research, we show this empirical comparison with data taken from the Minorities at Risk Organizational Behaviors database.

  20. Predicting expressway crash frequency using a random effect negative binomial model: A case study in China.

    PubMed

    Ma, Zhuanglin; Zhang, Honglu; Chien, Steven I-Jy; Wang, Jin; Dong, Chunjiao

    2017-01-01

    To investigate the relationship between crash frequency and potential influencing factors, accident data for a 50-km-long expressway section in China, comprising 567 crash records (2006-2008), were collected and analyzed. Both the fixed-length and the homogeneous-longitudinal-grade methods were applied to divide the study expressway section into segments. A negative binomial (NB) model and a random effect negative binomial (RENB) model were developed to predict crash frequency. The parameters of both models were determined using the maximum likelihood (ML) method, and a mixed stepwise procedure was applied to examine the significance of the explanatory variables. Three explanatory variables, namely longitudinal grade, road width, and the ratio of longitudinal grade to curve radius (RGR), were found to significantly affect crash frequency. The marginal effects of the significant explanatory variables on crash frequency were analyzed. Model performance was assessed by the relative prediction error and the cumulative standardized residual. The results show that the RENB model outperforms the NB model. It was also found that model performance with the fixed-length segment method is superior to that with the homogeneous-longitudinal-grade segment method.
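
    As a baseline for the RENB idea, here is a hedged sketch of a fixed-effects NB regression on simulated data with covariates mirroring those found significant (grade, width, RGR). The simulated coefficients and the statsmodels dispersion value `alpha` are arbitrary, and the segment-level random effect that distinguishes the RENB model is omitted for brevity.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
grade = rng.uniform(0.0, 4.0, n)     # longitudinal grade (%), assumed range
width = rng.uniform(10.0, 15.0, n)   # road width (m), assumed range
rgr   = rng.uniform(0.0, 0.01, n)    # grade / curve-radius ratio, assumed range
X = sm.add_constant(np.column_stack([grade, width, rgr]))

# Simulate overdispersed counts consistent with an NB model (log link).
mu = np.exp(-1.0 + 0.3 * grade - 0.05 * width + 80.0 * rgr)
y = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))

nb = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(nb.params)  # under the log link, the marginal effect of x_j is beta_j * mu
```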

  1. Multivariate Bayesian Models of Extreme Rainfall

    NASA Astrophysics Data System (ADS)

    Rahill-Marier, B.; Devineni, N.; Lall, U.; Farnham, D.

    2013-12-01

    Accounting for spatial heterogeneity in extreme rainfall has important ramifications in hydrological design and climate models alike. Traditional methods, including areal reduction factors and kriging, are sensitive to catchment shape assumptions and return periods, and do not explicitly model spatial dependence between between data points. More recent spatially dense rainfall simulators depend on newer data sources such as radar and may struggle to reproduce extremes because of physical assumptions in the model and short historical records. Rain gauges offer the longest historical record, key when considering rainfall extremes and changes over time, and particularly relevant in today's environment of designing for climate change. In this paper we propose a probabilistic approach of accounting for spatial dependence using the lengthy but spatially disparate hourly rainfall network in the greater New York City area. We build a hierarchical Bayesian model allowing extremes at one station to co-vary with concurrent rainfall fields occurring at other stations. Subsequently we pool across the extreme rainfall fields of all stations, and demonstrate that the expected catchment-wide events are significantly lower when considering spatial fields instead of maxima-only fields. We additionally demonstrate the importance of using concurrent spatial fields, rather than annual maxima, in producing covariance matrices that describe true storm dynamics. This approach is also unique in that it considers short duration storms - from one hour to twenty-four hours - rather than the daily values typically derived from rainfall gauges. The same methodology can be extended to include the radar fields available in the past decade. The hierarchical multilevel approach lends itself easily to integration of long-record parameters and short-record parameters at a station or regional level. In addition climate covariates can be introduced to support the relationship of spatial covariance with

  2. Bayesian approach to decompression sickness model parameter estimation.

    PubMed

    Howle, L E; Weber, P W; Nichols, J M

    2017-03-01

    We examine both maximum likelihood and Bayesian approaches for estimating probabilistic decompression sickness model parameters. Maximum likelihood estimation treats parameters as fixed values and determines the best estimate through repeated trials, whereas the Bayesian approach treats parameters as random variables and determines the parameter probability distributions. We would ultimately like to know the probability that a parameter lies in a certain range rather than simply make statements about the repeatability of our estimator. Although both represent powerful methods of inference, for models with complex or multi-peaked likelihoods, maximum likelihood parameter estimates can prove more difficult to interpret than the estimates of the parameter distributions provided by the Bayesian approach. For models of decompression sickness, we show that while these two estimation methods are complementary, the credible intervals generated by the Bayesian approach are more naturally suited to quantifying uncertainty in the model parameters.

  3. BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)

    EPA Science Inventory

    We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...

  4. BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)

    EPA Science Inventory

    We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...

  5. Refined-scale panel data crash rate analysis using random-effects tobit model.

    PubMed

    Chen, Feng; Ma, XiaoXiang; Chen, Suren

    2014-12-01

    Random-effects tobit models are developed for predicting hourly crash rates with a refined-scale panel data structure in both the temporal and spatial domains. The proposed models address the left-censoring of crash rate data while simultaneously accounting for unobserved heterogeneity across groups and serial correlation within groups. The use of panel data at refined temporal and spatial scales (hourly records and roadway segments averaging 1 mile) shows strong potential for capturing the time-varying and spatially varying contributing variables that are usually ignored in traditional aggregated traffic accident modeling. One year of accident data and detailed traffic, environment, road geometry and surface condition data from a segment of I-25 in Colorado are adopted to demonstrate the proposed methodology. To better understand the significantly different characteristics of crashes, two separate models, one for daytime and another for nighttime, have been developed. The results show major differences in the factors contributing to crash rates between the daytime and nighttime models, implying a considerable need to investigate daytime and nighttime crashes separately using refined-scale data. After the models are developed, a comprehensive review of the various contributing factors is made, followed by discussions of some interesting findings.
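
    The core of a tobit model is a normal likelihood with left-censoring. A pooled version is sketched below under assumed data; a random-effects version would add a group intercept and integrate it out, e.g. by quadrature or MCMC.

```python
import numpy as np
from scipy.stats import norm

def tobit_loglik(beta, sigma, X, y, left=0.0):
    # Left-censored-at-`left` tobit log-likelihood (pooled version).
    xb = X @ beta
    cens = y <= left
    ll = np.sum(norm.logcdf((left - xb[cens]) / sigma))             # censored part
    ll += np.sum(norm.logpdf((y[~cens] - xb[~cens]) / sigma) - np.log(sigma))
    return ll

rng = np.random.default_rng(7)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
y_star = X @ np.array([0.2, 0.8]) + rng.normal(0, 1.0, 500)  # latent crash rate
y = np.maximum(y_star, 0.0)                                  # observed, censored at 0
print(tobit_loglik(np.array([0.2, 0.8]), 1.0, X, y))
```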

  6. Trade-offs between accuracy and interpretability in von Bertalanffy random-effects models of growth.

    PubMed

    Vincenzi, Simone; Crivelli, Alain J; Munch, Stephan; Skaug, Hans J; Mangel, Marc

    2016-07-01

    Better understanding of variation in growth will always be an important problem in ecology. Individual variation in growth can arise from a variety of processes; for example, individuals within a population vary in their intrinsic metabolic rates and behavioral traits, which may influence their foraging dynamics and access to resources. However, when adopting a growth model, we face trade-offs between model complexity, biological interpretability of parameters, and goodness of fit. We explore how different formulations of the von Bertalanffy growth function (vBGF) with individual random effects and environmental predictors affect these trade-offs. In the vBGF, the growth of an organism results from a dynamic balance between anabolic and catabolic processes. We start from a formulation of the vBGF that models the anabolic coefficient (q) as a function of the catabolic coefficient (k), a coefficient related to the properties of the environment (γ) and a parameter that determines the relative importance of behavior and environment in determining growth (ψ). We treat the vBGF parameters as a function of individual random effects and environmental variables. We use simulations to show how different functional forms and individual or group variability in the growth function's parameters provide a very flexible description of growth trajectories. We then consider a case study of two fish populations of Salmo marmoratus and Salmo trutta to test the goodness of fit and predictive power of the models, along with the biological interpretability of the vBGF's parameters when using different model formulations. The best models, according to AIC, included individual variability in both k and γ and cohort as a predictor of growth trajectories, and are consistent with the hypothesis that habitat selection is more important than behavioral and metabolic traits in determining lifetime growth trajectories of the two fish species. Model predictions of individual growth trajectories were
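
    The vBGF in its familiar size-at-age form, with a lognormal individual random effect on the catabolic coefficient k, gives a feel for the individual-trajectory flexibility under discussion; the parameter values are invented, and the q, γ, ψ structure from the abstract is not reproduced here.

```python
import numpy as np

def vbgf(t, L_inf, k, t0=0.0):
    # Standard von Bertalanffy growth function: length at age t.
    return L_inf * (1.0 - np.exp(-k * (t - t0)))

rng = np.random.default_rng(11)
ages = np.arange(0, 11)
k_i = np.exp(rng.normal(np.log(0.3), 0.2, size=5))  # individual random effect on k
for k in k_i:
    print(np.round(vbgf(ages, L_inf=300.0, k=k), 1))  # five individual trajectories
```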

  7. A Generalizable Hierarchical Bayesian Model for Persistent SAR Change Detection

    DTIC Science & Technology

    2012-04-01

    This paper proposes a hierarchical Bayesian model for multiple-pass, multiple-antenna synthetic aperture radar (SAR) change detection.

  8. A tutorial introduction to Bayesian models of cognitive development.

    PubMed

    Perfors, Amy; Tenenbaum, Joshua B; Griffiths, Thomas L; Xu, Fei

    2011-09-01

    We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in the cognitive science applications, mathematical foundations, or machine learning details in more depth. In addition, we discuss some important interpretation issues that often arise when evaluating Bayesian models in cognitive science.

  9. Bayesian model evidence as a model evaluation metric

    NASA Astrophysics Data System (ADS)

    Guthke, Anneli; Höge, Marvin; Nowak, Wolfgang

    2017-04-01

    When building environmental systems models, we are typically confronted with the questions of how to choose an appropriate model (i.e., which processes to include or neglect) and how to measure its quality. Various metrics have been proposed that shall guide the modeller towards a most robust and realistic representation of the system under study. Criteria for evaluation often address aspects of accuracy (absence of bias) or of precision (absence of unnecessary variance) and need to be combined in a meaningful way in order to address the inherent bias-variance dilemma. We suggest using Bayesian model evidence (BME) as a model evaluation metric that implicitly performs a tradeoff between bias and variance. BME is typically associated with model weights in the context of Bayesian model averaging (BMA). However, it can also be seen as a model evaluation metric in a single-model context or in model comparison. It combines a measure for goodness of fit with a penalty for unjustifiable complexity. Unjustifiable refers to the fact that the appropriate level of model complexity is limited by the amount of information available for calibration. Derived in a Bayesian context, BME naturally accounts for measurement errors in the calibration data as well as for input and parameter uncertainty. BME is therefore perfectly suitable to assess model quality under uncertainty. We will explain in detail and with schematic illustrations what BME measures, i.e. how complexity is defined in the Bayesian setting and how this complexity is balanced with goodness of fit. We will further discuss how BME compares to other model evaluation metrics that address accuracy and precision such as the predictive logscore or other model selection criteria such as the AIC, BIC or KIC. Although computationally more expensive than other metrics or criteria, BME represents an appealing alternative because it provides a global measure of model quality. Even if not applicable to each and every case, we aim
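
    The bias-variance trade-off BME performs can be seen in a brute-force estimate: average the likelihood over prior draws. In this assumed toy setting (normal data, normal prior on the mean), an overly tight prior underfits and an overly diffuse prior is penalized for unused flexibility, so the evidence peaks at a moderate prior width, which is exactly the built-in complexity penalty the abstract describes.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
y = rng.normal(1.0, 1.0, size=20)   # toy calibration data

def bme_mc(prior_sd, n_draws=50_000):
    # Brute-force BME = E_prior[ p(y | theta) ] for y_i ~ N(theta, 1),
    # theta ~ N(0, prior_sd^2).  Noisy but simple; fine for illustration.
    theta = rng.normal(0.0, prior_sd, size=n_draws)
    loglik = norm.logpdf(y[:, None], loc=theta[None, :], scale=1.0).sum(axis=0)
    return np.exp(loglik).mean()

for sd in (0.5, 2.0, 10.0):
    print(sd, bme_mc(sd))
```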

  10. Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.

    PubMed

    Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi

    2015-02-01

    We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
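
    A bare-bones version of the copula-transformed autoregressive construction: simulate a Gaussian AR(1), push it through a cdf/inverse-cdf pair to get a gamma-margin series, then recover the dynamics from normal scores. The empirical cdf here stands in for the paper's nonparametric Bayesian marginal, and all parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm, gamma, rankdata

rng = np.random.default_rng(9)

# Simulate from the model itself: Gaussian AR(1) dynamics, gamma marginal.
n, phi_true = 2000, 0.6
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi_true * z[t - 1] + rng.normal(0, np.sqrt(1 - phi_true ** 2))
x = gamma.ppf(norm.cdf(z), a=2.0)   # copula-transformed series

# "Fit": empirical cdf -> normal scores -> AR(1) coefficient.
u = rankdata(x) / (n + 1.0)
z_hat = norm.ppf(u)
phi_hat = np.corrcoef(z_hat[:-1], z_hat[1:])[0, 1]
print(phi_true, round(phi_hat, 3))
```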

  11. Fitting parametric random effects models in very large data sets with application to VHA national data

    PubMed Central

    2012-01-01

    Background: With the current focus on personalized medicine, patient/subject level inference is often of key interest in translational research. As a result, random effects models (REM) are becoming popular for patient level inference. However, for very large data sets that are characterized by large sample size, it can be difficult to fit REM using commonly available statistical software such as SAS, since they require inordinate amounts of computer time and memory allocations beyond what is available, preventing model convergence. For example, in a retrospective cohort study of over 800,000 Veterans with type 2 diabetes with longitudinal data over 5 years, fitting REM via generalized linear mixed modeling using currently available standard procedures in SAS (e.g. PROC GLIMMIX) was very difficult, and the same problems exist in Stata's gllamm and R's lme packages. Thus, this study proposes and assesses the performance of a meta regression approach and makes comparison with methods based on sampling of the full data.
    Data: We use both simulated and real data from a national cohort of Veterans with type 2 diabetes (n=890,394) which was created by linking multiple patient and administrative files, resulting in a cohort with longitudinal data collected over 5 years.
    Methods and results: The outcome of interest was mean annual HbA1c measured over a 5-year period. Using this outcome, we compared parameter estimates from the proposed random effects meta regression (REMR) with estimates based on simple random sampling and VISN (Veterans Integrated Service Networks) based stratified sampling of the full data. Our results indicate that REMR provides parameter estimates that are less likely to be biased, with tighter confidence intervals, when the VISN level estimates are homogenous.
    Conclusion: When the interest is to fit REM in repeated measures data with very large sample size, REMR can be used as a good alternative. It leads to reasonable inference for both Gaussian and non

  12. Assessing Fit of Unidimensional Graded Response Models Using Bayesian Methods

    ERIC Educational Resources Information Center

    Zhu, Xiaowen; Stone, Clement A.

    2011-01-01

    The posterior predictive model checking method is a flexible Bayesian model-checking tool and has recently been used to assess fit of dichotomous IRT models. This paper extended previous research to polytomous IRT models. A simulation study was conducted to explore the performance of posterior predictive model checking in evaluating different…

  13. Evaluating Individualized Reading Programs: A Bayesian Model.

    ERIC Educational Resources Information Center

    Maxwell, Martha

    Simple Bayesian approaches can be applied to answer specific questions in evaluating an individualized reading program. A small reading and study skills program located in the counseling center of a major research university collected and compiled data on student characteristics such as class, number of sessions attended, grade point average, and…

  14. Bayesian model averaging of Bayesian network classifiers over multiple node-orders: application to sparse datasets.

    PubMed

    Hwang, Kyu-Baek; Zhang, Byoung-Tak

    2005-12-01

    Bayesian model averaging (BMA) can resolve the overfitting problem by explicitly incorporating the model uncertainty into the analysis procedure. Hence, it can be used to improve the generalization performance of Bayesian network classifiers. Until now, BMA of Bayesian network classifiers has only been performed in some restricted forms, e.g., the model is averaged given a single node-order, because of its heavy computational burden. However, it can be hard to obtain a good node-order when the available training dataset is sparse. To alleviate this problem, we propose BMA of Bayesian network classifiers over several distinct node-orders obtained using the Markov chain Monte Carlo sampling technique. The proposed method was examined using two synthetic problems and four real-life datasets. First, we show that the proposed method is especially effective when the given dataset is very sparse. The classification accuracy of averaging over multiple node-orders was higher in most cases than that achieved using a single node-order in our experiments. We also present experimental results for test datasets with unobserved variables, where the quality of the averaged node-order is more important. Through these experiments, we show that the difference in classification performance between the cases of multiple node-orders and single node-order is related to the level of noise, confirming the relative benefit of averaging over multiple node-orders for incomplete data. We conclude that BMA of Bayesian network classifiers over multiple node-orders has an apparent advantage when the given dataset is sparse and noisy, despite the method's heavy computational cost.

  15. Technical note: Bayesian calibration of dynamic ruminant nutrition models.

    PubMed

    Reed, K F; Arhonditsis, G B; France, J; Kebreab, E

    2016-08-01

    Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling.

  16. Using consensus bayesian network to model the reactive oxygen species regulatory pathway.

    PubMed

    Hu, Liangdong; Wang, Limin

    2013-01-01

    The Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct a Bayesian network from microarray data directly. Although a large number of Bayesian network learning algorithms have been developed, when they are applied to learn Bayesian networks from microarray data the accuracies are low, because the databases used for learning contain too few microarray measurements. In this paper, we propose a consensus Bayesian network, constructed by combining Bayesian networks from the relevant literature with Bayesian networks learned from microarray data. It achieves a higher accuracy than Bayesian networks learned from a single database. In the experiments, we validated the Bayesian network combination algorithm on several classic machine learning databases and used the consensus Bayesian network to model the Escherichia coli ROS pathway.

  17. Using Consensus Bayesian Network to Model the Reactive Oxygen Species Regulatory Pathway

    PubMed Central

    Hu, Liangdong; Wang, Limin

    2013-01-01

    The Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct a Bayesian network from microarray data directly. Although a large number of Bayesian network learning algorithms have been developed, when they are applied to learn Bayesian networks from microarray data the accuracies are low, because the databases used for learning contain too few microarray measurements. In this paper, we propose a consensus Bayesian network, constructed by combining Bayesian networks from the relevant literature with Bayesian networks learned from microarray data. It achieves a higher accuracy than Bayesian networks learned from a single database. In the experiments, we validated the Bayesian network combination algorithm on several classic machine learning databases and used the consensus Bayesian network to model the Escherichia coli ROS pathway. PMID:23457624

  18. Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring

    Treesearch

    Carlos Carroll; Devin S. Johnson; Jeffrey R. Dunk; William J. Zielinski

    2010-01-01

    Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data’s spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and...

  19. Bayesian Estimation of the Logistic Positive Exponent IRT Model

    ERIC Educational Resources Information Center

    Bolfarine, Heleno; Bazan, Jorge Luis

    2010-01-01

    A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…

  20. Metrics for evaluating performance and uncertainty of Bayesian network models

    Treesearch

    Bruce G. Marcot

    2012-01-01

    This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...

  2. Joint spatial Bayesian modeling for studies combining longitudinal and cross-sectional data

    PubMed Central

    Lawson, Andrew B; Carroll, Rachel; Castro, Marcia

    2017-01-01

    Design for intervention studies may combine longitudinal data collected from sampled locations over several survey rounds and cross-sectional data from other locations in the study area. In this case, modeling the impact of the intervention requires an approach that can accommodate both types of data, accounting for the dependence between individuals followed up over time. Inadequate modeling can mask intervention effects, with serious implications for policy making. In this paper we use data from a large-scale larviciding intervention for malaria control implemented in Dar es Salaam, United Republic of Tanzania, collected over a period of almost 5 years. We apply a longitudinal Bayesian spatial model to the Dar es Salaam data, combining follow-up and cross-sectional data, treating the correlation in longitudinal observations separately, and controlling for potential confounders. An innovative feature of this modeling is the use of an Ornstein–Uhlenbeck process to model random time effects. We contrast the results with other Bayesian modeling formulations, including cross-sectional approaches that consider individual-level random effects to account for subjects followed up in two or more surveys. The longitudinal modeling approach indicates that the intervention significantly reduced the prevalence of malaria infection in Dar es Salaam by 20%, whereas the joint model did not indicate a significant effect. Our results suggest that the longitudinal model is to be preferred when longitudinal information is available at the individual level. PMID:24713159

  3. Spontaneous temporal changes and variability of peripheral nerve conduction analyzed using a random effects model.

    PubMed

    Krøigård, Thomas; Gaist, David; Otto, Marit; Højlund, Dorthe; Selmar, Peter E; Sindrup, Søren H

    2014-08-01

    The reproducibility of variables commonly included in studies of peripheral nerve conduction in healthy individuals has not previously been analyzed using a random effects regression model. We examined the temporal changes and variability of standard nerve conduction measures in the leg. Peroneal nerve distal motor latency, motor conduction velocity, and compound motor action potential amplitude; sural nerve sensory action potential amplitude and sensory conduction velocity; and tibial nerve minimal F-wave latency were examined in 51 healthy subjects, aged 40 to 67 years. They were reexamined after 2 and 26 weeks. There was no change in the variables except for a minor decrease in sural nerve sensory action potential amplitude and a minor increase in tibial nerve minimal F-wave latency. Reproducibility was best for peroneal nerve distal motor latency and motor conduction velocity, sural nerve sensory conduction velocity, and tibial nerve minimal F-wave latency. Between-subject variability was greater than within-subject variability. Sample sizes ranging from 21 to 128 would be required to show changes twice the magnitude of the spontaneous changes observed in this study. Nerve conduction studies have a high reproducibility, and variables are mainly unaltered during 6 months. This study provides a solid basis for the planning of future clinical trials assessing changes in nerve conduction.

  4. Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery

    PubMed Central

    Ekins, Sean; Reynolds, Robert C.; Kim, Hiyun; Koo, Mi-Sun; Ekonomidis, Marilyn; Talaue, Meliza; Paget, Steve D.; Woolhiser, Lisa K.; Lenaerts, Anne J.; Bunin, Barry A.; Connell, Nancy; Freundlich, Joel S.

    2013-01-01

    Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data to experimentally validate a virtual screening approach employing Bayesian models built with bioactivity information (single-event model) as well as bioactivity and cytotoxicity information (dual-event model). We virtually screen a commercial library and experimentally confirm actives with hit rates exceeding typical HTS results by 1-2 orders of magnitude. The first dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery. PMID:23521795
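
    A sketch of the dual-event idea under loudly stated assumptions: two Bayesian (naive Bayes) classifiers, one trained on bioactivity labels and one on cytotoxicity labels, score a library, and compounds must be predicted active and non-toxic to pass. The random fingerprints, labels, and thresholds are stand-ins for real HTS data, not the authors' descriptors or models.

        import numpy as np
        from sklearn.naive_bayes import BernoulliNB

        rng = np.random.default_rng(8)
        X = rng.integers(0, 2, size=(1000, 64))      # binary fingerprints
        active = rng.integers(0, 2, 1000)            # bioactivity labels
        toxic = rng.integers(0, 2, 1000)             # cytotoxicity labels

        act_model = BernoulliNB().fit(X[:800], active[:800])
        tox_model = BernoulliNB().fit(X[:800], toxic[:800])

        candidates = X[800:]
        p_active = act_model.predict_proba(candidates)[:, 1]
        p_toxic = tox_model.predict_proba(candidates)[:, 1]
        hits = np.where((p_active > 0.7) & (p_toxic < 0.3))[0]
        print(f"{hits.size} candidates pass the dual-event filter")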

  5. A review of Bayesian state-space modelling of capture-recapture-recovery data.

    PubMed

    King, Ruth

    2012-04-06

    Traditionally, state-space models are fitted to data where there is uncertainty in the observation or measurement of the system. State-space models are partitioned into an underlying system process describing the transitions of the true states of the system over time and the observation process linking the observations of the system to the true states. Open population capture-recapture-recovery data can be modelled in this framework by regarding the system process as the state of each individual observed within the study in terms of being alive or dead, and the observation process the recapture and/or recovery process. The traditional observation error of a state-space model is incorporated via the recapture/recovery probabilities being less than unity. The models can be fitted using a Bayesian data augmentation approach and in standard BUGS packages. Applying this state-space framework to such data permits additional complexities including individual heterogeneity to be fitted to the data at very little additional programming effort. We consider the efficiency of the state-space model fitting approach by considering a random effects model for capture-recapture data relating to dippers and compare different Bayesian model-fitting algorithms within WinBUGS.

  6. A review of Bayesian state-space modelling of capture–recapture–recovery data

    PubMed Central

    King, Ruth

    2012-01-01

    Traditionally, state-space models are fitted to data where there is uncertainty in the observation or measurement of the system. State-space models are partitioned into an underlying system process describing the transitions of the true states of the system over time and the observation process linking the observations of the system to the true states. Open population capture–recapture–recovery data can be modelled in this framework by regarding the system process as the state of each individual observed within the study in terms of being alive or dead, and the observation process the recapture and/or recovery process. The traditional observation error of a state-space model is incorporated via the recapture/recovery probabilities being less than unity. The models can be fitted using a Bayesian data augmentation approach and in standard BUGS packages. Applying this state-space framework to such data permits additional complexities including individual heterogeneity to be fitted to the data at very little additional programming effort. We consider the efficiency of the state-space model fitting approach by considering a random effects model for capture–recapture data relating to dippers and compare different Bayesian model-fitting algorithms within WinBUGS. PMID:23565333

  7. A Bayesian modeling approach for generalized semiparametric structural equation models.

    PubMed

    Song, Xin-Yuan; Lu, Zhao-Hua; Cai, Jing-Heng; Ip, Edward Hak-Sing

    2013-10-01

    In behavioral, biomedical, and psychological studies, structural equation models (SEMs) have been widely used for assessing relationships between latent variables. Regression-type structural models based on parametric functions are often used for such purposes. In many applications, however, parametric SEMs are not adequate to capture subtle patterns in the functions over the entire range of the predictor variable. A different but equally important limitation of traditional parametric SEMs is that they are not designed to handle mixed data types: continuous, count, ordered, and unordered categorical. This paper develops a generalized semiparametric SEM that is able to handle mixed data types and to simultaneously model different functional relationships among latent variables. A structural equation of the proposed SEM is formulated using a series of unspecified smooth functions. The Bayesian P-splines approach and Markov chain Monte Carlo methods are developed to estimate the smooth functions and the unknown parameters. Moreover, we examine the relative benefits of semiparametric modeling over parametric modeling using a Bayesian model-comparison statistic, called the complete deviance information criterion (DIC). The performance of the developed methodology is evaluated using a simulation study. To illustrate the method, we used a data set derived from the National Longitudinal Survey of Youth.

  8. Bayesian analysis for nonlinear mixed-effects models under heavy-tailed distributions.

    PubMed

    De la Cruz, Rolando

    2014-01-01

    A common assumption in nonlinear mixed-effects models is the normality of both random effects and within-subject errors. However, such assumptions make inferences vulnerable to the presence of outliers. More flexible distributions are therefore necessary for modeling both sources of variability in this class of models. In the present paper, I consider an extension of the nonlinear mixed-effects models in which random effects and within-subject errors are assumed to be distributed according to a rich class of parametric models that are often used for robust inference. The class of distributions I consider is the scale mixture of multivariate normal distributions, which consists of a wide range of symmetric and continuous distributions. This class includes heavy-tailed multivariate distributions such as the Student's t, slash, and contaminated normal. With the scale mixture of multivariate normal distributions, robustification is achieved from the tail behavior of the different distributions. A Bayesian framework is adopted, and MCMC is used to carry out posterior analysis. Model comparison using different criteria was considered. The procedures are illustrated using a real dataset from a pharmacokinetic study. I contrast results from the normal and robust models and show how the implementation can be used to detect outliers.
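
    A sketch of one member of the scale-mixture class the abstract names: Student-t random effects and residuals around a nonlinear mean, in PyMC. The exponential decay curve, dose, priors, and simulated data are illustrative assumptions, not the paper's pharmacokinetic model.

        import numpy as np
        import pymc as pm

        rng = np.random.default_rng(0)
        n_subj, n_obs = 12, 8
        t = np.tile(np.linspace(0.5, 12, n_obs), n_subj)
        subj = np.repeat(np.arange(n_subj), n_obs)
        ke_true = np.exp(rng.normal(np.log(0.3), 0.2, n_subj))
        y = 10 * np.exp(-ke_true[subj] * t) + 0.4 * rng.standard_t(3, t.size)

        with pm.Model() as robust_nlme:
            log_ke0 = pm.Normal("log_ke0", mu=np.log(0.3), sigma=1.0)
            tau = pm.HalfNormal("tau", 0.5)
            nu = pm.Exponential("nu_minus2", 0.1) + 2.0    # df > 2, heavy tails
            # Student-t random effects on the log elimination rate
            b = pm.StudentT("b", nu=nu, mu=0, sigma=tau, shape=n_subj)
            ke = pm.math.exp(log_ke0 + b[subj])
            sigma = pm.HalfNormal("sigma", 1.0)
            # Student-t within-subject errors around the nonlinear mean
            pm.StudentT("y", nu=nu, mu=10 * pm.math.exp(-ke * t),
                        sigma=sigma, observed=y)
            idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)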

  9. Bayesian Hierarchical Duration Model for Repeated Events: An Application to Behavioral Observations

    PubMed Central

    Dagne, Getachew A.; Snyder, James

    2009-01-01

    This paper presents a continuous-time Bayesian model for analyzing durations of behavior displays in social interactions. Duration data of social interactions are often complex because of repeated behaviors (events) at individual or group (e.g., dyad) level, multiple behaviors (multistates), and several choices of exit from a current event (competing risks). A multilevel, multistate model is proposed to adequately characterize the behavioral processes. The model incorporates dyad-specific and transition-specific random effects to account for heterogeneity among dyads and interdependence among competing risks. The proposed method is applied to child-parent observational data derived from the School Transitions Project to assess the relation of emotional expression in child-parent interaction to risk for early and persisting child conduct problems. PMID:20209032

  10. Common quandaries and their practical solutions in Bayesian network modeling

    Treesearch

    Bruce G. Marcot

    2017-01-01

    Use and popularity of Bayesian network (BN) modeling has greatly expanded in recent years, but many common problems remain. Here, I summarize key problems in BN model construction and interpretation, along with suggested practical solutions. Problems in BN model construction include parameterizing probability values, variable definition, complex network structures,...

  11. A General Bayesian Model for Testlets: Theory and Applications.

    ERIC Educational Resources Information Center

    Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard

    2002-01-01

    Proposes a modified version of commonly employed item response models in a fully Bayesian framework and obtains inferences under the model using Markov chain Monte Carlo techniques. Demonstrates use of the model in a series of simulations and with operational data from the North Carolina Test of Computer Skills and the Test of Spoken English…

  12. Bayesian non-parametrics and the probabilistic approach to modelling

    PubMed Central

    Ghahramani, Zoubin

    2013-01-01

    Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609

  13. Bayesian Network Models for Local Dependence among Observable Outcome Variables

    ERIC Educational Resources Information Center

    Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli

    2009-01-01

    Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…

  14. On the Bayesian Nonparametric Generalization of IRT-Type Models

    ERIC Educational Resources Information Center

    San Martin, Ernesto; Jara, Alejandro; Rolin, Jean-Marie; Mouchart, Michel

    2011-01-01

    We study the identification and consistency of Bayesian semiparametric IRT-type models, where the uncertainty on the abilities' distribution is modeled using a prior distribution on the space of probability measures. We show that for the semiparametric Rasch Poisson counts model, simple restrictions ensure the identification of a general…

  15. Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis

    ERIC Educational Resources Information Center

    Ansari, Asim; Iyengar, Raghuram

    2006-01-01

    We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…

  19. Bayesian Case Influence Measures for Statistical Models with Missing Data

    PubMed Central

    Zhu, Hongtu; Ibrahim, Joseph G.; Cho, Hyunsoon; Tang, Niansheng

    2011-01-01

    We examine three Bayesian case influence measures including the φ-divergence, Cook's posterior mode distance and Cook's posterior mean distance for identifying a set of influential observations for a variety of statistical models with missing data including models for longitudinal data and latent variable models in the absence/presence of missing data. Since it can be computationally prohibitive to compute these Bayesian case influence measures in models with missing data, we derive simple first-order approximations to the three Bayesian case influence measures by using the Laplace approximation formula and examine the applications of these approximations to the identification of influential sets. All of the computations for the first-order approximations can be easily done using Markov chain Monte Carlo samples from the posterior distribution based on the full data. Simulated data and an AIDS dataset are analyzed to illustrate the methodology. PMID:23399928

  20. Bayesian failure probability model sensitivity study. Final report

    SciTech Connect

    Not Available

    1986-05-30

    The Office of the Manager, National Communications System (OMNCS) has developed a system-level approach for estimating the effects of High-Altitude Electromagnetic Pulse (HEMP) on the connectivity of telecommunications networks. This approach incorporates a Bayesian statistical model which estimates the HEMP-induced failure probabilities of telecommunications switches and transmission facilities. The purpose of this analysis is to address the sensitivity of the Bayesian model. This is done by systematically varying two model input parameters--the number of observations, and the equipment failure rates. Throughout the study, a non-informative prior distribution is used. The sensitivity of the Bayesian model to the noninformative prior distribution is investigated from a theoretical mathematical perspective.
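
    A minimal sketch of this kind of sensitivity study: the posterior for a failure probability under a noninformative Jeffreys Beta(0.5, 0.5) prior, varied over the number of observations and the assumed failure rate. The prior choice and grid values are assumptions for illustration, not the OMNCS settings.

        from scipy import stats

        for n in (10, 50, 200):              # number of observed trials
            for rate in (0.05, 0.20):        # assumed underlying failure rate
                k = round(rate * n)          # expected failure count
                post = stats.beta(0.5 + k, 0.5 + n - k)
                lo, hi = post.ppf([0.025, 0.975])
                print(f"n={n:3d} rate={rate:.2f} "
                      f"mean={post.mean():.3f} 95% CI=({lo:.3f}, {hi:.3f})")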

  1. Back to basics for Bayesian model building in genomic selection.

    PubMed

    Kärkkäinen, Hanni P; Sillanpää, Mikko J

    2012-07-01

    Numerous Bayesian methods of phenotype prediction and genomic breeding value estimation based on multilocus association models have been proposed. Computationally the methods have been based either on Markov chain Monte Carlo or on faster maximum a posteriori estimation. The demand for more accurate and more efficient estimation has led to the rapid emergence of workable methods, unfortunately at the expense of well-defined principles for Bayesian model building. In this article we go back to the basics and build a Bayesian multilocus association model for quantitative and binary traits with carefully defined hierarchical parameterization of Student's t and Laplace priors. In this treatment we consider alternative model structures, using indicator variables and polygenic terms. We make the most of the conjugate analysis, enabled by the hierarchical formulation of the prior densities, by deriving the fully conditional posterior densities of the parameters and using the acquired known distributions in building fast generalized expectation-maximization estimation algorithms.
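
    A sketch of the hierarchical Laplace prior the abstract describes: each marker effect is normal with its own exponentially distributed variance, which marginally yields a Laplace (Bayesian LASSO) prior. The simulated SNP matrix, hyperpriors, and sampler (NUTS rather than the authors' EM algorithms) are assumptions for illustration.

        import numpy as np
        import pymc as pm

        rng = np.random.default_rng(7)
        n, p = 200, 50
        X = rng.integers(0, 3, size=(n, p)).astype(float)  # SNP codes 0/1/2
        beta_true = np.zeros(p)
        beta_true[:3] = [0.8, -0.6, 0.5]                   # three causal loci
        y = X @ beta_true + rng.normal(0, 1, n)

        with pm.Model() as multilocus:
            lam2 = pm.Gamma("lam2", alpha=1.0, beta=1.0)   # shrinkage strength
            # Per-marker variances; marginally beta_j ~ Laplace
            tau2 = pm.Exponential("tau2", lam=lam2 / 2, shape=p)
            beta = pm.Normal("beta", mu=0.0, sigma=pm.math.sqrt(tau2), shape=p)
            sigma = pm.HalfNormal("sigma", 1.0)
            pm.Normal("y", mu=pm.math.dot(X, beta), sigma=sigma, observed=y)
            idata = pm.sample(1000, tune=1000, chains=2, random_seed=7)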

  2. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies

    USDA-ARS?s Scientific Manuscript database

    False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises t...

  3. The Evaluation of Bias of the Weighted Random Effects Model Estimators. Research Report. ETS RR-11-13

    ERIC Educational Resources Information Center

    Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan

    2011-01-01

    Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have proposed various weighting methods (Korn & Graubard, 2003; Pfeffermann, Skinner,…

  4. Estimating Individual Influences of Behavioral Intentions: An Application of Random-Effects Modeling to the Theory of Reasoned Action.

    ERIC Educational Resources Information Center

    Hedeker, Donald; And Others

    1996-01-01

    Methods are proposed and described for estimating the degree to which relations among variables vary at the individual level. As an example, M. Fishbein and I. Ajzen's theory of reasoned action is examined. This article illustrates the use of empirical Bayes methods based on a random-effects regression model to estimate individual influences…

  5. On the Adequacy of Bayesian Evaluations of Categorization Models: Reply to Vanpaemel and Lee (2012)

    ERIC Educational Resources Information Center

    Wills, Andy J.; Pothos, Emmanuel M.

    2012-01-01

    Vanpaemel and Lee (2012) argued, and we agree, that the comparison of formal models can be facilitated by Bayesian methods. However, Bayesian methods neither precede nor supplant our proposals (Wills & Pothos, 2012), as Bayesian methods can be applied both to our proposals and to their polar opposites. Furthermore, the use of Bayesian methods to…

  7. Investigating the effective factors in creatinine changes among hemodialysis patients using the linear random effects model

    PubMed Central

    Shabankhani, B; Kazemnezhad, A; Zaeri, F

    2015-01-01

    Background and objectives: Roughly one in ten apparently healthy people suffers from some form of renal disease. Hemodialysis is the most widely used method of caring for this group of patients, and serum creatinine is an important marker of kidney function. The aim of the present study was to investigate the factors affecting creatinine and its relation to kidney function. Materials and methods: The study was a longitudinal design in which 500 participants were randomly selected from the hemodialysis patients in Mazandaran Province. Creatinine was the longitudinal response variable, measured 3 times per year over a period of 6 years. A linear random effects model was considered the most appropriate model for the collected data. Results: The overall mean creatinine was 1.62 ± 0.49, 1.69 ± 0.46 among men and 1.53 ± 0.49 among women. Weight (p<0.001), age at disease diagnosis (p<0.001), time (p<0.001), gender (p<0.005), and cardiovascular disease significantly affected the trend of creatinine changes among the hemodialysis patients, and mean creatinine showed an increasing trend. Conclusion: Blood creatinine significantly reflects kidney function, and identifying the variables that affect the creatinine level is highly helpful in monitoring kidney function. Most studies of hemodialysis patients indicate that measuring and controlling variables such as weight and tobacco use, and managing related conditions such as high blood pressure, can help predict and control creatinine changes precisely. PMID:28255403
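
    Under stated assumptions (hypothetical column names, simulated data, a random intercept per patient), the following sketch fits the kind of linear random effects model described, using statsmodels.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        n_pat, n_visits = 50, 18              # 3 visits/year over 6 years
        df = pd.DataFrame({
            "patient": np.repeat(np.arange(n_pat), n_visits),
            "time": np.tile(np.arange(n_visits) / 3.0, n_pat),   # years
            "weight": np.repeat(rng.normal(70, 12, n_pat), n_visits),
        })
        b = np.repeat(rng.normal(0, 0.3, n_pat), n_visits)  # patient effects
        df["creatinine"] = (1.4 + 0.03 * df.time + 0.002 * df.weight + b
                            + rng.normal(0, 0.15, len(df)))

        # Random intercept per patient; fixed effects for time and weight
        model = smf.mixedlm("creatinine ~ time + weight", df,
                            groups=df["patient"])
        print(model.fit().summary())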

  8. Bayesian joint modeling of longitudinal measurements and time-to-event data using robust distributions.

    PubMed

    Baghfalaki, T; Ganjali, M; Hashemi, R

    2014-01-01

    Distributional assumptions of most of the existing methods for joint modeling of longitudinal measurements and time-to-event data cannot allow incorporation of outlier robustness. In this article, we develop and implement a joint model of longitudinal and time-to-event data using a class of powerful distributions for robust analysis, known as normal/independent distributions. These distributions include univariate and multivariate versions of the Student's t, the slash, and the contaminated normal distributions. The proposed model implements a linear mixed effects model under a normal/independent distribution assumption for both random effects and residuals of the longitudinal process. For the time-to-event process a parametric proportional hazard model with a Weibull baseline hazard is used. Also, a Bayesian approach using the Markov-chain Monte Carlo method is adopted for parameter estimation. Some simulation studies are performed to investigate the performance of the proposed method in the presence and absence of outliers. Also, the proposed methods are applied to analyze a real AIDS clinical trial, with the aim of comparing the efficiency and safety of two antiretroviral drugs, where CD4 count measurements are gathered as longitudinal outcomes. In these data, time to death or dropout is considered the time-to-event outcome of interest. Different model structures are developed for analyzing these data sets, where model selection is performed by the deviance information criterion (DIC), expected Akaike information criterion (EAIC), and expected Bayesian information criterion (EBIC).

  9. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.

    PubMed

    Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. The Bayesian approach is becoming popular in data analysis, but most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted under the classical approach and under Bayesian approaches with both non-informative and informative priors, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to construct priors for the 2014 model.

  10. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors

    PubMed Central

    Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. The Bayesian approach is becoming popular in data analysis, but most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted under the classical approach and under Bayesian approaches with both non-informative and informative priors, using South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, the South Africa General Household Survey datasets for the years 2011 to 2013 are used to construct priors for the 2014 model. PMID:28257437
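
    A sketch of the non-informative versus informative comparison in PyMC, under assumptions: the covariates, prior means, and prior SDs are invented stand-ins for estimates one might carry forward from the 2011-2013 surveys, not the paper's actual specification.

        import numpy as np
        import pymc as pm

        rng = np.random.default_rng(5)
        n = 500
        age = rng.normal(40, 12, n)
        crowding = rng.poisson(2, n).astype(float)
        logit = -4.0 + 0.02 * age + 0.4 * crowding
        tb = rng.binomial(1, 1 / (1 + np.exp(-logit)))

        def fit(prior_sd, prior_mu=(0.0, 0.0)):
            with pm.Model():
                a = pm.Normal("a", 0.0, 10.0)
                b_age = pm.Normal("b_age", prior_mu[0], prior_sd)
                b_crowd = pm.Normal("b_crowd", prior_mu[1], prior_sd)
                p = pm.math.invlogit(a + b_age * age + b_crowd * crowding)
                pm.Bernoulli("tb", p=p, observed=tb)
                return pm.sample(1000, tune=1000, chains=2, random_seed=5)

        flat = fit(prior_sd=10.0)                   # non-informative prior
        informed = fit(prior_sd=0.1,                # "earlier surveys" prior
                       prior_mu=(0.02, 0.4))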

  11. Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.

    PubMed

    Mollica, Cristina; Tardella, Luca

    2017-06-01

    The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fit of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogeneous nature of the partial ranking data.
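
    The core of this model is the Plackett-Luce likelihood: the probability of a (possibly partial) ranking is a product of sequential choice probabilities. A minimal evaluation is sketched below; the item worths are illustrative, and a mixture version would simply weight this log-likelihood across components with component-specific worths.

        import numpy as np

        def plackett_luce_loglik(ranking, worth):
            """ranking: item indices, most to least preferred (may be partial).
            worth: positive support parameter for every item."""
            remaining = list(range(len(worth)))
            loglik = 0.0
            for item in ranking:
                denom = sum(worth[j] for j in remaining)
                loglik += np.log(worth[item] / denom)
                remaining.remove(item)
            return loglik

        worth = np.array([3.0, 1.0, 0.5, 0.5])
        print(plackett_luce_loglik([0, 1], worth))  # partial ranking: top two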

  12. Bayesian analysis of a disability model for lung cancer survival.

    PubMed

    Armero, C; Cabras, S; Castellanos, M E; Perra, S; Quirós, A; Oruezábal, M J; Sánchez-Rubio, J

    2016-02-01

    Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimum informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions.

  13. A mixture copula Bayesian network model for multimodal genomic data.

    PubMed

    Zhang, Qingyang; Shi, Xuan

    2017-01-01

    Gaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes the decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making it unsuitable for dealing with recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine expectation-maximization algorithm. A heuristic search algorithm based on Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by the best-scoring network out of multiple predictions from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling data set. We apply the proposed methods to the Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.

  14. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected…

  15. Involving Stakeholders in Building Integrated Fisheries Models Using Bayesian Methods

    NASA Astrophysics Data System (ADS)

    Haapasaari, Päivi; Mäntyniemi, Samu; Kuikka, Sakari

    2013-06-01

    A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring fishery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders on factors that influence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used influence diagrams to study qualitatively how the stakeholders frame the management problem of the herring fishery and elucidate what kind of causalities the different views involve. The paper combines these two tasks to assess the suitability of the methodological choices to participatory modeling in terms of both a modeling tool and participation mode. The paper also assesses the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology provides a flexible tool that can be adapted to different kinds of needs and challenges of participatory modeling. The ability of the approach to deal with small data sets makes it cost-effective in participatory contexts. However, the BMA methodology used in modeling the biological uncertainties is so complex that it needs further development before it can be introduced to wider use in participatory contexts.

  16. Involving stakeholders in building integrated fisheries models using Bayesian methods.

    PubMed

    Haapasaari, Päivi; Mäntyniemi, Samu; Kuikka, Sakari

    2013-06-01

    A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring fishery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders on factors that influence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used influence diagrams to study qualitatively how the stakeholders frame the management problem of the herring fishery and elucidate what kind of causalities the different views involve. The paper combines these two tasks to assess the suitability of the methodological choices to participatory modeling in terms of both a modeling tool and participation mode. The paper also assesses the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology provides a flexible tool that can be adapted to different kinds of needs and challenges of participatory modeling. The ability of the approach to deal with small data sets makes it cost-effective in participatory contexts. However, the BMA methodology used in modeling the biological uncertainties is so complex that it needs further development before it can be introduced to wider use in participatory contexts.

  17. Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.

    PubMed

    Verde, Pablo E

    2010-12-30

    In recent decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity, and they are so far removed from the classical domain of meta-analysis that they provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach also allows substantial model checking, model diagnostics, and model selection. Statistical computations are implemented in public domain statistical software (WinBUGS and R) and illustrated with real data examples.
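
    A sketch of the basic bivariate random effects structure, under assumptions: study-level logit sensitivity and specificity share a bivariate normal distribution, fitted in PyMC rather than WinBUGS, with invented counts and without the paper's additional variability structure.

        import numpy as np
        import pymc as pm

        tp = np.array([45, 30, 60, 22])
        fn = np.array([5, 10, 8, 6])
        tn = np.array([80, 55, 90, 40])
        fp = np.array([10, 15, 12, 9])
        k = len(tp)

        with pm.Model() as biv_meta:
            mu = pm.Normal("mu", 0.0, 2.0, shape=2)  # mean logit (sens, spec)
            chol, corr, stds = pm.LKJCholeskyCov(
                "chol", n=2, eta=2.0, sd_dist=pm.HalfNormal.dist(1.0))
            theta = pm.MvNormal("theta", mu=mu, chol=chol, shape=(k, 2))
            pm.Binomial("tp", n=tp + fn, p=pm.math.invlogit(theta[:, 0]),
                        observed=tp)
            pm.Binomial("tn", n=tn + fp, p=pm.math.invlogit(theta[:, 1]),
                        observed=tn)
            idata = pm.sample(1000, tune=1000, chains=2, target_accept=0.9)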

  18. Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models

    ERIC Educational Resources Information Center

    Price, Larry R.

    2012-01-01

    The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…

  19. Accurate phenotyping: Reconciling approaches through Bayesian model averaging

    PubMed Central

    Chen, Carla Chia-Ming; Mengersen, Kerrie Lee

    2017-01-01

    Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder—an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method. PMID:28423058
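
    One simple way to form model-averaging weights of the kind the abstract proposes is from BIC values, sketched below; the BIC numbers are invented placeholders, and the paper's actual averaging of clusterings is more involved.

        import numpy as np

        def bma_weights(bic):
            """Approximate posterior model probabilities from BIC values."""
            delta = np.asarray(bic) - np.min(bic)
            w = np.exp(-0.5 * delta)
            return w / w.sum()

        bic = {"latent_class": 2412.3, "grade_of_membership": 2415.1}
        w = bma_weights(list(bic.values()))
        for name, weight in zip(bic, w):
            print(f"{name}: {weight:.3f}")
        # Averaged quantities (e.g., cluster membership probabilities) are
        # then weighted sums across models using these weights.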

  1. Bayesian Finite Mixtures for Nonlinear Modeling of Educational Data.

    ERIC Educational Resources Information Center

    Tirri, Henry; And Others

    A Bayesian approach for finding latent classes in data is discussed. The approach uses finite mixture models to describe the underlying structure in the data and demonstrates that the possibility of using full joint probability models raises interesting new prospects for exploratory data analysis. The concepts and methods discussed are illustrated…

  2. Bayesian Analysis of Order-Statistics Models for Ranking Data.

    ERIC Educational Resources Information Center

    Yu, Philip L. H.

    2000-01-01

    Studied the order-statistics models, extending the usual normal order-statistics model into one in which the underlying random variables followed a multivariate normal distribution. Used a Bayesian approach and the Gibbs sampling technique. Applied the proposed method to analyze presidential election data from the American Psychological…

  3. Bayesian Estimation of the DINA Model with Gibbs Sampling

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew

    2015-01-01

    A Bayesian model formulation of the deterministic inputs, noisy "and" gate (DINA) model is presented. Gibbs sampling is employed to simulate from the joint posterior distribution of item guessing and slipping parameters, subject attribute parameters, and latent class probabilities. The procedure extends concepts in Béguin and Glas,…
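
    The core of the DINA model is a two-parameter item response probability, sketched below: an examinee who has mastered all attributes an item requires answers correctly with probability 1 - slip, and otherwise with the guessing probability. The attribute vector, Q-matrix row, and guess/slip values are illustrative.

        import numpy as np

        def dina_prob(alpha, q_row, guess, slip):
            """alpha, q_row: 0/1 attribute and Q-matrix vectors for one item."""
            eta = int(np.all(alpha >= q_row))   # mastery indicator
            return (1 - slip) ** eta * guess ** (1 - eta)

        print(dina_prob(np.array([1, 1]), np.array([1, 0]),
                        guess=0.2, slip=0.1))   # 0.9: all required attributes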

  4. Bayesian Semiparametric Structural Equation Models with Latent Variables

    ERIC Educational Resources Information Center

    Yang, Mingan; Dunson, David B.

    2010-01-01

    Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In…

  5. A Bayesian Approach for Analyzing Longitudinal Structural Equation Models

    ERIC Educational Resources Information Center

    Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum

    2011-01-01

    This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…

  9. Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

    PubMed

    Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

    2017-08-01

    Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill the areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R² = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.
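
    A sketch of a latent Gaussian process spatial random effect inside a hierarchical AOD-PM2.5 regression, in the spirit of the abstract. The coordinates, covariates, priors, and kernel are simulated assumptions, not the authors' specification.

        import numpy as np
        import pymc as pm

        rng = np.random.default_rng(11)
        n = 60
        coords = rng.uniform(0, 10, size=(n, 2))     # monitor locations
        aod = rng.normal(0.5, 0.15, n)
        pm25 = 20 + 40 * aod + rng.normal(0, 3, n)   # no spatial signal; demo

        with pm.Model() as gp_model:
            b0 = pm.Normal("b0", 0, 50)
            b1 = pm.Normal("b1", 0, 50)
            ell = pm.Gamma("ell", 2, 1)              # GP length scale
            eta = pm.HalfNormal("eta", 5)            # GP amplitude
            cov = eta**2 * pm.gp.cov.ExpQuad(2, ls=ell)
            gp = pm.gp.Latent(cov_func=cov)
            f = gp.prior("f", X=coords)              # spatial random effect
            sigma = pm.HalfNormal("sigma", 5)
            pm.Normal("pm25", mu=b0 + b1 * aod + f, sigma=sigma,
                      observed=pm25)
            idata = pm.sample(500, tune=500, chains=2, target_accept=0.9)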

  10. Bayesian methods for characterizing unknown parameters of material models

    SciTech Connect

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    2016-02-04

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.

  11. Bayesian methods for characterizing unknown parameters of material models

    DOE PAGES

    Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.

    2016-02-04

    A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.

  12. Bayesian log-periodic model for financial crashes

    NASA Astrophysics Data System (ADS)

    Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar

    2014-10-01

    This paper introduces a Bayesian approach in the econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student's t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical part of the study, we analyze a well-known example of a financial bubble - the S&P 500 1987 crash - to show the usefulness of the three methods under consideration, and the Merval-94, Bovespa-97, IPCMX-94, and Hang Seng-97 crashes using the simplest method. The novelty of this research is that the Bayesian models provide 95% credible intervals for the estimated crash time.
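
    A sketch of the log-periodic power law (LPPL) mean function underlying models of this kind, fitted here by nonlinear least squares for brevity rather than by MCMC. The synthetic data and all parameter values are illustrative assumptions; a Bayesian treatment would place priors on these parameters and report the posterior of the crash time tc.

        import numpy as np
        from scipy.optimize import curve_fit

        def lppl(t, tc, m, omega, A, B, C, phi):
            dt = np.maximum(tc - t, 1e-8)          # valid only for t < tc
            return A + B * dt**m + C * dt**m * np.cos(omega * np.log(dt) + phi)

        t = np.linspace(0, 0.9, 200)               # time, with crash at tc = 1
        true = (1.0, 0.5, 8.0, 5.0, -0.8, 0.08, 0.5)
        log_price = (lppl(t, *true)
                     + np.random.default_rng(2).normal(0, 0.01, t.size))

        p0 = (1.05, 0.5, 8.0, 5.0, -1.0, 0.1, 0.0)  # starting values
        popt, _ = curve_fit(lppl, t, log_price, p0=p0, maxfev=20000)
        print(f"estimated crash time tc = {popt[0]:.3f}")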

  13. Bayesian comparison of voice coil impedance models for dynamic loudspeakers

    NASA Astrophysics Data System (ADS)

    Henderson, R. Wesley; Goggans, Paul M.

    2017-06-01

    Loudspeaker design requires accurate models of driver voice coil impedance. This paper examines three model classes (standard, Leach, and van Maanen) from the audio literature and compares them using Bayesian model comparison via nested sampling. Data is generated from impedance measurements of two commercial loudspeaker drivers. We conclude that, for most design tasks involving these drivers, the van Maanen model with 3 lossy inductance groups is the most appropriate model.

  14. Flexible Bayesian additive joint models with an application to type 1 diabetes research.

    PubMed

    Köhler, Meike; Umlauf, Nikolaus; Beyerlein, Andreas; Winkler, Christiane; Ziegler, Anette-Gabriele; Greven, Sonja

    2017-08-10

    The joint modeling of longitudinal and time-to-event data is an important tool of growing popularity to gain insights into the association between a biomarker and an event process. We develop a general framework of flexible additive joint models that allows the specification of a variety of effects, such as smooth nonlinear, time-varying and random effects, in the longitudinal and survival parts of the models. Our extensions are motivated by the investigation of the relationship between fluctuating disease-specific markers, in this case autoantibodies, and the progression to the autoimmune disease type 1 diabetes. Using Bayesian P-splines, we are in particular able to capture highly nonlinear subject-specific marker trajectories as well as a time-varying association between the marker and event process allowing new insights into disease progression. The model is estimated within a Bayesian framework and implemented in the R-package bamlss.

  15. Model-based Bayesian inference for ROC data analysis

    NASA Astrophysics Data System (ADS)

    Lei, Tianhu; Bae, K. Ty

    2013-03-01

    This paper presents a study of model-based Bayesian inference applied to Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a binary (zero-one) covariate to express the binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by the Markov Chain Monte Carlo (MCMC) method carried out by Bayesian inference Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach considers model parameters as random variables characterized by prior distributions. With a substantial number of simulated samples generated by the sampling algorithm, the posterior distributions of the parameters, as well as the parameters themselves, can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log-concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log-concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied, and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed with the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.
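
    For concreteness, a minimal sketch of the binormal quantities such a model targets, assuming hypothetical rating distributions N(mu0, sd0) for non-diseased and N(mu1, sd1) for diseased cases (the role of the zero-one covariate); the MCMC machinery itself is not reproduced here:

        import numpy as np
        from scipy.stats import norm

        def binormal_roc(mu0, sd0, mu1, sd1, fpf):
            """Binormal ROC curve and AUC implied by the two rating distributions."""
            a = (mu1 - mu0) / sd1              # binormal intercept
            b = sd0 / sd1                      # binormal slope (the scale parameter)
            tpf = norm.cdf(a + b * norm.ppf(fpf))
            auc = norm.cdf(a / np.sqrt(1.0 + b**2))
            return tpf, auc

        fpf = np.linspace(0.01, 0.99, 99)
        tpf, auc = binormal_roc(0.0, 1.0, 1.2, 1.4, fpf)   # hypothetical reader data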

  16. Modeling Grade IV Gas Emboli using a Limited Failure Population Model with Random Effects

    NASA Technical Reports Server (NTRS)

    Thompson, Laura A.; Conkin, Johnny; Chhikara, Raj S.; Powell, Michael R.

    2002-01-01

    Venous gas emboli (VGE) (gas bubbles in venous blood) are associated with an increased risk of decompression sickness (DCS) in hypobaric environments. A high grade of VGE can be a precursor to serious DCS. In this paper, we model time to Grade IV VGE considering a subset of individuals assumed to be immune from experiencing VGE. Our data contain monitoring test results from subjects undergoing up to 13 denitrogenation test procedures prior to exposure to a hypobaric environment. The onset time of Grade IV VGE is recorded as contained within certain time intervals. We fit a parametric (lognormal) mixture survival model to the interval- and right-censored data to account for the possibility of a subset of "cured" individuals who are immune to the event. Our model contains random subject effects to account for correlations between repeated measurements on a single individual. Model assessments and cross-validation indicate that this limited failure population mixture model is an improvement over a model that does not account for the potential of a fraction of cured individuals. We also evaluated some alternative mixture models. Predictions from the best fitted mixture model indicate that the actual process is reasonably approximated by a limited failure population model.
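
    A minimal sketch of the limited failure population likelihood described above, assuming a lognormal onset-time distribution and interval- or right-censored observations; the subject-level random effects are omitted for brevity and all names are illustrative:

        import numpy as np
        from scipy.stats import lognorm

        def lfp_loglik(intervals, cured_prob, mu, sigma):
            """Cure-mixture log-likelihood: S_pop(t) = p + (1 - p) * S(t).
            Each row of `intervals` is (L, R): onset in (L, R]; R = np.inf
            encodes a subject right-censored at L (possibly cured)."""
            L, R = intervals[:, 0], intervals[:, 1]
            S = lambda t: lognorm.sf(t, s=sigma, scale=np.exp(mu))
            right = np.isinf(R)
            ll = np.sum(np.log(cured_prob + (1 - cured_prob) * S(L[right])))
            Li, Ri = L[~right], R[~right]                 # event inside (L, R]
            ll += np.sum(np.log((1 - cured_prob) * (S(Li) - S(Ri))))
            return ll

        iv = np.array([[0.0, 20.0], [15.0, np.inf], [5.0, 10.0]])
        print(lfp_loglik(iv, cured_prob=0.3, mu=2.5, sigma=0.8))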

  17. Modeling Grade IV Gas Emboli using a Limited Failure Population Model with Random Effects

    NASA Astrophysics Data System (ADS)

    Thompson, Laura A.; Conkin, Johnny; Chhikara, Raj S.; Powell, Michael R.

    2002-05-01

    Venous gas emboli (VGE) (gas bubbles in venous blood) are associated with an increased risk of decompression sickness (DCS) in hypobaric environments. A high grade of VGE can be a precursor to serious DCS. In this paper, we model time to Grade IV VGE considering a subset of individuals assumed to be immune from experiencing VGE. Our data contain monitoring test results from subjects undergoing up to 13 denitrogenation test procedures prior to exposure to a hypobaric environment. The onset time of Grade IV VGE is recorded as contained within certain time intervals. We fit a parametric (lognormal) mixture survival model to the interval- and right-censored data to account for the possibility of a subset of "cured" individuals who are immune to the event. Our model contains random subject effects to account for correlations between repeated measurements on a single individual. Model assessments and cross-validation indicate that this limited failure population mixture model is an improvement over a model that does not account for the potential of a fraction of cured individuals. We also evaluated some alternative mixture models. Predictions from the best fitted mixture model indicate that the actual process is reasonably approximated by a limited failure population model.

  18. Evaluation of ultrastructure and random effects band recovery models for estimating relationships between survival and harvest rates in exploited populations

    USGS Publications Warehouse

    Otis, D.L.; White, Gary C.

    2004-01-01

    Increased population survival rate after an episode of seasonal exploitation is considered a type of compensatory population response. Lack of an increase is interpreted as evidence that exploitation results in added annual mortality in the population. Despite its importance to management of exploited species, there are limited statistical techniques for comparing relative support for these two alternative models. For exploited bird species, the most common technique is to use a fixed effect, deterministic ultrastructure model incorporated into band recovery models to estimate the relationship between harvest and survival rate. We present a new likelihood-based technique within a framework that assumes that survival and harvest are random effects that covary through time. We conducted a Monte Carlo simulation study under this framework to evaluate the performance of these two techniques. The ultrastructure models performed poorly in all simulated scenarios, due mainly to pathological distributional properties. The random effects estimators and their associated estimators of precision had relatively small negative bias under most scenarios, and profile likelihood intervals achieved nominal coverage. We suggest that the random effects estimation approach has many advantages compared to the ultrastructure models, and that evaluation of robustness and generalization to more complex population structures are topics for additional research. © 2004 Museu de Ciències Naturals.

  19. Analysis of non-ignorable missing and left-censored longitudinal data using a weighted random effects tobit model.

    PubMed

    Sattar, Abdus; Weissfeld, Lisa A; Molenberghs, Geert

    2011-11-30

    In a longitudinal study with response data collected during a hospital stay, observations may be missing because of the subject's discharge from the hospital prior to completion of the study or the death of the subject, resulting in non-ignorable missing data. In addition to non-ignorable missingness, there is left-censoring in the response measurements because of the inherent limit of detection. For analyzing non-ignorable missing and left-censored longitudinal data, we propose extending the random effects tobit regression model to a weighted random effects tobit regression model. The weights are computed using an augmented inverse probability weighting methodology. An extensive simulation study was performed to compare the performance of the proposed model with a number of competitive models. The simulation study shows that the estimates are consistent and that the root mean square errors of the estimates are minimal for the use of augmented inverse probability weights in the random effects tobit model. The proposed method is also applied to the non-ignorable missing and left-censored interleukin-6 biomarker data obtained from the Genetic and Inflammatory Markers of Sepsis study.
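
    A minimal sketch of the weighted left-censored (tobit) log-likelihood at the core of this approach, with the inverse-probability weights treated as given and the random effects omitted; the all-ones default weights and variable names are illustrative:

        import numpy as np
        from scipy.stats import norm

        def tobit_loglik(beta, sigma, X, y, lod, weights=None):
            """Left-censored regression: values at or below the limit of
            detection `lod` contribute P(Y* <= lod); the rest contribute
            the normal density. `weights` would carry the inverse
            probability weights for the non-ignorable missingness."""
            if weights is None:
                weights = np.ones_like(y)
            mu = X @ beta
            ll = np.where(
                y <= lod,
                norm.logcdf((lod - mu) / sigma),
                norm.logpdf(y, loc=mu, scale=sigma),
            )
            return np.sum(weights * ll)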

  20. Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.

    PubMed

    Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J

    2010-12-01

    Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km² hexagons), can increase the relevance of habitat models to multispecies conservation planning and monitoring.
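
    To make the spatial component concrete, a minimal sketch of the (unnormalized) intrinsic conditional autoregressive prior over lattice cells used in such models; the adjacency matrix and precision value below are hypothetical:

        import numpy as np

        def icar_logdensity(phi, W, tau):
            """Unnormalized ICAR log-density: Q = tau * (D - W) is the
            (singular) precision built from the 0/1 adjacency matrix W,
            e.g. over the hexagons of a broad-scale lattice."""
            Q = tau * (np.diag(W.sum(axis=1)) - W)
            return -0.5 * phi @ Q @ phi   # = -tau/2 * sum over neighbors (phi_i - phi_j)^2

        W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)   # 3-cell chain
        print(icar_logdensity(np.array([0.1, 0.0, -0.1]), W, tau=2.0))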

  1. Shortlist B: A Bayesian Model of Continuous Speech Recognition

    ERIC Educational Resources Information Center

    Norris, Dennis; McQueen, James M.

    2008-01-01

    A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward…

  2. Measuring Learning Progressions Using Bayesian Modeling in Complex Assessments

    ERIC Educational Resources Information Center

    Rutstein, Daisy Wise

    2012-01-01

    This research examines issues regarding model estimation and robustness in the use of Bayesian Inference Networks (BINs) for measuring Learning Progressions (LPs). It provides background information on LPs and how they might be used in practice. Two simulation studies are performed, along with real data examples. The first study examines the case…

  5. Bayesian Inference and Diagnostics for the Three Parameter Logistic Model.

    ERIC Educational Resources Information Center

    Leonard, Tom; Novick, Melvin R.

    This proposal attempts to follow in Allan Birnbaum's tradition by using Bayesian ideas to show that his mental test model possesses even broader applicability than previously realized. Birnbaum's two significant contributions to the theories of statistics and educational testing are: (1) the proof that the sufficiency and conditionality principles…

  6. Performance of time-varying predictors in multilevel models under an assumption of fixed or random effects.

    PubMed

    Baird, Rachel; Maxwell, Scott E

    2016-06-01

    Time-varying predictors in multilevel models are a useful tool for longitudinal research, whether they are the research variable of interest or they are controlling for variance to allow greater power for other variables. However, standard recommendations to fix the effect of time-varying predictors may make an assumption that is unlikely to hold in reality and may influence results. A simulation study illustrates that treating the time-varying predictor as fixed may allow analyses to converge, but the analyses have poor coverage of the true fixed effect when the time-varying predictor has a random effect in reality. A second simulation study shows that treating the time-varying predictor as random may have poor convergence, except when allowing negative variance estimates. Although negative variance estimates are uninterpretable, results of the simulation show that estimates of the fixed effect of the time-varying predictor are as accurate for these cases as for cases with positive variance estimates, and that treating the time-varying predictor as random and allowing negative variance estimates performs well whether the time-varying predictor is fixed or random in reality. Because of the difficulty of interpreting negative variance estimates, 2 procedures are suggested for selection between fixed-effect and random-effect models: comparing between fixed-effect and constrained random-effect models with a likelihood ratio test or fitting a fixed-effect model when an unconstrained random-effect model produces negative variance estimates. The performance of these 2 procedures is compared. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  7. Modeling Unreliable Observations in Bayesian Networks by Credal Networks

    NASA Astrophysics Data System (ADS)

    Antonucci, Alessandro; Piatti, Alberto

    Bayesian networks are probabilistic graphical models widely employed in AI for the implementation of knowledge-based systems. Standard inference algorithms can update the beliefs about a variable of interest in the network after the observation of some other variables. This is usually achieved under the assumption that the observations could reveal the actual states of the variables in a fully reliable way. We propose a procedure for a more general modeling of the observations, which allows for updating beliefs in different situations, including various cases of unreliable, incomplete, uncertain and also missing observations. This is achieved by augmenting the original Bayesian network with a number of auxiliary variables corresponding to the observations. For a flexible modeling of the observational process, the quantification of the relations between these auxiliary variables and those of the original Bayesian network is done by credal sets, i.e., convex sets of probability mass functions. Without any loss of generality, we show how this can be done by simply estimating the bounds of likelihoods of the observations for the different values of the observed variables. Overall, the Bayesian network is transformed into a credal network, for which a standard updating problem has to be solved. Finally, a number of transformations that might simplify the updating of the resulting credal network are provided.

  8. Empirical evaluation of scoring functions for Bayesian network model selection.

    PubMed

    Liu, Zhifa; Malone, Brandon; Yuan, Changhe

    2012-01-01

    In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets.
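
    A minimal sketch of how one such scoring function, BIC/MDL, is computed for a single node of a discrete Bayesian network given a candidate parent set; the data layout and variable arities are illustrative assumptions:

        import numpy as np

        def bic_node_score(data, child, parents, arities):
            """BIC/MDL score of `child` given `parents` from multinomial counts.
            `data` is an (N, n_vars) integer array; `arities` lists the number
            of states of each variable."""
            N = data.shape[0]
            r = arities[child]
            if parents:
                strides = np.cumprod([1] + [arities[p] for p in parents[:-1]])
                pa_idx = data[:, parents] @ np.asarray(strides)
                q = int(np.prod([arities[p] for p in parents]))
            else:
                pa_idx, q = np.zeros(N, dtype=int), 1
            ll = 0.0
            for j in range(q):                       # each parent configuration
                rows = data[pa_idx == j, child]
                if rows.size:
                    counts = np.bincount(rows, minlength=r)
                    nz = counts[counts > 0]
                    ll += np.sum(nz * np.log(nz / rows.size))
            return ll - 0.5 * np.log(N) * q * (r - 1)   # penalize free parameters

        rng = np.random.default_rng(0)
        data = rng.integers(0, 2, size=(500, 3))        # three binary variables
        print(bic_node_score(data, child=2, parents=[0, 1], arities=[2, 2, 2]))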

  9. Empirical evaluation of scoring functions for Bayesian network model selection

    PubMed Central

    2012-01-01

    In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets.

  10. A novel random effect model for GWAS meta-analysis and its application to trans-ethnic meta-analysis

    PubMed Central

    Shi, Jingchunzi; Lee, Seunggeun

    2016-01-01

    Meta-analysis of trans-ethnic genome-wide association studies (GWAS) has proven to be a practical and profitable approach for identifying loci that contribute to the risk of complex diseases. However, the expected genetic effect heterogeneity cannot easily be accommodated through existing fixed-effects and random-effects methods. In response, we propose a novel random effect model for trans-ethnic meta-analysis with flexible modeling of the expected genetic effect heterogeneity across diverse populations. Specifically, we adopt a modified random effect model from the kernel regression framework, in which genetic effect coefficients are random variables whose correlation structure reflects the genetic distances across ancestry groups. In addition, we use the adaptive variance component test to achieve robust power regardless of the degree of genetic effect heterogeneity. Simulation studies show that our proposed method has well-calibrated type I error rates at very stringent significance levels and can improve power over the traditional meta-analysis methods. We re-analyzed the published type 2 diabetes GWAS meta-analysis (Consortium et al., 2014) and successfully identified one additional SNP that clearly exhibits genetic effect heterogeneity across different ancestry groups. Furthermore, our proposed method provides scalable computing time for genome-wide datasets, in which an analysis of one million SNPs would require less than 3 hours. PMID:26916671

  11. A novel random effect model for GWAS meta-analysis and its application to trans-ethnic meta-analysis.

    PubMed

    Shi, Jingchunzi; Lee, Seunggeun

    2016-09-01

    Meta-analysis of trans-ethnic genome-wide association studies (GWAS) has proven to be a practical and profitable approach for identifying loci that contribute to the risk of complex diseases. However, the expected genetic effect heterogeneity cannot easily be accommodated through existing fixed-effects and random-effects methods. In response, we propose a novel random effect model for trans-ethnic meta-analysis with flexible modeling of the expected genetic effect heterogeneity across diverse populations. Specifically, we adopt a modified random effect model from the kernel regression framework, in which genetic effect coefficients are random variables whose correlation structure reflects the genetic distances across ancestry groups. In addition, we use the adaptive variance component test to achieve robust power regardless of the degree of genetic effect heterogeneity. Simulation studies show that our proposed method has well-calibrated type I error rates at very stringent significance levels and can improve power over the traditional meta-analysis methods. We reanalyzed the published type 2 diabetes GWAS meta-analysis (Consortium et al., 2014) and successfully identified one additional SNP that clearly exhibits genetic effect heterogeneity across different ancestry groups. Furthermore, our proposed method provides scalable computing time for genome-wide datasets, in which an analysis of one million SNPs would require less than 3 hours. © 2016, The International Biometric Society.

  12. Probabilistic (Bayesian) Modeling of Gene Expression in Transplant Glomerulopathy

    PubMed Central

    Elster, Eric A.; Hawksworth, Jason S.; Cheng, Orlena; Leeser, David B.; Ring, Michael; Tadaki, Douglas K.; Kleiner, David E.; Eberhardt, John S.; Brown, Trevor S.; Mannon, Roslyn B.

    2010-01-01

    Transplant glomerulopathy (TG) is associated with rapid decline in glomerular filtration rate and poor outcome. We used low-density arrays with a novel probabilistic analysis to characterize relationships between gene transcripts and the development of TG in allograft recipients. Retrospective review identified TG in 10.8% of 963 core biopsies from 166 patients; patients with stable function were studied for comparison. The biopsies were analyzed for expression of 87 genes related to immune function and fibrosis by using real-time PCR, and a Bayesian model was generated and validated to predict histopathology based on gene expression. A total of 57 individual genes were increased in TG compared with stable function biopsies (P < 0.05). The Bayesian analysis identified critical relationships between ICAM-1, IL-10, CCL3, CD86, VCAM-1, MMP-9, MMP-7, and LAMC2 and allograft pathology. Moreover, Bayesian models predicted TG when derived from either immune function (area under the curve [95% confidence interval] of 0.875 [0.675 to 0.999], P = 0.004) or fibrosis (area under the curve [95% confidence interval] of 0.859 [0.754 to 0.963], P < 0.001) gene networks. Critical pathways in the Bayesian models were also analyzed by using the Fisher exact test and had P values <0.005. This study demonstrates that evaluating quantitative gene expression profiles with Bayesian modeling can identify significant transcriptional associations that have the potential to support the diagnostic capability of allograft histology. This integrated approach has broad implications in the field of transplant diagnostics. PMID:20688906

  13. Bayesian model evidence for order selection and correlation testing.

    PubMed

    Johnston, Leigh A; Mareels, Iven M Y; Egan, Gary F

    2011-01-01

    Model selection is a critical component of data analysis procedures, and is particularly difficult for small numbers of observations such as is typical of functional MRI datasets. In this paper we derive two Bayesian evidence-based model selection procedures that exploit the existence of an analytic form for the linear Gaussian model class. Firstly, an evidence information criterion is proposed as a model order selection procedure for auto-regressive models, outperforming the commonly employed Akaike and Bayesian information criteria in simulated data. Secondly, an evidence-based method for testing change in linear correlation between datasets is proposed, which is demonstrated to outperform both the traditional statistical test of the null hypothesis of no correlation change and the likelihood ratio test.

  14. APPLICATION OF BAYESIAN MONTE CARLO ANALYSIS TO A LAGRANGIAN PHOTOCHEMICAL AIR QUALITY MODEL. (R824792)

    EPA Science Inventory

    Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
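
    A minimal sketch of the Bayesian Monte Carlo idea, assuming a generic forward model and independent Gaussian observation errors: draws from the subjective prior are reweighted by their likelihood against the observed concentrations; every name here is a placeholder:

        import numpy as np

        def bayesian_monte_carlo(prior_sampler, model, observed, sigma_obs,
                                 n=10000, seed=0):
            """Weight prior parameter draws by the likelihood of the data."""
            rng = np.random.default_rng(seed)
            thetas = prior_sampler(rng, n)                    # (n, n_params)
            preds = np.array([model(th) for th in thetas])    # (n, n_obs)
            logw = -0.5 * np.sum((preds - observed)**2, axis=1) / sigma_obs**2
            w = np.exp(logw - logw.max())
            w /= w.sum()
            return thetas, w, w @ thetas      # draws, weights, posterior mean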

  15. Bayesian analysis of structural equation models with dichotomous variables.

    PubMed

    Lee, Sik-Yum; Song, Xin-Yuan

    2003-10-15

    Structural equation modelling has been used extensively in the behavioural and social sciences for studying interrelationships among manifest and latent variables. Recently, its uses have been well recognized in medical research. This paper introduces a Bayesian approach to analysing general structural equation models with dichotomous variables. In the posterior analysis, the observed dichotomous data are augmented with the hypothetical missing values, which involve the latent variables in the model and the unobserved continuous measurements underlying the dichotomous data. An algorithm based on the Gibbs sampler is developed for drawing the parameter values and the hypothetical missing values from the joint posterior distributions. Useful statistics, such as the Bayesian estimates and their standard error estimates, and the highest posterior density intervals, can be obtained from the simulated observations. A posterior predictive p-value is used to test the goodness-of-fit of the posited model. The methodology is applied to a study of hypertensive patient non-adherence to medication.
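
    A minimal sketch of the data-augmentation Gibbs step at the heart of such samplers, shown for plain probit regression (in the style of Albert and Chib) rather than the full structural equation model: latent continuous values are drawn given the dichotomous data, then the coefficients given the latent values; a flat prior on the coefficients is assumed:

        import numpy as np
        from scipy.stats import truncnorm

        def probit_gibbs(X, y, n_iter=2000, seed=0):
            """Gibbs sampler for probit regression via latent-variable augmentation."""
            rng = np.random.default_rng(seed)
            n, p = X.shape
            beta = np.zeros(p)
            V = np.linalg.inv(X.T @ X)     # conditional covariance (latent variance 1)
            draws = np.empty((n_iter, p))
            for it in range(n_iter):
                mu = X @ beta
                lo = np.where(y == 1, -mu, -np.inf)   # z > 0 when y = 1
                hi = np.where(y == 1, np.inf, -mu)    # z <= 0 when y = 0
                z = mu + truncnorm.rvs(lo, hi, size=n, random_state=rng)
                beta = rng.multivariate_normal(V @ (X.T @ z), V)
                draws[it] = beta
            return draws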

  16. Selecting Bayesian priors for stochastic rates using extended functional models

    NASA Astrophysics Data System (ADS)

    Gibson, Gavin J.

    2003-04-01

    We propose an extension to the functional modelling methods described by Dawid and Stone (1982 Ann. Stat. 10 1119-38) that leads naturally to a method for selecting vague parameter priors for Bayesian analyses involving stochastic population models. Motivated by applications from quantum optics and epidemiology, we focus on analysing observed sequences of event times obeying a non-homogeneous Poisson process, although the techniques are more widely applicable. The extended functional modelling approach is illustrated for the particular case of Bayesian estimation of the death rate in the immigration-death model from observation of the death times only. It is shown that the prior selected naturally leads to a well defined posterior density for parameters and avoids some undesirable pathologies reported by Gibson and Renshaw (2001a Inverse Problems 17 455-66, 2001b Stat. Comput. 11 347-58) for the case of exponential priors. Some limitations of the approach are also discussed.

  17. Spatial Bayesian hierarchical modelling of extreme sea states

    NASA Astrophysics Data System (ADS)

    Clancy, Colm; O'Sullivan, John; Sweeney, Conor; Dias, Frédéric; Parnell, Andrew C.

    2016-11-01

    A Bayesian hierarchical framework is used to model extreme sea states, incorporating a latent spatial process to more effectively capture the spatial variation of the extremes. The model is applied to a 34-year hindcast of significant wave height off the west coast of Ireland. The generalised Pareto distribution is fitted to declustered peaks over a threshold given by the 99.8th percentile of the data. Return levels of significant wave height are computed and compared against those from a model based on the commonly-used maximum likelihood inference method. The Bayesian spatial model produces smoother maps of return levels. Furthermore, this approach greatly reduces the uncertainty in the estimates, thus providing information on extremes which is more useful for practical applications.
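
    A minimal non-spatial sketch of the peaks-over-threshold calculation behind such return levels, assuming declustered exceedances and a Poisson rate of peak occurrences; the synthetic data are purely illustrative:

        import numpy as np
        from scipy.stats import genpareto

        def gpd_return_level(peaks, u, n_years, m_years):
            """Fit a GPD to exceedances of threshold u and return the m-year
            level: z_m = u + (sigma/xi) * ((m * lambda_u)^xi - 1), with
            lambda_u the mean number of exceedances per year (xi != 0)."""
            xi, _, sigma = genpareto.fit(peaks - u, floc=0.0)
            lam = len(peaks) / n_years
            return u + (sigma / xi) * ((m_years * lam) ** xi - 1.0)

        rng = np.random.default_rng(0)
        peaks = 8.0 + 2.0 * rng.pareto(5.0, 120)   # hypothetical wave heights above 8 m
        print(gpd_return_level(peaks, u=8.0, n_years=34, m_years=100))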

  18. Exemplar models as a mechanism for performing Bayesian inference.

    PubMed

    Shi, Lei; Griffiths, Thomas L; Feldman, Naomi H; Sanborn, Adam N

    2010-08-01

    Probabilistic models have recently received much attention as accounts of human cognition. However, most research in which probabilistic models have been used has been focused on formulating the abstract problems behind cognitive tasks and their optimal solutions, rather than on mechanisms that could implement these solutions. Exemplar models are a successful class of psychological process models in which an inventory of stored examples is used to solve problems such as identification, categorization, and function learning. We show that exemplar models can be used to perform a sophisticated form of Monte Carlo approximation known as importance sampling and thus provide a way to perform approximate Bayesian inference. Simulations of Bayesian inference in speech perception, generalization along a single dimension, making predictions about everyday events, concept learning, and reconstruction from memory show that exemplar models can often account for human performance with only a few exemplars, for both simple and relatively complex prior distributions. These results suggest that exemplar models provide a possible mechanism for implementing at least some forms of Bayesian inference.
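
    A minimal sketch of exemplar-based importance sampling as described above, with stored exemplars playing the role of prior draws and a Gaussian perceptual likelihood; the numbers are hypothetical:

        import numpy as np

        def exemplar_posterior_mean(exemplars, likelihood, x_obs):
            """Weight each stored exemplar by the likelihood of the current
            observation; the weighted average approximates the Bayesian
            posterior mean (importance sampling with the prior as proposal)."""
            w = np.array([likelihood(x_obs, e) for e in exemplars])
            w = w / w.sum()
            return w @ np.asarray(exemplars, dtype=float)

        rng = np.random.default_rng(1)
        exemplars = rng.normal(10.0, 2.0, size=50)        # memory traces ~ prior
        lik = lambda x, e: np.exp(-0.5 * (x - e) ** 2)    # perceptual noise sd = 1
        print(exemplar_posterior_mean(exemplars, lik, x_obs=13.0))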

  19. Bayesian calibration of groundwater models with input data uncertainty

    NASA Astrophysics Data System (ADS)

    Xu, Tianfang; Valocchi, Albert J.; Ye, Ming; Liang, Feng; Lin, Yu-Feng

    2017-04-01

    Effective water resources management typically relies on numerical models to analyze groundwater flow and solute transport processes. Groundwater models are often subject to input data uncertainty, as some inputs (such as recharge and well pumping rates) are estimated and subject to uncertainty. Current practices of groundwater model calibration often overlook uncertainties in input data; this can lead to biased parameter estimates and compromised predictions. Through a synthetic case study of surface-ground water interaction under changing pumping conditions and land use, we investigate the impacts of uncertain pumping and recharge rates on model calibration and uncertainty analysis. We then present a Bayesian framework of model calibration to handle uncertain input of groundwater models. The framework implements a marginalizing step to account for input data uncertainty when evaluating likelihood. It was found that not accounting for input uncertainty may lead to biased, overconfident parameter estimates because parameters could be over-adjusted to compensate for possible input data errors. Parameter compensation can have deleterious impacts when the calibrated model is used to make forecast under a scenario that is different from calibration conditions. By marginalizing input data uncertainty, the Bayesian calibration approach effectively alleviates parameter compensation and gives more accurate predictions in the synthetic case study. The marginalizing Bayesian method also decomposes prediction uncertainty into uncertainties contributed by parameters, input data, and measurements. The results underscore the need to account for input uncertainty to better inform postmodeling decision making.
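
    A minimal sketch of the marginalizing step described above, assuming a generic forward model with independent Gaussian measurement errors: the likelihood of the parameters is averaged over Monte Carlo draws of the uncertain inputs (e.g. recharge and pumping rates); every name is a placeholder:

        import numpy as np

        def marginal_loglik(theta, model, y_obs, sigma_obs, input_sampler,
                            n_mc=100, seed=0):
            """p(y | theta) ~= (1/M) * sum_m p(y | theta, u_m), u_m ~ p(u)."""
            rng = np.random.default_rng(seed)
            logls = np.array([
                -0.5 * np.sum((y_obs - model(theta, input_sampler(rng)))**2) / sigma_obs**2
                for _ in range(n_mc)
            ])
            m = logls.max()                       # log-sum-exp for stability
            return m + np.log(np.mean(np.exp(logls - m)))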

  20. Hierarchical Bayesian modeling of heterogeneous cluster- and subject-level associations between continuous and binary outcomes in dairy production.

    PubMed

    Bello, Nora M; Steibel, Juan P; Tempelman, Robert J

    2012-03-01

    The augmentation of categorical outcomes with underlying Gaussian variables in bivariate generalized mixed effects models has facilitated the joint modeling of continuous and binary response variables. These models typically assume that random effects and residual effects (co)variances are homogeneous across all clusters and subjects, respectively. Motivated by conflicting evidence about the association between performance outcomes in dairy production systems, we consider the situation where these (co)variance parameters may themselves be functions of systematic and/or random effects. We present a hierarchical Bayesian extension of bivariate generalized linear models whereby functions of the (co)variance matrices are specified as linear combinations of fixed and random effects following a square-root-free Cholesky reparameterization that ensures necessary positive semidefinite constraints. We test the proposed model by simulation and apply it to the analysis of a dairy cattle data set in which the random herd-level and residual cow-level effects (co)variances between a continuous production trait and binary reproduction trait are modeled as functions of fixed management effects and random cluster effects. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Bayesian methods for assessing system reliability: models and computation.

    SciTech Connect

    Graves, T. L.; Hamada, Michael,

    2004-01-01

    There are many challenges with assessing the reliability of a system today. These challenges arise because a system may be aging and full system tests may be too expensive or can no longer be performed. Without full system testing, one must integrate (1) all science and engineering knowledge, models and simulations, (2) information and data at various levels of the system, e.g., subsystems and components and (3) information and data from similar systems, subsystems and components. The analyst must work with various data types and how the data are collected, account for measurement bias and uncertainty, deal with model and simulation uncertainty and incorporate expert knowledge. Bayesian hierarchical modeling provides a rigorous way to combine information from multiple sources and different types of information. However, an obstacle to applying Bayesian methods is the need to develop new software to analyze novel statistical models. We discuss a new statistical modeling environment, YADAS, that facilitates the development of Bayesian statistical analyses. It includes classes that help analysts specify new models, as well as classes that support the creation of new analysis algorithms. We illustrate these concepts using several examples.

  2. A Bayesian semiparametric factor analysis model for subtype identification.

    PubMed

    Sun, Jiehuan; Warren, Joshua L; Zhao, Hongyu

    2017-04-25

    Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and many uninformative genes are included for clustering due to the high dimensionality. In this article, we introduce a novel subtype identification method in the Bayesian setting based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow the Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are clinically more relevant than those identified from the existing methods.

  3. Bayesian and maximum likelihood estimation of hierarchical response time models

    PubMed Central

    Farrell, Simon; Ludwig, Casimir

    2008-01-01

    Hierarchical (or multilevel) statistical models have become increasingly popular in psychology in the last few years. We consider the application of multilevel modeling to the ex-Gaussian, a popular model of response times. Single-level estimation is compared with hierarchical estimation of parameters of the ex-Gaussian distribution. Additionally, for each approach maximum likelihood (ML) estimation is compared with Bayesian estimation. A set of simulations and analyses of parameter recovery show that although all methods perform adequately well, hierarchical methods are better able to recover the parameters of the ex-Gaussian by reducing the variability in recovered parameters. At each level, little overall difference was observed between the ML and Bayesian methods. PMID:19001592
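
    A minimal single-level sketch of a maximum likelihood ex-Gaussian fit with scipy, which parameterizes the distribution as exponnorm with shape K = tau/sigma; the hierarchical versions studied in the paper would shrink many such per-subject fits toward group-level distributions:

        import numpy as np
        from scipy.stats import exponnorm

        # hypothetical single-subject response times (seconds)
        rng = np.random.default_rng(0)
        mu, sigma, tau = 0.4, 0.05, 0.15
        rts = rng.normal(mu, sigma, 500) + rng.exponential(tau, 500)

        # recover (mu, sigma, tau) from scipy's (K, loc, scale) parameterization
        K, loc, scale = exponnorm.fit(rts)
        print(dict(mu=loc, sigma=scale, tau=K * scale))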

  4. Bayesian Estimation of Categorical Dynamic Factor Models

    ERIC Educational Resources Information Center

    Zhang, Zhiyong; Nesselroade, John R.

    2007-01-01

    Dynamic factor models have been used to analyze continuous time series behavioral data. We extend 2 main dynamic factor model variations--the direct autoregressive factor score (DAFS) model and the white noise factor score (WNFS) model--to categorical DAFS and WNFS models in the framework of the underlying variable method and illustrate them with…

  6. Bayesian model updating using incomplete modal data without mode matching

    NASA Astrophysics Data System (ADS)

    Sun, Hao; Büyüköztürk, Oral

    2016-04-01

    This study investigates a new probabilistic strategy for model updating using incomplete modal data. A hierarchical Bayesian inference is employed to model the updating problem. A Markov chain Monte Carlo technique with adaptive random-walk steps is used to draw parameter samples for uncertainty quantification. Mode matching between measured and predicted modal quantities is not required through model reduction. We employ an iterated improved reduced system technique for model reduction. The reduced model retains the dynamic features as close as possible to those of the model before reduction. The proposed algorithm is finally validated by an experimental example.

  7. Application of a predictive Bayesian model to environmental accounting.

    PubMed

    Anex, R P; Englehardt, J D

    2001-03-30

    Environmental accounting techniques are intended to capture important environmental costs and benefits that are often overlooked in standard accounting practices. Environmental accounting methods themselves often ignore or inadequately represent large but highly uncertain environmental costs and costs conditioned by specific prior events. Use of a predictive Bayesian model is demonstrated for the assessment of such highly uncertain environmental and contingent costs. The predictive Bayesian approach presented generates probability distributions for the quantity of interest (rather than parameters thereof). A spreadsheet implementation of a previously proposed predictive Bayesian model, extended to represent contingent costs, is described and used to evaluate whether a firm should undertake an accelerated phase-out of its PCB containing transformers. Variability and uncertainty (due to lack of information) in transformer accident frequency and severity are assessed simultaneously using a combination of historical accident data, engineering model-based cost estimates, and subjective judgement. Model results are compared using several different risk measures. Use of the model for incorporation of environmental risk management into a company's overall risk management strategy is discussed.

  8. Application of the Bayesian dynamic survival model in medicine.

    PubMed

    He, Jianghua; McGee, Daniel L; Niu, Xufeng

    2010-02-10

    The Bayesian dynamic survival model (BDSM), a time-varying coefficient survival model from the Bayesian perspective, was proposed in the early 1990s but has not been widely used or discussed. In this paper, we describe the model structure of the BDSM and introduce two estimation approaches for BDSMs: the Markov Chain Monte Carlo (MCMC) approach and the linear Bayesian (LB) method. The MCMC approach estimates model parameters through sampling and is computationally intensive. With the newly developed geoadditive survival models and software BayesX, the BDSM is available for general applications. The LB approach is easier in terms of computations but it requires the prespecification of some unknown smoothing parameters. In a simulation study, we use the LB approach to show the effects of smoothing parameters on the performance of the BDSM and propose an ad hoc method for identifying appropriate values for those parameters. We also demonstrate the performance of the MCMC approach compared with the LB approach and a penalized partial likelihood method available in software R packages. A gastric cancer trial is utilized to illustrate the application of the BDSM.

  9. Assessment of substitution model adequacy using frequentist and Bayesian methods.

    PubMed

    Ripplinger, Jennifer; Sullivan, Jack

    2010-12-01

    In order to have confidence in model-based phylogenetic methods, such as maximum likelihood (ML) and Bayesian analyses, one must use an appropriate model of molecular evolution identified using statistically rigorous criteria. Although model selection methods such as the likelihood ratio test and Akaike information criterion are widely used in the phylogenetic literature, model selection methods lack the ability to reject all models if they provide an inadequate fit to the data. There are two methods, however, that assess absolute model adequacy, the frequentist Goldman-Cox (GC) test and Bayesian posterior predictive simulations (PPSs), which are commonly used in conjunction with the multinomial log likelihood test statistic. In this study, we use empirical and simulated data to evaluate the adequacy of common substitution models using both frequentist and Bayesian methods and compare the results with those obtained with model selection methods. In addition, we investigate the relationship between model adequacy and performance in ML and Bayesian analyses in terms of topology, branch lengths, and bipartition support. We show that tests of model adequacy based on the multinomial likelihood often fail to reject simple substitution models, especially when the models incorporate among-site rate variation (ASRV), and normally fail to reject less complex models than those chosen by model selection methods. In addition, we find that PPSs often fail to reject simpler models than the GC test. Use of the simplest substitution models not rejected based on fit normally results in similar but divergent estimates of tree topology and branch lengths. In addition, use of the simplest adequate substitution models can affect estimates of bipartition support, although these differences are often small with the largest differences confined to poorly supported nodes. We also find that alternative assumptions about ASRV can affect tree topology, tree length, and bipartition support.

  10. Bayesian inference in camera trapping studies for a class of spatial capture-recapture models

    USGS Publications Warehouse

    Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba

    2009-01-01

    We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
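
    A minimal sketch of the distance-based encounter model at the core of such analyses, assuming a half-normal detection function; the trap coordinates and parameter values are hypothetical:

        import numpy as np

        def scr_capture_prob(s, traps, p0=0.3, sigma=1.5):
            """Per-trap capture probability for an individual with home-range
            center s: p_j = p0 * exp(-d_j^2 / (2 * sigma^2))."""
            d2 = np.sum((traps - s) ** 2, axis=1)
            return p0 * np.exp(-d2 / (2.0 * sigma**2))

        traps = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])  # camera locations (km)
        print(scr_capture_prob(np.array([0.5, 0.5]), traps))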

  11. Bayesian analysis of nonlinear mixed-effects mixture models for longitudinal data with heterogeneity and skewness.

    PubMed

    Lu, Xiaosun; Huang, Yangxin

    2014-07-20

    It is a common practice to analyze complex longitudinal data using nonlinear mixed-effects (NLME) models with a normality assumption. The NLME models with normal distributions provide the most popular framework for modeling continuous longitudinal outcomes, assuming individuals are from a homogeneous population and relying on random effects to accommodate inter-individual variation. However, the following two issues may stand out: (i) the normality assumption for model errors may cause a lack of robustness and subsequently lead to invalid inference and unreasonable estimates, particularly if the data exhibit skewness, and (ii) a homogeneous population assumption may unrealistically obscure important features of between-subject and within-subject variations, which may result in unreliable modeling results. There have been relatively few studies concerning longitudinal data with both heterogeneity and skewness features. In the last two decades, skew distributions have proven beneficial in dealing with asymmetric data in various applications. In this article, our objective is to address the simultaneous impact of both features arising from longitudinal data by developing a flexible finite mixture of NLME models with skew distributions under a Bayesian framework that allows estimates of both model parameters and class membership probabilities for longitudinal data. Simulation studies are conducted to assess the performance of the proposed models and methods, and a real example from an AIDS clinical trial illustrates the methodology by modeling the viral dynamics to compare potential models with different distribution specifications; the analysis results are reported.

  12. Bayesian Inference of High-Dimensional Dynamical Ocean Models

    NASA Astrophysics Data System (ADS)

    Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.

    2015-12-01

    This presentation addresses a holistic set of challenges in high-dimensional ocean Bayesian nonlinear estimation: (i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); (ii) assimilate data using Bayes' law with these pdfs; (iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.

  13. A localization model to localize multiple sources using Bayesian inference

    NASA Astrophysics Data System (ADS)

    Dunham, Joshua Rolv

    Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and a maximum-likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in the azimuthal direction of the source(s).

  14. A Bayesian hierarchical model for climate change detection and attribution

    NASA Astrophysics Data System (ADS)

    Katzfuss, Matthias; Hammerling, Dorit; Smith, Richard L.

    2017-06-01

    Regression-based detection and attribution methods continue to take a central role in the study of climate change and its causes. Here we propose a novel Bayesian hierarchical approach to this problem, which allows us to address several open methodological questions. Specifically, we take into account the uncertainties in the true temperature change due to imperfect measurements, the uncertainty in the true climate signal under different forcing scenarios due to the availability of only a small number of climate model simulations, and the uncertainty associated with estimating the climate variability covariance matrix, including the truncation of the number of empirical orthogonal functions (EOFs) in this covariance matrix. We apply Bayesian model averaging to assign optimal probabilistic weights to different possible truncations and incorporate all uncertainties into the inference on the regression coefficients. We provide an efficient implementation of our method in a software package and illustrate its use with a realistic application.

  15. Bayesian restoration of ion channel records using hidden Markov models.

    PubMed

    Rosales, R; Stark, J A; Fitzgerald, W J; Hladky, S B

    2001-03-01

    Hidden Markov models have been used to restore recorded signals of single ion channels buried in background noise. Parameter estimation and signal restoration are usually carried out through likelihood maximization by using variants of the Baum-Welch forward-backward procedures. This paper presents an alternative approach for dealing with this inferential task. The inferences are made by using a combination of the framework provided by Bayesian statistics and numerical methods based on Markov chain Monte Carlo stochastic simulation. The reliability of this approach is tested by using synthetic signals of known characteristics. The expectations of the model parameters estimated here are close to those calculated using the Baum-Welch algorithm, but the present methods also yield estimates of their errors. Comparisons of the results of the Bayesian Markov Chain Monte Carlo approach with those obtained by filtering and thresholding demonstrate clearly the superiority of the new methods.
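
    A minimal sketch of the hidden Markov likelihood that both the Baum-Welch and the MCMC approaches build on, computed with a numerically stabilized forward recursion for a hypothetical two-state (closed/open) channel; the Gaussian normalizing constant is dropped from the emissions, which only shifts the log-likelihood by a constant:

        import numpy as np

        def forward_loglik(y, log_A, log_pi, emission_logpdf):
            """Log-likelihood of the record under an HMM via the forward pass."""
            alpha = log_pi + emission_logpdf(y[0])          # shape (n_states,)
            for t in range(1, len(y)):
                m = alpha.max()                             # log-sum-exp over prior state
                alpha = np.log(np.exp(alpha - m) @ np.exp(log_A)) + m
                alpha += emission_logpdf(y[t])
            m = alpha.max()
            return m + np.log(np.sum(np.exp(alpha - m)))

        log_A = np.log(np.array([[0.95, 0.05], [0.10, 0.90]]))   # closed/open transitions
        log_pi = np.log(np.array([0.5, 0.5]))
        emis = lambda yt: -0.5 * ((yt - np.array([0.0, 1.0])) / 0.2) ** 2
        print(forward_loglik(np.array([0.1, 0.9, 1.1, 0.0]), log_A, log_pi, emis))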

  16. Moment Reconstruction and Moment-Adjusted Imputation When Exposure is Generated by a Complex, Nonlinear Random Effects Modeling Process

    PubMed Central

    Potgieter, Cornelis J.; Wei, Rubin; Kipnis, Victor; Freedman, Laurence S.; Carroll, Raymond J.

    2016-01-01

    For the classical, homoscedastic measurement error model, moment reconstruction (Freedman et al., 2004, 2008) and moment-adjusted imputation (Thomas et al., 2011) are appealing, computationally simple imputation-like methods for general model fitting. Like classical regression calibration, the idea is to replace the unobserved variable subject to measurement error with a proxy that can be used in a variety of analyses. Moment reconstruction and moment-adjusted imputation differ from regression calibration in that they attempt to match multiple features of the latent variable, and also to match some of the latent variable's relationships with the response and additional covariates. In this note, we consider a problem where true exposure is generated by a complex, nonlinear random effects modeling process, and develop analogues of moment reconstruction and moment-adjusted imputation for this case. This general model includes classical measurement errors, Berkson measurement errors, mixtures of Berkson and classical errors and problems that are not measurement error problems, but also cases where the data generating process for true exposure is a complex, nonlinear random effects modeling process. The methods are illustrated using the National Institutes of Health-AARP Diet and Health Study where the latent variable is a dietary pattern score called the Healthy Eating Index - 2005. We also show how our general model includes methods used in radiation epidemiology as a special case. Simulations are used to illustrate the methods. PMID:27061196

  17. Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies.

    PubMed

    Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S; Zhang, Zhiwu

    2016-02-01

    False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts: a Fixed Effect Model (FEM) and a Random Effect Model (REM) and use them iteratively. FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid the model over-fitting problem in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of the testing markers and the associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include efficient computing time that is linear in both the number of individuals and the number of markers. Now, a dataset with half a million individuals and half a million markers can be analyzed within three days.

  18. Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies

    PubMed Central

    Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu

    2016-01-01

    False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts, a Fixed Effect Model (FEM) and a Random Effect Model (REM), and used them iteratively. FEM contains the testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid the over-fitting problem in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of the testing markers and the associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include computing time that is linear in both the number of individuals and the number of markers. A dataset with half a million individuals and half a million markers can now be analyzed within three days. PMID:26828793

  19. Cross-validation to select Bayesian hierarchical models in phylogenetics.

    PubMed

    Duchêne, Sebastián; Duchêne, David A; Di Giallonardo, Francesca; Eden, John-Sebastian; Geoghegan, Jemma L; Holt, Kathryn E; Ho, Simon Y W; Holmes, Edward C

    2016-05-26

    Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation, which involves comparing models according to their predictive performance. We analysed simulated data and a range of viral and bacterial data sets, using cross-validation to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation matched that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Cross-validation is a useful method for Bayesian phylogenetic model selection and can be readily implemented even when considering complex models for which selecting an appropriate prior on all parameters may be difficult.
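
    In generic form, the cross-validation score for a model M is a sum of out-of-sample predictive log densities over held-out data blocks (a sketch of the principle; implementations differ in how sites or taxa are partitioned):

```latex
\mathrm{CV}(M) = \sum_{k=1}^{K} \log p\left(D_k \mid D_{-k}, M\right),
```

    where D_k is the k-th held-out block and D_{-k} the remaining data; the model with the highest score is preferred.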

  20. Slice sampling technique in Bayesian extreme of gold price modelling

    NASA Astrophysics Data System (ADS)

    Rostami, Mohammad; Adam, Mohd Bakri; Ibrahim, Noor Akma; Yahya, Mohamed Hisham

    2013-09-01

    In this paper, a simulation study of Bayesian extreme value modelling using Markov chain Monte Carlo via the slice sampling algorithm is implemented. We compared the accuracy of slice sampling with that of other methods for a Gumbel model. The study revealed that the slice sampling algorithm offers more accurate estimates, with lower RMSE, than the other methods. Finally, we successfully employed this procedure to estimate the parameters of Malaysian extreme gold prices from 2000 to 2011.
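
    The core of the algorithm is short enough to sketch directly. The following is a minimal stepping-out slice sampler for a one-dimensional target, here the posterior of the Gumbel location parameter with known scale and a flat prior (an illustrative setup, not the paper's full model):

```python
# Neal's stepping-out slice sampler on a Gumbel location posterior: a sketch.
import numpy as np

rng = np.random.default_rng(2)
beta = 1.0                                    # known Gumbel scale (assumption)
data = rng.gumbel(3.0, beta, size=200)        # synthetic extreme-value data

def log_post(mu):
    z = (data - mu) / beta
    return np.sum(-z - np.exp(-z))            # Gumbel log-likelihood, flat prior

def slice_sample(x0, n_iter=5000, w=1.0):
    x, xs = x0, []
    for _ in range(n_iter):
        logy = log_post(x) + np.log(rng.uniform())   # auxiliary slice level
        left = x - w * rng.uniform()                 # random bracket containing x
        right = left + w
        while log_post(left) > logy:                 # step out until outside slice
            left -= w
        while log_post(right) > logy:
            right += w
        while True:                                  # shrink until acceptance
            x1 = rng.uniform(left, right)
            if log_post(x1) > logy:
                x = x1
                break
            if x1 < x:
                left = x1
            else:
                right = x1
        xs.append(x)
    return np.array(xs)

draws = slice_sample(0.0)[1000:]              # discard burn-in
print(draws.mean(), draws.std())              # posterior mean and sd of location
```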

  1. A Tutorial Introduction to Bayesian Models of Cognitive Development

    DTIC Science & Technology

    2011-01-01

    optimal, subject as it is to emotions, heuristics, and biases of many different sorts (e.g., Tversky & Kahneman, 1974). However, even if humans are non...and how that changes over the lifespan. Bayesian models have also had little to say about emotional regulation or psychopathology. This is not to...Werker, J., & Amano, S. (2007). Unsupervised learning of vowel categories from infant-directed speech. Proceedings of the National Academy of Sciences

  2. Theory-Based Bayesian Models of Inductive Inference

    DTIC Science & Technology

    2010-06-30

    Oxford University Press. 28. Griffiths, T. L. and Tenenbaum, J. B. (2007). Two proposals for causal grammar. In A. Gopnik and L. Schulz (eds.), Causal Learning. Oxford University Press. 29. Tenenbaum, J. B., Kemp, C., Shafto, P. (2007). Theory-based Bayesian models for inductive reasoning. In A. Feeney and E. Heit (eds.), Induction. Cambridge University Press. 30. Goodman, N. D., Tenenbaum, J. B., Griffiths, T. L., & Feldman, J. (2008). Compositionality in rational analysis: Grammar-based induction for concept

  3. How to Address Measurement Noise in Bayesian Model Averaging

    NASA Astrophysics Data System (ADS)

    Schöniger, A.; Wöhling, T.; Nowak, W.

    2014-12-01

    When confronted with the challenge of selecting one out of several competing conceptual models for a specific modeling task, Bayesian model averaging is a rigorous choice. It ranks the plausibility of models based on Bayes' theorem, which yields an optimal trade-off between performance and complexity. With the resulting posterior model probabilities, the individual model predictions are combined into a robust weighted average, and the overall predictive uncertainty (including conceptual uncertainty) can be quantified. However, this rigorous framework does not yet explicitly consider the statistical significance of measurement noise in the calibration data set. This is a major drawback, because model weights might be unstable due to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new extension to the Bayesian model averaging framework that explicitly accounts for measurement noise as a source of uncertainty for the weights. This enables modelers to assess the reliability of model ranking for a specific application and a given calibration data set. Also, the impact of measurement noise on the overall prediction uncertainty can be determined. Technically, our extension is built within a Monte Carlo framework: we repeatedly perturb the observed data with random realizations of measurement error and then determine the robustness of the resulting model weights against measurement noise. We quantify the variability of posterior model weights as a weighting variance and add this new variance term to the overall prediction uncertainty analysis within the Bayesian model averaging framework to make uncertainty quantification more realistic and "complete". We illustrate the importance of our suggested extension with an application to soil-plant model selection, based on studies by Wöhling et al. (2013, 2014). Results confirm that noise in leaf area index or evaporation rate observations produces a significant amount of weighting
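
    The Monte Carlo scheme is straightforward to sketch (a schematic under simplifying assumptions, not the study's soil-plant setup): two competing Gaussian regression models, Bayesian model averaging weights from analytic marginal likelihoods, and the weighting variance obtained by repeatedly perturbing the calibration data with noise.

```python
# BMA weighting variance under measurement noise: an illustrative sketch.
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = np.linspace(0.0, 1.0, n)
sigma = 0.3                                        # assumed known noise level
y_obs = 1.0 + 2.0 * x + rng.normal(0.0, sigma, n)  # synthetic calibration data

designs = [np.ones((n, 1)),                        # M1: constant mean
           np.column_stack([np.ones(n), x])]       # M2: linear trend

def log_marginal(X, y, tau=10.0):
    """Analytic log evidence for y ~ N(Xb, sigma^2 I), prior b ~ N(0, tau^2 I)."""
    S = sigma**2 * np.eye(n) + tau**2 * X @ X.T    # marginal covariance of y
    _, logdet = np.linalg.slogdet(2.0 * np.pi * S)
    return -0.5 * (logdet + y @ np.linalg.solve(S, y))

def bma_weights(y):
    lm = np.array([log_marginal(X, y) for X in designs])
    w = np.exp(lm - lm.max())                      # stabilized softmax
    return w / w.sum()

# Monte Carlo over measurement-noise realizations added to the observed data
W = np.array([bma_weights(y_obs + rng.normal(0.0, sigma, n))
              for _ in range(500)])
print("mean posterior model weights:", W.mean(axis=0))
print("weighting variance:", W.var(axis=0))        # stability of model ranking
```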

  4. Application of hierarchical Bayesian unmixing models in river sediment source apportionment

    NASA Astrophysics Data System (ADS)

    Blake, Will; Smith, Hugh; Navas, Ana; Bodé, Samuel; Goddard, Rupert; Kuzyk, Zou Zou; Lennard, Amy; Lobb, David; Owens, Phil; Palazon, Leticia; Petticrew, Ellen; Gaspar, Leticia; Stock, Brian; Boeckx, Pascal; Semmens, Brice

    2016-04-01

    Fingerprinting and unmixing concepts are used widely across environmental disciplines for forensic evaluation of pollutant sources. In aquatic and marine systems, this includes tracking the source of organic and inorganic pollutants in water and linking problem sediment to soil erosion and land use sources. It is, however, the particular complexity of ecological systems that has driven creation of the most sophisticated mixing models, primarily to (i) evaluate diet composition in complex ecological food webs, (ii) inform population structure and (iii) explore animal movement. In the context of the new hierarchical Bayesian unmixing model, MixSIAR, developed to characterise intra-population niche variation in ecological systems, we evaluate the linkage between ecological 'prey' and 'consumer' concepts and river basin sediment 'source' and sediment 'mixtures' to exemplify the value of ecological modelling tools to river basin science. Recent studies have outlined advantages presented by Bayesian unmixing approaches in handling complex source and mixture datasets while dealing appropriately with uncertainty in parameter probability distributions. MixSIAR is unique in that it allows individual fixed and random effects associated with mixture hierarchy, i.e. factors that might exert an influence on model outcome for mixture groups, to be explored within the source-receptor framework. This offers new and powerful ways of interpreting river basin apportionment data. In this contribution, key components of the model are evaluated in the context of common experimental designs for sediment fingerprinting studies, namely simple, nested and distributed catchment sampling programmes. Illustrative examples using geochemical and compound specific stable isotope datasets are presented and used to discuss best practice with specific attention to (1) the tracer selection process, (2) incorporation of fixed effects relating to sample timeframe and sediment type in the modelling

  5. Bayesian Isotonic Regression Dose-response (BIRD) Model.

    PubMed

    Li, Wen; Fu, Haoda

    2016-12-21

    Understanding the dose-response relationship is a crucial step in drug development. There are a few parametric methods for estimating dose-response curves, such as the Emax model and the logistic model. These parametric models are easy to interpret and, hence, widely used. However, they often require the inclusion of patients on high-dose levels; otherwise, the model parameters cannot be reliably estimated. To obtain robust estimation, nonparametric models are used. However, these models are not able to estimate certain important clinical parameters, such as the ED50 and Emax. Furthermore, in many therapeutic areas, dose-response curves can be assumed to be non-decreasing functions. This creates an additional challenge for nonparametric methods. In this paper, we propose a new Bayesian isotonic regression dose-response model which combines advantages of both parametric and nonparametric models. The ED50 and Emax can be derived from this model. Simulations are provided to evaluate the performance of the Bayesian isotonic regression dose-response model against two parametric models. We apply this model to a data set from a diabetes dose-finding study.

  6. Shortlist B: a Bayesian model of continuous speech recognition.

    PubMed

    Norris, Dennis; McQueen, James M

    2008-04-01

    A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward architecture with no online feedback, and a lexical segmentation algorithm based on the viability of chunks of the input as possible words. Shortlist B is radically different from its predecessor in two respects. First, whereas Shortlist was a connectionist model based on interactive-activation principles, Shortlist B is based on Bayesian principles. Second, the input to Shortlist B is no longer a sequence of discrete phonemes; it is a sequence of multiple phoneme probabilities over 3 time slices per segment, derived from the performance of listeners in a large-scale gating study. Simulations are presented showing that the model can account for key findings: data on the segmentation of continuous speech, word frequency effects, the effects of mispronunciations on word recognition, and evidence on lexical involvement in phonemic decision making. The success of Shortlist B suggests that listeners make optimal Bayesian decisions during spoken-word recognition.

  7. Bayesian prediction of placebo analgesia in an instrumental learning model

    PubMed Central

    Jung, Won-Mo; Lee, Ye-Seul; Wallraven, Christian; Chae, Younbyoung

    2017-01-01

    Placebo analgesia can be primarily explained by the Pavlovian conditioning paradigm in which a passively applied cue becomes associated with less pain. In contrast, instrumental conditioning employs an active paradigm that might be more similar to clinical settings. In the present study, an instrumental conditioning paradigm involving a modified trust game in a simulated clinical situation was used to induce placebo analgesia. Additionally, Bayesian modeling was applied to predict the placebo responses of individuals based on their choices. Twenty-four participants engaged in a medical trust game in which decisions to receive treatment from either a doctor (more effective with high cost) or a pharmacy (less effective with low cost) were made after receiving a reference pain stimulus. In the conditioning session, the participants received lower levels of pain following both choices, while high pain stimuli were administered in the test session even after making the decision. The choice-dependent pain in the conditioning session was modulated in terms of both intensity and uncertainty. Participants reported significantly less pain when they chose the doctor or the pharmacy for treatment compared to the control trials. The predicted pain ratings based on Bayesian modeling showed significant correlations with the actual reports from participants for both of the choice categories. The instrumental conditioning paradigm allowed for the active choice of optional cues and was able to induce the placebo analgesia effect. Additionally, Bayesian modeling successfully predicted pain ratings in a simulated clinical situation that fits well with placebo analgesia induced by instrumental conditioning. PMID:28225816

  8. DISSECTING MAGNETAR VARIABILITY WITH BAYESIAN HIERARCHICAL MODELS

    SciTech Connect

    Huppenkothen, Daniela; Elenbaas, Chris; Watts, Anna L.; Horst, Alexander J. van der; Brewer, Brendon J.; Hogg, David W.; Murray, Iain; Frean, Marcus; Levin, Yuri; Kouveliotou, Chryssa

    2015-09-01

    Neutron stars are a prime laboratory for testing physical processes under conditions of strong gravity, high density, and extreme magnetic fields. Among the zoo of neutron star phenomena, magnetars stand out for their bursting behavior, ranging from extremely bright, rare giant flares to numerous, less energetic recurrent bursts. The exact trigger and emission mechanisms for these bursts are not known; favored models involve either a crust fracture and subsequent energy release into the magnetosphere, or explosive reconnection of magnetic field lines. In the absence of a predictive model, understanding the physical processes responsible for magnetar burst variability is difficult. Here, we develop an empirical model that decomposes magnetar bursts into a superposition of small spike-like features with a simple functional form, where the number of model components is itself part of the inference problem. The cascades of spikes that we model might be formed by avalanches of reconnection, or crust rupture aftershocks. Using Markov Chain Monte Carlo sampling augmented with reversible jumps between models with different numbers of parameters, we characterize the posterior distributions of the model parameters and the number of components per burst. We relate these model parameters to physical quantities in the system, and show for the first time that the variability within a burst does not conform to predictions from ideas of self-organized criticality. We also examine how well the properties of the spikes fit the predictions of simplified cascade models for the different trigger mechanisms.

  9. AIC, BIC, Bayesian evidence against the interacting dark energy model

    NASA Astrophysics Data System (ADS)

    Szydłowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michał

    2015-01-01

    Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model, in which an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative, the ΛCDM model. To choose between these models, the likelihood ratio test was applied as well as model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using current astronomical data, namely type Ia supernovae (Union2.1), baryon acoustic oscillations, the Alcock-Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existent support for the interacting ΛCDM model, and bearing in mind Occam's razor, we are inclined to reject this model.
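
    For reference, the two information criteria and the Bayesian evidence take the standard forms (k free parameters, n data points, L-hat the maximized likelihood):

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}, \qquad
E = \int \mathcal{L}(\theta)\,\pi(\theta)\,\mathrm{d}\theta,
```

    so the BIC penalizes extra parameters more strongly than the AIC once n exceeds e^2 (about 7.4), consistent with the stronger BIC-based evidence against the more complex interacting model reported above.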

  10. Conditional modeling of antibody titers using a zero-inflated poisson random effects model: application to Fabrazyme.

    PubMed

    Bonate, Peter L; Sung, Crystal; Welch, Karen; Richards, Susan

    2009-10-01

    Patients that are exposed to biotechnology-derived therapeutics often develop antibodies to the therapeutic, the magnitude of which is assessed by measuring antibody titers. A statistical approach for analyzing antibody titer data conditional on seroconversion is presented. The proposed method is to first transform the antibody titer data based on a geometric series using a common ratio of 2 and a scale factor of 50 and then analyze the exponent using a zero-inflated or hurdle model assuming a Poisson or negative binomial distribution with random effects to account for patient heterogeneity. Patient specific covariates can be used to model the probability of developing an antibody response, i.e., seroconversion, as well as the magnitude of the antibody titer itself. The method was illustrated using antibody titer data from 87 male seroconverted Fabry patients receiving Fabrazyme. Titers from five clinical trials were collected over 276 weeks of therapy with anti-Fabrazyme IgG titers ranging from 100 to 409,600 after exclusion of seronegative patients. The best model to explain seroconversion was a zero-inflated Poisson (ZIP) model where cumulative dose (under a constant dose regimen of dosing every 2 weeks) influenced the probability of seroconversion. There was an 80% chance of seroconversion when the cumulative dose reached 210 mg (90% confidence interval: 194-226 mg). No difference in antibody titers was noted between Japanese or Western patients. Once seroconverted, antibody titers did not remain constant but decreased in an exponential manner from an initial magnitude to a new lower steady-state value. The expected titer after the new steady-state titer had been achieved was 870 (90% CI: 630-1109). The half-life to the new steady-state value after seroconversion was 44 weeks (90% CI: 17-70 weeks). Time to seroconversion did not appear to be correlated with titer at the time of seroconversion. The method can be adequately used to model antibody titer data.
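
    The transform and likelihood described above have a simple closed form (a sketch of the structure; in the paper both components carry covariates and random effects): the observed titer is 50 · 2^Y, and the exponent Y is modeled as zero-inflated Poisson,

```latex
P(Y = 0) = \pi + (1 - \pi)\,e^{-\lambda}, \qquad
P(Y = k) = (1 - \pi)\,\frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 1, 2, \ldots,
```

    where π is the zero-inflation probability (linked to seroconversion) and λ is the Poisson mean of the titer exponent.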

  11. Bayesian Transformation Models for Multivariate Survival Data

    PubMed Central

    DE CASTRO, MÁRIO; CHEN, MING-HUI; IBRAHIM, JOSEPH G.; KLEIN, JOHN P.

    2014-01-01

    In this paper we propose a general class of gamma frailty transformation models for multivariate survival data. The transformation class includes the commonly used proportional hazards and proportional odds models. The proposed class also includes a family of cure rate models. Under an improper prior for the parameters, we establish propriety of the posterior distribution. A novel Gibbs sampling algorithm is developed for sampling from the observed data posterior distribution. A simulation study is conducted to examine the properties of the proposed methodology. An application to a data set from a cord blood transplantation study is also reported. PMID:24904194

  12. Bayesian Geostatistical Modeling of Leishmaniasis Incidence in Brazil

    PubMed Central

    Karagiannis-Voules, Dimitrios-Alexios; Scholte, Ronaldo G. C.; Guimarães, Luiz H.; Utzinger, Jürg; Vounatsou, Penelope

    2013-01-01

    Background Leishmaniasis is endemic in 98 countries with an estimated 350 million people at risk and approximately 2 million cases annually. Brazil is one of the most severely affected countries. Methodology We applied Bayesian geostatistical negative binomial models to analyze reported incidence data of cutaneous and visceral leishmaniasis in Brazil covering a 10-year period (2001–2010). Particular emphasis was placed on spatial and temporal patterns. The models were fitted using integrated nested Laplace approximations to perform fast approximate Bayesian inference. Bayesian variable selection was employed to determine the most important climatic, environmental, and socioeconomic predictors of cutaneous and visceral leishmaniasis. Principal Findings For both types of leishmaniasis, precipitation and socioeconomic proxies were identified as important risk factors. The predicted number of cases in 2010 were 30,189 (standard deviation [SD]: 7,676) for cutaneous leishmaniasis and 4,889 (SD: 288) for visceral leishmaniasis. Our risk maps predicted the highest numbers of infected people in the states of Minas Gerais and Pará for visceral and cutaneous leishmaniasis, respectively. Conclusions/Significance Our spatially explicit, high-resolution incidence maps identified priority areas where leishmaniasis control efforts should be targeted with the ultimate goal to reduce disease incidence. PMID:23675545

  13. The Impact of Spatial Scales and Spatial Smoothing on the Outcome of Bayesian Spatial Model

    PubMed Central

    Kang, Su Yun; McGree, James; Mengersen, Kerrie

    2013-01-01

    Discretization of a geographical region is quite common in spatial analysis. There have been few studies into the impact of different geographical scales on the outcome of spatial models for different spatial patterns. This study aims to investigate the impact of spatial scales and spatial smoothing on the outcomes of modelling spatial point-based data. Given a spatial point-based dataset (such as occurrence of a disease), we study the geographical variation of residual disease risk using regular grid cells. The individual disease risk is modelled using a logistic model with the inclusion of spatially unstructured and/or spatially structured random effects. Three spatial smoothness priors for the spatially structured component are employed in modelling, namely an intrinsic Gaussian Markov random field, a second-order random walk on a lattice, and a Gaussian field with Matérn correlation function. We investigate how changes in grid cell size affect model outcomes under different spatial structures and different smoothness priors for the spatial component. A realistic example (the Humberside data) is analyzed and a simulation study is described. Bayesian computation is carried out using an integrated nested Laplace approximation. The results suggest that the performance and predictive capacity of the spatial models improve as the grid cell size decreases for certain spatial structures. It also appears that different spatial smoothness priors should be applied for different patterns of point data. PMID:24146799

  14. 3-D model-based Bayesian classification

    SciTech Connect

    Soenneland, L.; Tenneboe, P.; Gehrmann, T.; Yrke, O.

    1994-12-31

    The challenging task of the interpreter is to integrate different pieces of information and combine them into an earth model. The sophistication level of this earth model might vary from the simplest geometrical description to the most complex set of reservoir parameters related to the geometrical description. Obviously, the sophistication level also depends on the completeness of the available information. The authors describe the interpreter's task as a mapping between the observation space and the model space. The information available to the interpreter exists in observation space, and the task is to infer a model in model space. It is well known that this inversion problem is non-unique. Therefore, any attempt to find a solution depends on constraints being added in some manner. The solution will obviously depend on which constraints are introduced, and it would be desirable to allow the interpreter to modify the constraints in a problem-dependent manner. The authors present a probabilistic framework that gives the interpreter the tools to integrate the different types of information and produce constrained solutions. The constraints can be adapted to the problem at hand.

  15. A Semiparametric Bayesian Model for Repeatedly Repeated Binary Outcomes

    PubMed Central

    Quintana, Fernando A.; Müller, Peter; Rosner, Gary L.; Relling, Mary V.

    2009-01-01

    Summary We discuss the analysis of data from single nucleotide polymorphism (SNP) arrays comparing tumor and normal tissues. The data consist of sequences of indicators for loss of heterozygosity (LOH) and involve three nested levels of repetition: chromosomes for a given patient, regions within chromosomes, and SNPs nested within regions. We propose to analyze these data using a semiparametric model for multi-level repeated binary data. At the top level of the hierarchy we assume a sampling model for the observed binary LOH sequences that arises from a partial exchangeability argument. This implies a mixture of Markov chains model. The mixture is defined with respect to the Markov transition probabilities. We assume a nonparametric prior for the random mixing measure. The resulting model takes the form of a semiparametric random effects model with the matrix of transition probabilities being the random effects. The model includes appropriate dependence assumptions for the two remaining levels of the hierarchy, i.e., for regions within chromosomes and for chromosomes within patients. We use the model to identify regions of increased LOH in a dataset coming from a study of treatment-related leukemia in children with an initial cancer diagnosis. The model successfully identifies the desired regions and performs well compared to other available alternatives. PMID:19746193

  16. Accurate model selection of relaxed molecular clocks in Bayesian phylogenetics.

    PubMed

    Baele, Guy; Li, Wai Lok Sibon; Drummond, Alexei J; Suchard, Marc A; Lemey, Philippe

    2013-02-01

    Recent implementations of path sampling (PS) and stepping-stone sampling (SS) have been shown to outperform the harmonic mean estimator (HME) and a posterior simulation-based analog of Akaike's information criterion through Markov chain Monte Carlo (AICM) in Bayesian model selection of demographic and molecular clock models. Almost simultaneously, a Bayesian model averaging approach was developed that avoids conditioning on a single model but averages over a set of relaxed clock models. This approach returns estimates of the posterior probability of each clock model, through which one can estimate the Bayes factor in favor of the maximum a posteriori (MAP) clock model; however, this Bayes factor estimate may suffer when the posterior probability of the MAP model approaches 1. Here, we compare these two recent developments with the HME, stabilized/smoothed HME (sHME), and AICM, using both synthetic and empirical data. Our comparison shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model. We also illustrate the importance of using proper priors on a large set of empirical data sets.

  17. Bayesian Thurstonian models for ranking data using JAGS.

    PubMed

    Johnson, Timothy R; Kuhn, Kristine M

    2013-09-01

    A Thurstonian model for ranking data assumes that observed rankings are consistent with those of a set of underlying continuous variables. This model is appealing since it renders ranking data amenable to familiar models for continuous response variables, namely, linear regression models. To date, however, the use of Thurstonian models for ranking data has been very rare in practice. One reason for this may be that inferences based on these models require specialized technical methods. These methods have been developed to address computational challenges involved in these models but are not easy to implement without considerable technical expertise and are not widely available in software packages. To address this limitation, we show that Bayesian Thurstonian models for ranking data can be very easily implemented with the JAGS software package. We provide JAGS model files for Thurstonian ranking models for general use, discuss their implementation, and illustrate their use in analyses.

  18. Bayesian Networks for Modeling Dredging Decisions

    DTIC Science & Technology

    2011-10-01

    comments and discussions on modeling of dredging activities. Dr. Andrew F. Casper of the Aquatic Ecology and Invasive Species Branch, Ecosystem Evaluation...report was written by Dr. Martin T. Schultz, Environmental Risk Assessment Branch, Environmental Processes and Engineering Division (EPED...Environmental Laboratory (EL); Thomas D. Borrowman, Environmental Engineering Branch, EPED, EL; and Dr. Mitchell J. Small, Department of Civil and Environmental

  19. Predicting coastal cliff erosion using a Bayesian probabilistic model

    USGS Publications Warehouse

    Hapke, Cheryl J.; Plant, Nathaniel G.

    2010-01-01

    Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70–90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale.

  20. Calibrating Subjective Probabilities Using Hierarchical Bayesian Models

    NASA Astrophysics Data System (ADS)

    Merkle, Edgar C.

    A body of psychological research has examined the correspondence between a judge's subjective probability of an event's outcome and the event's actual outcome. The research generally shows that subjective probabilities are noisy and do not match the "true" probabilities. However, subjective probabilities are still useful for forecasting purposes if they bear some relationship to true probabilities. The purpose of the current research is to exploit relationships between subjective probabilities and outcomes to create improved, model-based probabilities for forecasting. Once the model has been trained in situations where the outcome is known, it can then be used in forecasting situations where the outcome is unknown. These concepts are demonstrated using experimental psychology data, and potential applications are discussed.

  1. Bayesian calibration of hyperelastic constitutive models of soft tissue.

    PubMed

    Madireddy, Sandeep; Sista, Bhargava; Vemaganti, Kumar

    2016-06-01

    There is inherent variability in the experimental response used to characterize the hyperelastic mechanical response of soft tissues. This has to be accounted for while estimating the parameters in the constitutive models to obtain reliable estimates of the quantities of interest. The traditional least squares method of parameter estimation does not give due importance to this variability. We use a Bayesian calibration framework based on nested Monte Carlo sampling to account for the variability in the experimental data and its effect on the estimated parameters through a systematic probability-based treatment. We consider three different constitutive models to represent the hyperelastic nature of soft tissue: Mooney-Rivlin model, exponential model, and Ogden model. Three stress-strain data sets corresponding to the deformation of agarose gel, bovine liver tissue, and porcine brain tissue are considered. Bayesian fits and parameter estimates are compared with the corresponding least squares values. Finally, we propagate the uncertainty in the parameters to a quantity of interest (QoI), namely the force-indentation response, to study the effect of model form on the values of the QoI. Our results show that the quality of the fit alone is insufficient to determine the adequacy of the model, and due importance has to be given to the maximum likelihood value, the landscape of the likelihood distribution, and model complexity.

  2. DPpackage: Bayesian Semi- and Nonparametric Modeling in R

    PubMed Central

    Jara, Alejandro; Hanson, Timothy E.; Quintana, Fernando A.; Müller, Peter; Rosner, Gary L.

    2011-01-01

    Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian non- and semi-parametric models in R, DPpackage. Currently DPpackage includes models for marginal and conditional density estimation, ROC curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison, and for eliciting the precision parameter of the Dirichlet process prior. To maximize computational efficiency, the actual sampling for each model is carried out using compiled FORTRAN. PMID:21796263

  3. Bayesian inference for two-part mixed-effects model using skew distributions, with application to longitudinal semicontinuous alcohol data.

    PubMed

    Xing, Dongyuan; Huang, Yangxin; Chen, Henian; Zhu, Yiliang; Dagne, Getachew A; Baldwin, Julie

    2017-08-01

    Semicontinuous data featured with an excessive proportion of zeros and right-skewed continuous positive values arise frequently in practice. One example would be substance abuse/dependence symptom data, for which a substantial proportion of the subjects investigated may report zero. Two-part mixed-effects models have been developed to analyze repeated measures of semicontinuous data from longitudinal studies. In this paper, we propose a flexible two-part mixed-effects model with skew distributions for correlated semicontinuous alcohol data under the framework of a Bayesian approach. The proposed model specification consists of two mixed-effects models linked by correlated random effects: (i) a model on the occurrence of positive values using a generalized logistic mixed-effects model (Part I); and (ii) a model on the intensity of positive values using a linear mixed-effects model where the model errors follow skew distributions, including the skew-t and skew-normal distributions (Part II). The proposed method is illustrated with alcohol abuse/dependence symptom data from a longitudinal observational study, and the analytic results are reported by comparing potential models under different random-effects structures. Simulation studies are conducted to assess the performance of the proposed models and method.

  4. Bayesian Piecewise Linear Mixed Models With a Random Change Point: An Application to BMI Rebound in Childhood.

    PubMed

    Brilleman, Samuel L; Howe, Laura D; Wolfe, Rory; Tilling, Kate

    2017-11-01

    Body mass index (BMI) rebound refers to the beginning of the second rise in BMI during childhood. Accurate estimation of an individual's timing of BMI rebound is important because it is associated with health outcomes in later life. We estimated BMI trajectories for 6545 children from the Avon Longitudinal Study of Parents and Children. We used a novel Bayesian two-phase piecewise linear mixed model where the "change point" was an individual-level random effect corresponding to the individual-specific timing of BMI rebound. The model's individual-level random effects (intercept, prechange slope, postchange slope, change point) were multivariate normally distributed with an unstructured variance-covariance matrix, thereby allowing for correlation between all random effects. Average age at BMI rebound (mean change point) was 6.5 (95% credible interval: 6.4 to 6.6) years. The standard deviation of the individual-specific timing of BMI rebound (random effects) was 2.0 years for females and 1.6 years for males. The correlation between the prechange slope and the change point was 0.57, suggesting that faster rates of decline in BMI prior to rebound were associated with rebound occurring at an earlier age. Simulations showed that estimates from the model were less biased than those from models assuming a common change point for all individuals or a nonlinear trajectory based on fractional polynomials. Our model flexibly estimated the individual-specific timing of BMI rebound, while retaining parameters that are meaningful and easy to interpret. It is applicable in any situation where one wishes to estimate a change-point process which varies between individuals.
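
    One common way to write such a two-phase model (a sketch consistent with the description above, not necessarily the authors' exact notation) is

```latex
y_{ij} = \beta_{0i} + \beta_{1i}\min(t_{ij} - \tau_i, 0) + \beta_{2i}\max(t_{ij} - \tau_i, 0) + \varepsilon_{ij},
\qquad (\beta_{0i}, \beta_{1i}, \beta_{2i}, \tau_i)^{\top} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}),
```

    where y_ij is the BMI of child i at age t_ij, τ_i is the individual-specific change point (age at BMI rebound), β_1i and β_2i are the pre- and post-rebound slopes, and Σ is unstructured so that all four random effects may be correlated.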

  5. Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.

    PubMed

    Hack, C Eric

    2006-04-17

    Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate, and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least-squares approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach.
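
    To make the idea concrete, here is a generic random-walk Metropolis sampler, the core of an MCMC calibration, shown on a deliberately simple one-parameter toy model rather than a full PBTK model (an illustrative sketch only): the prior encodes existing knowledge about the parameter, and the data refine it into a posterior.

```python
# Random-walk Metropolis on a one-parameter toy calibration problem.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(2.0, 0.5, size=30)          # toy "measured internal doses"

def log_prior(theta):
    return -0.5 * ((theta - 1.5) / 1.0) ** 2  # prior knowledge: theta ~ N(1.5, 1)

def log_lik(theta):
    return np.sum(-0.5 * ((data - theta) / 0.5) ** 2)  # model fit to the data

def metropolis(n_iter=10000, step=0.2):
    theta = 1.5                               # start at the prior mean
    lp = log_prior(theta) + log_lik(theta)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + rng.normal(0.0, step)  # random-walk proposal
        lp_prop = log_prior(prop) + log_lik(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

posterior = metropolis()[2000:]               # discard burn-in
print(posterior.mean(), posterior.std())      # estimate with its uncertainty
```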

  6. Approximate Bayesian Computation for Diagnostic Model Calibration and Evaluation

    NASA Astrophysics Data System (ADS)

    Vrugt, J. A.; Sadegh, M.

    2013-12-01

    In this talk I will discuss theory, concepts and applications of Approximate Bayesian Computation (ABC) for diagnostic model calibration and evaluation. This statistical methodology relaxes the need for an explicit likelihood function in favor of one or more summary statistics rooted in hydrologic theory that together have a clearer and more compelling diagnostic power than some average measure of the size of the error residuals. A few illustrative case studies are used to demonstrate that ABC is relatively easy to implement and readily employs signature-based indices to analyze and pinpoint which part of the model is malfunctioning and in need of further improvement.
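
    The simplest version of the idea is ABC rejection sampling, sketched below on a toy Gaussian problem (an illustration of the mechanism, not the talk's hydrologic applications): a parameter drawn from the prior is kept whenever summary statistics of data simulated under it fall within a tolerance of the observed summaries, so no explicit likelihood is ever evaluated.

```python
# ABC rejection sampling: a minimal sketch.
import numpy as np

rng = np.random.default_rng(5)
y_obs = rng.normal(3.0, 1.0, size=100)             # "observed" data
s_obs = np.array([y_obs.mean(), y_obs.std()])      # chosen summary statistics

def simulate(theta):
    return rng.normal(theta, 1.0, size=100)        # forward model

accepted = []
for _ in range(20000):
    theta = rng.uniform(-10.0, 10.0)               # draw from a flat prior
    y_sim = simulate(theta)
    s_sim = np.array([y_sim.mean(), y_sim.std()])
    if np.linalg.norm(s_sim - s_obs) < 0.2:        # tolerance on summaries
        accepted.append(theta)

accepted = np.array(accepted)
print(len(accepted), accepted.mean(), accepted.std())  # approximate posterior
```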

  7. Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.

    PubMed

    Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J

    2016-03-01

    A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
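
    The canonical example of an independence-of-causal-influences model, the family that PICI generalizes, is the noisy-OR gate (a standard construction, not necessarily the specific PICI gates used for the menstrual-cycle model). With binary causes x_1, ..., x_n, per-cause parameters λ_i, and a leak term λ_0, the full conditional distribution needs only n + 1 parameters:

```latex
P(Y = 0 \mid x_1, \ldots, x_n) = (1 - \lambda_0) \prod_{i \,:\, x_i = 1} (1 - \lambda_i).
```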

  8. Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty.

    PubMed

    Baele, Guy; Lemey, Philippe; Suchard, Marc A

    2016-03-01

    Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of "working distributions" to facilitate, or shorten, the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a "working" distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different "working" distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to considerable overestimation of the marginal likelihood when using PS and SS, while both GSS approaches still retrieve the correct marginal likelihood. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses.

  9. Study of TEC fluctuation via stochastic models and Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Bires, A.; Roininen, L.; Damtie, B.; Nigussie, M.; Vanhamäki, H.

    2016-11-01

    We propose stochastic processes to be used to model total electron content (TEC) observations. Based on this, we model the rate of change of TEC (ROT) variation during ionospheric quiet conditions with stationary processes. During ionospheric disturbed conditions, for example, when an irregularity in the ionospheric electron density distribution occurs, the stationarity assumption over long time periods is no longer valid. In these cases, we estimate the parameters over short time scales, during which stationarity can be assumed. We show the relationship between the new method and the commonly used TEC characterization parameters ROT and the ROT Index (ROTI). We construct our parametric model within the framework of Bayesian statistical inverse problems and hence give the solution as an a posteriori probability distribution. The Bayesian framework allows us to model measurement errors systematically. Similarly, we mitigate variation of TEC due to factors which are not of ionospheric origin, such as the motion of satellites relative to the receiver, by incorporating a priori knowledge in the Bayesian model. In practical computations, we draw the so-called maximum a posteriori estimates, which are our ROT and ROTI estimates, from the posterior distribution. Because the algorithm allows ROTI to be estimated at each observation time, the estimator does not depend on the period of time used for ROTI computation. We verify the method by analyzing TEC data recorded by a GPS receiver located in Ethiopia (11.6°N, 37.4°E). The results indicate that TEC fluctuations caused by ionospheric irregularity can be effectively detected and quantified from the estimated ROT and ROTI values.

  10. A combined gamma frailty and normal random-effects model for repeated, overdispersed time-to-event data.

    PubMed

    Molenberghs, Geert; Verbeke, Geert; Efendi, Achmad; Braekers, Roel; Demétrio, Clarice G B

    2015-08-01

    This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring. Building upon work by Molenberghs, Verbeke, and Demétrio (2007) and Molenberghs et al. (2010), gamma and normal random effects are included in a Weibull model, to account for overdispersion and between-subject effects, respectively. Unlike these authors, censoring is allowed for, and two estimation methods are presented. The partial marginalization approach to full maximum likelihood of Molenberghs et al. (2010) is contrasted with pseudo-likelihood estimation. A limited simulation study is conducted to examine the relative merits of these estimation methods. The modeling framework is employed to analyze data on recurrent asthma attacks in children on the one hand and on survival in cancer patients on the other.

  11. Two Bayesian tests of the GLOMOsys Model.

    PubMed

    Field, Sarahanne M; Wagenmakers, Eric-Jan; Newell, Ben R; Zeelenberg, René; van Ravenzwaaij, Don

    2016-12-01

    Priming is arguably one of the key phenomena in contemporary social psychology. Recent retractions and failed replication attempts have led to a division in the field between proponents and skeptics and have reinforced the importance of confirming certain priming effects through replication. In this study, we describe the results of 2 preregistered replication attempts of 1 experiment by Förster and Denzler (2012). In both experiments, participants first processed letters either globally or locally, then were tested using a typicality rating task. Bayes factor hypothesis tests were conducted for both experiments: Experiment 1 (N = 100) yielded an indecisive Bayes factor of 1.38, indicating that the in-lab data are 1.38 times more likely to have occurred under the null hypothesis than under the alternative. Experiment 2 (N = 908) yielded a Bayes factor of 10.84, indicating strong support for the null hypothesis that global priming does not affect participants' mean typicality ratings. The failure to replicate this priming effect challenges existing support for the GLOMOsys model.

  12. Bayesian joint modeling of longitudinal and spatial survival AIDS data.

    PubMed

    Martins, Rui; Silva, Giovani L; Andreozzi, Valeska

    2016-08-30

    Joint analysis of longitudinal and survival data has received increasing attention in the recent years, especially for analyzing cancer and AIDS data. As both repeated measurements (longitudinal) and time-to-event (survival) outcomes are observed in an individual, a joint modeling is more appropriate because it takes into account the dependence between the two types of responses, which are often analyzed separately. We propose a Bayesian hierarchical model for jointly modeling longitudinal and survival data considering functional time and spatial frailty effects, respectively. That is, the proposed model deals with non-linear longitudinal effects and spatial survival effects accounting for the unobserved heterogeneity among individuals living in the same region. This joint approach is applied to a cohort study of patients with HIV/AIDS in Brazil during the years 2002-2006. Our Bayesian joint model presents considerable improvements in the estimation of survival times of the Brazilian HIV/AIDS patients when compared with those obtained through a separate survival model and shows that the spatial risk of death is the same across the different Brazilian states.

  13. Bayesian model-based inference of transcription factor activity

    PubMed Central

    Rogers, Simon; Khanin, Raya; Girolami, Mark

    2007-01-01

    Background In many approaches to the inference and modeling of regulatory interactions using microarray data, the expression of the gene coding for the transcription factor is considered to be an accurate surrogate for the true activity of the protein it produces. There are many instances where this is inaccurate due to post-translational modifications of the transcription factor protein. Inference of the activity of the transcription factor from the expression of its targets has predominantly involved linear models that do not reflect the nonlinear nature of transcription. We extend a recent approach to inferring the transcription factor activity based on nonlinear Michaelis-Menten kinetics of transcription from maximum likelihood to fully Bayesian inference and give an example of how the model can be further developed. Results We present results on synthetic and real microarray data. Additionally, we illustrate how gene and replicate specific delays can be incorporated into the model. Conclusion We demonstrate that full Bayesian inference is appropriate in this application and has several benefits over the maximum likelihood approach, especially when the volume of data is limited. We also show the benefits of using a non-linear model over a linear model, particularly in the case of repression. PMID:17493251
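
    In models of this type, the expression x_j(t) of target gene j is typically driven by the latent transcription factor activity f(t) through Michaelis-Menten kinetics (a sketch of the general form with hypothetical parameter names; the paper's model belongs to this family):

```latex
\frac{\mathrm{d}x_j(t)}{\mathrm{d}t} = b_j + \frac{v_j\, f(t)}{\gamma_j + f(t)} - d_j\, x_j(t),
```

    where b_j is a basal transcription rate, v_j the maximal activation, γ_j the Michaelis constant, and d_j the decay rate; the saturating nonlinearity in f(t) is exactly what the linear models criticized above cannot capture.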

  14. A Bayesian subgroup analysis using collections of ANOVA models.

    PubMed

    Liu, Jinzhong; Sivaganesan, Siva; Laud, Purushottam W; Müller, Peter

    2017-03-20

    We develop a Bayesian approach to subgroup analysis using ANOVA models with multiple covariates, extending an earlier work. We assume a two-arm clinical trial with normally distributed response variable. We also assume that the covariates for subgroup finding are categorical and are a priori specified, and parsimonious easy-to-interpret subgroups are preferable. We represent the subgroups of interest by a collection of models and use a model selection approach to finding subgroups with heterogeneous effects. We develop suitable priors for the model space and use an objective Bayesian approach that yields multiplicity adjusted posterior probabilities for the models. We use a structured algorithm based on the posterior probabilities of the models to determine which subgroup effects to report. Frequentist operating characteristics of the approach are evaluated using simulation. While our approach is applicable in more general cases, we mainly focus on the 2 × 2 case of two covariates each at two levels for ease of presentation. The approach is illustrated using a real data example.

  15. A study of finite mixture model: Bayesian approach on financial time series data

    NASA Astrophysics Data System (ADS)

    Phoong, Seuk-Yen; Ismail, Mohd Tahir

    2014-07-01

    Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method provides a means of fitting such models. Bayesian methods are widely used because they have attractive asymptotic properties and exhibit consistency, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is chosen using the Bayesian Information Criterion; identifying the number of components correctly is important because a misspecified number may lead to invalid results. The Bayesian method is then used to fit k-component mixture models in order to explore the relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines, and Indonesia. The results show a negative relationship between rubber prices and stock market prices for all selected countries.

  16. Objective Bayesian Comparison of Constrained Analysis of Variance Models.

    PubMed

    Consonni, Guido; Paroli, Roberta

    2016-10-04

    In the social sciences we are often interested in comparing models specified by parametric equality or inequality constraints. For instance, when examining three group means [Formula: see text] through an analysis of variance (ANOVA), a model may specify that [Formula: see text], while another one may state that [Formula: see text], and finally a third model may instead suggest that all means are unrestricted. This is a challenging problem, because it involves a combination of nonnested models, as well as nested models having the same dimension. We adopt an objective Bayesian approach, requiring no prior specification from the user, and derive the posterior probability of each model under consideration. Our method is based on the intrinsic prior methodology, suitably modified to accommodate equality and inequality constraints. Focussing on normal ANOVA models, a comparative assessment is carried out through simulation studies. We also present an application to real data collected in a psychological experiment.

  17. Bayesian model comparison in cosmology with Population Monte Carlo

    NASA Astrophysics Data System (ADS)

    Kilbinger, Martin; Wraith, Darren; Robert, Christian P.; Benabed, Karim; Cappé, Olivier; Cardoso, Jean-François; Fort, Gersende; Prunet, Simon; Bouchet, François R.

    2010-07-01

    We use Bayesian model selection techniques to test extensions of the standard flat Λ cold dark matter (ΛCDM) paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. In contrast to the case of other sampling-based inference techniques such as Markov chain Monte Carlo (MCMC), the Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Also, it provides an unbiased estimator of the evidence after any fixed number of iterations and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent cosmic microwave background, type Ia supernovae and baryonic acoustic oscillation data, we find inconclusive evidence between flat ΛCDM and simple dark-energy models. A curved universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r = 0.
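    The evidence estimate that PMC delivers "for free" is a self-normalised importance-sampling average. The toy sketch below (one-parameter Gaussian model, hypothetical data, a single hand-picked proposal standing in for PMC's adapted mixture) shows the identity the abstract relies on.

```python
# Sketch of the importance-sampling evidence identity behind PMC:
# with weights w_i = prior(theta_i) * likelihood(theta_i) / q(theta_i),
# the mean of the w_i estimates the evidence p(data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.normal(0.5, 1.0, 50)                 # hypothetical observations

prior = stats.norm(0.0, 2.0)                    # prior on the unknown mean
q = stats.norm(data.mean(), 0.5)                # stand-in for PMC's adapted proposal
theta = q.rvs(size=20_000, random_state=rng)

# Gaussian log-likelihood with unit variance, vectorised over theta
resid = data[None, :] - theta[:, None]
loglik = -0.5 * (resid ** 2).sum(axis=1) - 0.5 * data.size * np.log(2 * np.pi)

logw = prior.logpdf(theta) + loglik - q.logpdf(theta)
log_evidence = np.logaddexp.reduce(logw) - np.log(logw.size)
print("log-evidence estimate:", log_evidence)
```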

  18. Genetic analysis of somatic cell scores in US Holsteins with a Bayesian mixture model.

    PubMed

    Boettcher, P J; Caraviello, D; Gianola, D

    2007-01-01

The objective of this study was to apply finite mixture models to field data on somatic cell scores (SCS) for the estimation of genetic parameters. Data were approximately 170,000 test-day records for SCS from first-parity Holstein cows in Wisconsin. Five models of increasing complexity were fitted. Model 1 was the standard single-component model, and the others were 2-component Gaussian mixtures consisting of similar but distinct linear models. All mixture models (i.e., 2 to 5) included separate means for the 2 components. Model 2 assumed entirely homogeneous variances for both components. Models 3 and 4 assumed heterogeneous variances for either the residual (model 3) or the genetic and permanent environmental (PE) variances (model 4). Model 5 was the most complex, in which the variances of all random effects were allowed to vary across components. A Bayesian approach was applied and Gibbs sampling was used to obtain posterior estimates. Five chains of 205,000 cycles were generated for each model. Estimates of variance components were based on posterior means. Models were compared by use of the deviance information criterion. Based on the deviance information criterion, all mixture models were superior to the linear model for the analysis of SCS. The best model was the one in which genetic and PE variances were heterogeneous but residual variances were homogeneous. The genetic analysis suggested that SCS in healthy and infected cattle are different traits, because the genetic correlation of 0.13 between SCS in the 2 components was significantly different from unity.

  19. Predictive RANS simulations via Bayesian Model-Scenario Averaging

    NASA Astrophysics Data System (ADS)

    Edeling, W. N.; Cinnella, P.; Dwight, R. P.

    2014-10-01

    The turbulence closure model is the dominant source of error in most Reynolds-Averaged Navier-Stokes simulations, yet no reliable estimators for this error component currently exist. Here we develop a stochastic, a posteriori error estimate, calibrated to specific classes of flow. It is based on variability in model closure coefficients across multiple flow scenarios, for multiple closure models. The variability is estimated using Bayesian calibration against experimental data for each scenario, and Bayesian Model-Scenario Averaging (BMSA) is used to collate the resulting posteriors, to obtain a stochastic estimate of a Quantity of Interest (QoI) in an unmeasured (prediction) scenario. The scenario probabilities in BMSA are chosen using a sensor which automatically weights those scenarios in the calibration set which are similar to the prediction scenario. The methodology is applied to the class of turbulent boundary-layers subject to various pressure gradients. For all considered prediction scenarios the standard-deviation of the stochastic estimate is consistent with the measurement ground truth. Furthermore, the mean of the estimate is more consistently accurate than the individual model predictions.
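    A toy sketch of the collation step, under the assumption that each (model, scenario) pair delivers a posterior-predictive mean and variance for the QoI. The weights and numbers below are invented for illustration; the mixture variance follows the law of total variance.

```python
# BMSA-style collation sketch: a weighted mixture over models and scenarios.
# mu[i, j] and var[i, j] are the posterior-predictive mean and variance of the
# QoI under closure model i calibrated to scenario j (all values hypothetical).
import numpy as np

mu = np.array([[1.00, 1.10], [0.90, 1.05], [1.20, 1.15]])   # mean[model, scenario]
var = np.full_like(mu, 0.02)                                # within-posterior variance
w_model = np.array([0.5, 0.3, 0.2])                         # posterior model weights
w_scen = np.array([0.6, 0.4])                               # sensor-based scenario weights

w = np.outer(w_model, w_scen)            # joint weights, sum to 1
mix_mean = np.sum(w * mu)
# law of total variance: expected within-variance + variance of the means
mix_var = np.sum(w * var) + np.sum(w * (mu - mix_mean) ** 2)
print(f"QoI estimate: {mix_mean:.3f} +/- {np.sqrt(mix_var):.3f}")
```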

  20. Quantum-Like Bayesian Networks for Modeling Decision Making.

    PubMed

    Moreira, Catarina; Wichert, Andreas

    2016-01-01

    In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.
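    The mechanical core of the quantum-like construction can be shown in a few lines: replacing probabilities by amplitudes adds an interference term to the law of total probability. The function below is an illustrative sketch, not the authors' network implementation; theta plays the role of the quantum parameter that their similarity heuristic fits.

```python
# Sketch: classical vs quantum-like law of total probability over two
# mutually exclusive paths with probabilities p1 and p2.
import numpy as np

def total_probability(p1, p2, theta=None):
    """theta is the relative phase between the two amplitudes;
    theta=None recovers the classical law of total probability."""
    classical = p1 + p2
    if theta is None:
        return classical
    # quantum-like: |sqrt(p1) e^{i a} + sqrt(p2) e^{i b}|^2, theta = a - b
    return classical + 2.0 * np.sqrt(p1 * p2) * np.cos(theta)

print(total_probability(0.30, 0.25))            # classical: 0.55
print(total_probability(0.30, 0.25, np.pi))     # destructive interference
```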

  2. A Bayesian Multilevel Model for Microcystin Prediction in ...

    EPA Pesticide Factsheets

The frequency of cyanobacteria blooms in North American lakes is increasing. A major concern with rising cyanobacteria blooms is microcystin, a common cyanobacterial hepatotoxin. To explore the conditions that promote high microcystin concentrations, we analyzed the US EPA National Lake Assessment (NLA) dataset collected in the summer of 2007. The NLA dataset is reported for nine eco-regions. We used the results of random forest modeling as a means of variable selection from which we developed a Bayesian multilevel model of microcystin concentrations. Model parameters under a multilevel modeling framework are eco-region specific, but they are also assumed to be exchangeable across eco-regions for broad continental scaling. The exchangeability assumption ensures that both the common patterns and eco-region specific features will be reflected in the model. Furthermore, the method incorporates appropriate estimates of uncertainty. Our preliminary results show associations between microcystin and turbidity, total nutrients, and N:P ratios. The NLA 2012 will be used for Bayesian updating. The results will help develop management strategies to alleviate microcystin impacts and improve lake quality. This work provides a probabilistic framework for predicting microcystin presences in lakes. It would allow for insights to be made about how changes in nutrient concentrations could potentially change toxin levels.

  3. A Bayesian Multilevel Model for Microcystin Prediction in ...

    EPA Pesticide Factsheets

The frequency of cyanobacteria blooms in North American lakes is increasing. A major concern with rising cyanobacteria blooms is microcystin, a common cyanobacterial hepatotoxin. To explore the conditions that promote high microcystin concentrations, we analyzed the US EPA National Lake Assessment (NLA) dataset collected in the summer of 2007. The NLA dataset is reported for nine eco-regions. We used the results of random forest modeling as a means of variable selection from which we developed a Bayesian multilevel model of microcystin concentrations. Model parameters under a multilevel modeling framework are eco-region specific, but they are also assumed to be exchangeable across eco-regions for broad continental scaling. The exchangeability assumption ensures that both the common patterns and eco-region specific features will be reflected in the model. Furthermore, the method incorporates appropriate estimates of uncertainty. Our preliminary results show associations between microcystin and turbidity, total nutrients, and N:P ratios. Upon release of a comparable 2012 NLA dataset, we will apply Bayesian updating. The results will help develop management strategies to alleviate microcystin impacts and improve lake quality. This work provides a probabilistic framework for predicting microcystin presences in lakes. It would allow for insights to be made about how changes in nutrient concentrations could potentially change toxin levels.

  5. Bayesian Modeling of Biomolecular Assemblies with Cryo-EM Maps

    PubMed Central

    Habeck, Michael

    2017-01-01

A growing array of experimental techniques allows us to characterize the three-dimensional structure of large biological assemblies at increasingly higher resolution. In addition to X-ray crystallography and nuclear magnetic resonance in solution, new structure determination methods such as cryo-electron microscopy (cryo-EM), crosslinking/mass spectrometry and solid-state NMR have emerged. Often it is not sufficient to use a single experimental method; rather, complementary data need to be collected by using multiple techniques. The integration of all datasets can only be achieved by computational means. This article describes Inferential structure determination, a Bayesian approach to integrative modeling of biomolecular complexes with hybrid structural data. I will introduce probabilistic models for cryo-EM maps and outline Markov chain Monte Carlo algorithms for sampling model structures from the posterior distribution. I will focus on rigid and flexible modeling with cryo-EM data and discuss some of the computational challenges of Bayesian inference in the context of biomolecular modeling. PMID:28382301

  6. Assessing uncertainty in a stand growth model by Bayesian synthesis

    SciTech Connect

    Green, E.J.; MacFarlane, D.W.; Valentine, H.T.; Strawderman, W.E.

    1999-11-01

    The Bayesian synthesis method (BSYN) was used to bound the uncertainty in projections calculated with PIPESTEM, a mechanistic model of forest growth. The application furnished posterior distributions of (a) the values of the model's parameters, and (b) the values of three of the model's output variables--basal area per unit land area, average tree height, and tree density--at different points in time. Confidence or credible intervals for the output variables were obtained directly from the posterior distributions. The application also provides estimates of correlation among the parameters and output variables. BSYN, which originally was applied to a population dynamics model for bowhead whales, is generally applicable to deterministic models. Extension to two or more linked models is discussed. A simple worked example is included in an appendix.
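    A minimal sketch of the synthesis idea under simplifying assumptions: draw parameters from the prior, run a deterministic simulator, weight each draw by the data likelihood, and resample, yielding joint posteriors for both parameters and model outputs. The exponential growth function and all numbers are hypothetical stand-ins for PIPESTEM.

```python
# Sampling-importance-resampling sketch of Bayesian synthesis for a
# deterministic model: posterior draws for the parameter and the output.
import numpy as np

rng = np.random.default_rng(3)

def growth_model(rate, years=20, y0=10.0):
    """Hypothetical deterministic simulator: exponential basal-area growth."""
    return y0 * np.exp(rate * years)

obs, obs_sd = 25.0, 2.0                         # hypothetical observation at year 20
rate = rng.uniform(0.0, 0.1, 50_000)            # prior draws for the growth rate
out = growth_model(rate)

logw = -0.5 * ((out - obs) / obs_sd) ** 2       # Gaussian log-likelihood (up to a constant)
w = np.exp(logw - logw.max())
w /= w.sum()
idx = rng.choice(rate.size, size=5_000, p=w)    # importance resampling

print("posterior rate quantiles:", np.percentile(rate[idx], [2.5, 50, 97.5]))
print("posterior output quantiles:", np.percentile(out[idx], [2.5, 50, 97.5]))
```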

  7. Bayesian Variable Selection on Model Spaces Constrained by Heredity Conditions.

    PubMed

    Taylor-Rodriguez, Daniel; Womack, Andrew; Bliznyuk, Nikolay

    2016-01-01

    This paper investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending upon the inclusion of lower-order parent terms. We develop classes of priors on the model space, investigate their theoretical and finite sample properties, and provide a Metropolis-Hastings algorithm for searching the space of models. The tools proposed allow fast and thorough exploration of model spaces that account for hierarchical polynomial structure in the predictors and provide control of the inclusion of false positives in high posterior probability models.
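    What strong heredity does to the model space is easy to see by enumeration; below is a small sketch for two predictors and their interaction, where the terms x1, x2 and x1:x2 are hypothetical stand-ins for a general polynomial response surface.

```python
# Enumerate all submodels of {x1, x2, x1:x2} and keep those satisfying
# strong heredity: an interaction enters only if both parents are present.
from itertools import combinations

terms = ["x1", "x2", "x1:x2"]
parents = {"x1:x2": {"x1", "x2"}}

def strong_heredity(model):
    return all(parents.get(t, set()) <= set(model) for t in model)

all_models = [set(m) for r in range(len(terms) + 1)
              for m in combinations(terms, r)]
legal = [m for m in all_models if strong_heredity(m)]
print(f"{len(legal)} of {len(all_models)} models satisfy strong heredity:")
for m in legal:
    print("  ", sorted(m) or ["intercept only"])
```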

  9. Assessing global vegetation activity using spatio-temporal Bayesian modelling

    NASA Astrophysics Data System (ADS)

    Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.

    2016-04-01

This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows changes in vegetation and climate to be modelled simultaneously in space and time. Changes in vegetation activity, such as phenology, are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be attributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because local trends in space and time are better captured by the model. The regional subsets were defined according to the SREX segmentation defined by the IPCC. Each region is considered relatively homogeneous in terms of large-scale climate and biomes, while still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain, due to the absence of these large-scale patterns, than a global approach. This overall modelling approach allows the comparison of model behaviour across the different regions and may provide insights into the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompass global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings demonstrated the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support

  10. A Comparison of General Diagnostic Models (GDM) and Bayesian Networks Using a Middle School Mathematics Test

    ERIC Educational Resources Information Center

    Wu, Haiyan

    2013-01-01

    General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…

  12. Mapping soil water retention curves via spatial Bayesian hierarchical models

    NASA Astrophysics Data System (ADS)

    Yang, Wen-Hsi; Clifford, David; Minasny, Budiman

    2015-05-01

Soil water retention curves are an important parameter in soil hydrological modeling. These curves are usually represented by the van Genuchten model. Two approaches have previously been taken to predict curves across a field - interpolation of field measurements followed by estimation of the van Genuchten model parameters, or estimation of the parameters from field measurements followed by interpolation of the estimated parameters. Neither approach is ideal as, due to their two-stage nature, they fail to properly track uncertainty from one stage to the next. In this paper we address this shortcoming through a spatial Bayesian hierarchical model that fits the van Genuchten model and predicts the fields of its hydraulic parameters as well as the fields of the corresponding soil water retention curves. This approach expands the van Genuchten model into a hierarchical modeling framework. In this framework, soil properties and physical or environmental factors can be treated as covariates and added into the van Genuchten model hierarchically. Consequently, the effects of covariates on the hydraulic parameters of the van Genuchten model can be identified. In addition, our approach takes advantage of Bayesian analysis to account for uncertainty and overcome the shortcomings of other existing methods. The code used to fit these models is available as an appendix to this paper. We apply this approach to data surveyed from part of the alluvial plain of the river Rhône near Yenne in Savoie, France. In this data analysis, we demonstrate how the inclusion of soil type or spatial effects can improve the van Genuchten model's predictions of soil water retention curves.
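    For reference, the van Genuchten retention curve and a plain single-site least-squares fit; the paper's contribution is to embed this curve in a spatial Bayesian hierarchy rather than fit sites independently. The measurements below are hypothetical.

```python
# The van Genuchten water-retention function and a non-hierarchical fit.
import numpy as np
from scipy.optimize import curve_fit

def van_genuchten(h, theta_r, theta_s, alpha, n):
    """Water content theta as a function of suction head h (h > 0)."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m

# Hypothetical measurements: (suction head, volumetric water content)
h = np.array([1.0, 10.0, 31.6, 100.0, 316.0, 1000.0, 15849.0])
theta = np.array([0.43, 0.41, 0.35, 0.25, 0.16, 0.10, 0.05])

popt, _ = curve_fit(van_genuchten, h, theta,
                    p0=[0.05, 0.45, 0.05, 1.5],
                    bounds=([0, 0.2, 1e-4, 1.01], [0.2, 0.6, 1.0, 5.0]))
print(dict(zip(["theta_r", "theta_s", "alpha", "n"], popt.round(3))))
```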

  13. Predicting brain activity using a Bayesian spatial model.

    PubMed

    Derado, Gordana; Bowman, F Dubois; Zhang, Lijun

    2013-08-01

    Increasing the clinical applicability of functional neuroimaging technology is an emerging objective, e.g. for diagnostic and treatment purposes. We propose a novel Bayesian spatial hierarchical framework for predicting follow-up neural activity based on an individual's baseline functional neuroimaging data. Our approach attempts to overcome some shortcomings of the modeling methods used in other neuroimaging settings, by borrowing strength from the spatial correlations present in the data. Our proposed methodology is applicable to data from various imaging modalities including functional magnetic resonance imaging and positron emission tomography, and we provide an illustration here using positron emission tomography data from a study of Alzheimer's disease to predict disease progression.

  14. Theory-based Bayesian models of inductive learning and reasoning.

    PubMed

    Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles

    2006-07-01

    Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.

  15. Approximate Bayesian computation for forward modeling in cosmology

    SciTech Connect

Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar

    2015-08-01

Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, however, the likelihood function may be unavailable or intractable due to non-Gaussian errors, non-linear measurement processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be performed through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations, and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
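    A bare-bones rejection-ABC sketch of the ingredients listed above (prior sampling, summary statistics, a distance threshold) on the Gaussian toy model; the paper's implementation uses the more efficient PMC scheme and a Mahalanobis distance rather than the plain Euclidean distance and fixed epsilon used here.

```python
# Rejection ABC: keep parameter draws whose simulated summaries fall within
# epsilon of the observed summaries. All settings are illustrative.
import numpy as np

rng = np.random.default_rng(4)
obs = rng.normal(1.0, 2.0, 200)                  # "observed" data
s_obs = np.array([obs.mean(), obs.std()])        # summary statistics

def simulate(mu, sigma):
    x = rng.normal(mu, sigma, 200)
    return np.array([x.mean(), x.std()])

draws, eps, accepted = 50_000, 0.3, []
for _ in range(draws):
    mu, sigma = rng.uniform(-5, 5), rng.uniform(0.1, 5)    # prior draws
    if np.linalg.norm(simulate(mu, sigma) - s_obs) < eps:  # distance metric
        accepted.append((mu, sigma))

post = np.array(accepted)
print(f"acceptance rate {len(post) / draws:.4f}; "
      f"posterior means mu={post[:, 0].mean():.2f}, sigma={post[:, 1].mean():.2f}")
```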

  16. Bayesian Sensitivity Analysis of Statistical Models with Missing Data

    PubMed Central

    ZHU, HONGTU; IBRAHIM, JOSEPH G.; TANG, NIANSHENG

    2013-01-01

    Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures. PMID:24753718

  18. Bayesian Models for fMRI Data Analysis

    PubMed Central

    Zhang, Linlin; Guindani, Michele; Vannucci, Marina

    2015-01-01

    Functional magnetic resonance imaging (fMRI), a noninvasive neuroimaging method that provides an indirect measure of neuronal activity by detecting blood flow changes, has experienced an explosive growth in the past years. Statistical methods play a crucial role in understanding and analyzing fMRI data. Bayesian approaches, in particular, have shown great promise in applications. A remarkable feature of fully Bayesian approaches is that they allow a flexible modeling of spatial and temporal correlations in the data. This paper provides a review of the most relevant models developed in recent years. We divide methods according to the objective of the analysis. We start from spatio-temporal models for fMRI data that detect task-related activation patterns. We then address the very important problem of estimating brain connectivity. We also touch upon methods that focus on making predictions of an individual's brain activity or a clinical or behavioral response. We conclude with a discussion of recent integrative models that aim at combining fMRI data with other imaging modalities, such as EEG/MEG and DTI data, measured on the same subjects. We also briefly discuss the emerging field of imaging genetics. PMID:25750690

  19. Analysis of runoff extremes using spatial hierarchical Bayesian modeling

    NASA Astrophysics Data System (ADS)

    Reza Najafi, Mohammad; Moradkhani, Hamid

    2013-10-01

A spatial hierarchical Bayesian method is developed to model extreme runoff over two spatial domains in the Columbia River Basin, USA. This method combines the limited data available from different locations. The two spatial domains contain 31 and 20 gage stations, respectively, with daily streamflow records ranging from 30 to over 130 years. The generalized Pareto distribution (GPD) is employed for the analysis of extremes. Temporally independent data are generated using a declustering procedure, where runoff extremes are first grouped into clusters and then the maximum of each cluster is retained. The GPD scale parameter is modeled based on a Gaussian geostatistical process, and additional variables including the latitude, longitude, elevation, and drainage area are incorporated by means of a hierarchy. Metropolis-Hastings within Gibbs sampling is used to infer the parameters of the GPD and the geostatistical process and to estimate the return levels across the basins. The performance of the hierarchical Bayesian model is evaluated by comparing the estimates of 100 year return level floods with the maximum likelihood estimates at sites that are not used during the parameter inference process. Various prior distributions are used to assess the sensitivity of the posterior distributions. The selected model is then employed to estimate floods with different return levels in time slices of 15 years in order to detect possible trends in runoff extremes. The results show cyclic variations in the spatial average of the 100 year return level floods across the basins, with consistent increasing trends distinguishable in some areas.
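    A single-site sketch of the two preprocessing/fitting steps described above, runs declustering followed by a GPD fit to the cluster maxima, using scipy. The synthetic gamma-distributed "runoff" series, the threshold choice, and the run length are illustrative assumptions, and the hierarchical spatial pooling is not shown.

```python
# Decluster threshold exceedances, fit a GPD to the excesses, and compute an
# approximate 100-year return level for one hypothetical site.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(5)
runoff = rng.gamma(2.0, 50.0, 365 * 40)          # hypothetical daily runoff
u = np.quantile(runoff, 0.98)                    # threshold

def decluster(x, u, run=5):
    """Keep the maximum of each cluster; a cluster ends after `run` days below u."""
    peaks, current, gap = [], [], 0
    for v in x:
        if v > u:
            current.append(v)
            gap = 0
        elif current:
            gap += 1
            if gap >= run:
                peaks.append(max(current))
                current, gap = [], 0
    if current:
        peaks.append(max(current))
    return np.array(peaks)

peaks = decluster(runoff, u)
shape, _, scale = genpareto.fit(peaks - u, floc=0.0)
rate = len(peaks) / len(runoff)                  # cluster peaks per day
T = 100 * 365                                    # ~100-year return period in days
level = u + genpareto.ppf(1 - 1 / (T * rate), shape, scale=scale)
print(f"{len(peaks)} cluster peaks; 100-yr return level ~ {level:.0f}")
```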

  20. Model Selection in Historical Research Using Approximate Bayesian Computation

    PubMed Central

    Rubio-Campillo, Xavier

    2016-01-01

Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study: This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester's laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953

  1. Preferential sampling and Bayesian geostatistics: Statistical modeling and examples.

    PubMed

    Cecconi, Lorenzo; Grisotto, Laura; Catelan, Dolores; Lagazio, Corrado; Berrocal, Veronica; Biggeri, Annibale

    2016-08-01

    Preferential sampling refers to any situation in which the spatial process and the sampling locations are not stochastically independent. In this paper, we present two examples of geostatistical analysis in which the usual assumption of stochastic independence between the point process and the measurement process is violated. To account for preferential sampling, we specify a flexible and general Bayesian geostatistical model that includes a shared spatial random component. We apply the proposed model to two different case studies that allow us to highlight three different modeling and inferential aspects of geostatistical modeling under preferential sampling: (1) continuous or finite spatial sampling frame; (2) underlying causal model and relevant covariates; and (3) inferential goals related to mean prediction surface or prediction uncertainty.

  2. Bayesian spatiotemporal model of fMRI data.

    PubMed

    Quirós, Alicia; Diez, Raquel Montes; Gamerman, Dani

    2010-01-01

    This research describes a new Bayesian spatiotemporal model to analyse block-design BOLD fMRI studies. In the temporal dimension, we parameterise the hemodynamic response function's (HRF) shape with a potential increase of signal and a subsequent exponential decay. In the spatial dimension, we use Gaussian Markov random fields (GMRF) priors on activation characteristics parameters (location and magnitude) that embody our prior knowledge that evoked responses are spatially contiguous and locally homogeneous. The result is a spatiotemporal model with a small number of parameters, all of them interpretable. Simulations from the model are performed in order to ascertain the performance of the sampling scheme and the ability of the posterior to estimate model parameters, as well as to check the model sensitivity to signal to noise ratio. Results are shown on synthetic data and on real data from a block-design fMRI experiment.

  3. Efficient multilevel brain tumor segmentation with integrated bayesian model classification.

    PubMed

    Corso, J J; Sharon, E; Dube, S; El-Saden, S; Sinha, U; Yuille, A

    2008-05-01

    We present a new method for automatic segmentation of heterogeneous image data that takes a step toward bridging the gap between bottom-up affinity-based segmentation methods and top-down generative model based approaches. The main contribution of the paper is a Bayesian formulation for incorporating soft model assignments into the calculation of affinities, which are conventionally model free. We integrate the resulting model-aware affinities into the multilevel segmentation by weighted aggregation algorithm, and apply the technique to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes. The computationally efficient method runs orders of magnitude faster than current state-of-the-art techniques giving comparable or improved results. Our quantitative results indicate the benefit of incorporating model-aware affinities into the segmentation process for the difficult case of glioblastoma multiforme brain tumor.

  4. Exploratory Bayesian model selection for serial genetics data.

    PubMed

    Zhao, Jing X; Foulkes, Andrea S; George, Edward I

    2005-06-01

    Characterizing the process by which molecular and cellular level changes occur over time will have broad implications for clinical decision making and help further our knowledge of disease etiology across many complex diseases. However, this presents an analytic challenge due to the large number of potentially relevant biomarkers and the complex, uncharacterized relationships among them. We propose an exploratory Bayesian model selection procedure that searches for model simplicity through independence testing of multiple discrete biomarkers measured over time. Bayes factor calculations are used to identify and compare models that are best supported by the data. For large model spaces, i.e., a large number of multi-leveled biomarkers, we propose a Markov chain Monte Carlo (MCMC) stochastic search algorithm for finding promising models. We apply our procedure to explore the extent to which HIV-1 genetic changes occur independently over time.

  5. Examples of Mixed-Effects Modeling with Crossed Random Effects and with Binomial Data

    ERIC Educational Resources Information Center

    Quene, Hugo; van den Bergh, Huub

    2008-01-01

    Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not nested. Traditional ANOVAs are compared against…

  7. An intuitive Bayesian spatial model for disease mapping that accounts for scaling.

    PubMed

    Riebler, Andrea; Sørbye, Sigrunn H; Simpson, Daniel; Rue, Håvard

    2016-08-01

In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed, but all come with inherent issues. In the classical BYM (Besag, York and Mollié) model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue of how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control, as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and makes these transferable between spatial applications with different graph structures. The hyperparameters themselves are used to define flexible extensions of simple base models. Consequently, penalised complexity priors for these parameters can be derived based on the information-theoretic distance from the flexible model to the base model, giving priors with clear interpretation. We provide implementation details for the new model formulation which preserve sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, both in terms of learning ability and shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least as well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning.

  8. Model Selection in Historical Research Using Approximate Bayesian Computation.

    PubMed

    Rubio-Campillo, Xavier

    2016-01-01

    Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester's laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence.
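    The competing hypotheses are small systems of ODEs. As an illustration, here is a minimal Euler integration of Lanchester's square law (the formulation usually associated with aimed fire), with made-up force sizes and attrition coefficients; the paper's analysis wraps such simulators in ABC with Bayes-factor model selection.

```python
# Lanchester square law: dx/dt = -b*y, dy/dt = -a*x, integrated until one
# side is annihilated. Force sizes and coefficients are hypothetical.
def lanchester_square(x0, y0, a, b, dt=0.01, steps=100_000):
    x, y = float(x0), float(y0)
    for _ in range(steps):
        # simultaneous Euler update (RHS uses the old values of x and y)
        x, y = x - b * y * dt, y - a * x * dt
        if x <= 0 or y <= 0:
            break
    return max(x, 0.0), max(y, 0.0)

# Square-law intuition: with equal effectiveness, strength enters as x0^2 vs y0^2
print(lanchester_square(x0=1000, y0=800, a=1.0, b=1.0))
```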

  9. Efficient estimation of thermodynamic state incorporating Bayesian model order selection

    NASA Astrophysics Data System (ADS)

    Lanterman, Aaron D.; Cooper, Matthew L.; Miller, Michael I.

    1999-08-01

The recognition of targets in infrared scenes is complicated by the wide variety of appearances associated with different thermodynamic states. We represent the variability in the thermodynamic signatures of targets via an expansion in terms of 'eigentanks' derived from a principal component analysis performed over the target's surface. Employing a Poisson sensor likelihood, or equivalently a likelihood based on Csiszar's I-divergence, a natural discrepancy measure for nonnegative images, yields a coupled set of nonlinear equations which must be solved to compute maximum a posteriori estimates of the thermodynamic expansion coefficients. We propose a weighted least-squares approximation to the Poisson log-likelihood for which the MAP estimates are solutions of linear equations. Bayesian model order estimation techniques are employed to choose the number of coefficients; this prevents target models with numerous eigentanks in their representation from having an unfair advantage over simple target models. The Bayesian integral is approximated by Schwarz's application of Laplace's method of integration; this technique is closely related to Rissanen's minimum description length and Wallace's minimum message length criteria. Our implementation of these techniques on Silicon Graphics computers exploits the flexible nature of their rendering engines. The implementation is illustrated in estimating the orientation of a tank and the optimum number of representative eigentanks for real data provided by the U.S. Army Night Vision and Electronic Sensors Directorate.

  10. Fuzzy Naive Bayesian model for medical diagnostic decision support.

    PubMed

    Wagholikar, Kavishwar B; Vijayraghavan, Sundararajan; Deshpande, Ashok W

    2009-01-01

This work relates to the development of computational algorithms to provide decision support to physicians. The authors propose a Fuzzy Naive Bayesian (FNB) model for medical diagnosis, which extends the Fuzzy Bayesian approach proposed by Okuda. A physician-interview-based method is described to define an orthogonal fuzzy symptom information system, required to apply the model. For the purpose of elaboration and elicitation of characteristics, the algorithm is applied to a simple simulated dataset and compared with the conventional Naive Bayes (NB) approach. As a preliminary evaluation of FNB in a real-world scenario, the comparison is repeated on a real fuzzy dataset of 81 patients diagnosed with infectious diseases. The case study on the simulated dataset elucidates that FNB can be optimal over NB for diagnosing patients with imprecise, fuzzy information, on account of the following characteristics: 1) it can model the information that values of some attributes are semantically closer than values of other attributes, and 2) it offers a mechanism to temper exaggerations in patient information. Although the algorithm requires precise training data, its utility for fuzzy training data is argued for. This is supported by the case study on the infectious disease dataset, which indicates the optimality of FNB over NB for the infectious disease domain. Further case studies on large datasets are required to establish the utility of FNB.

  11. Bayesian Learning of a Language Model from Continuous Speech

    NASA Astrophysics Data System (ADS)

    Neubig, Graham; Mimura, Masato; Mori, Shinsuke; Kawahara, Tatsuya

    We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexity (such as the size of the learned vocabulary) and expressive power, and provide a principled learning algorithm through the use of Gibbs sampling. Implementation is performed using weighted finite state transducers (WFSTs), which allow for the simple handling of lattice input. Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates. The proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.

  12. Bayesian model selection for characterizing genomic imprinting effects and patterns

    PubMed Central

    Yang, Runqing; Wang, Xin; Wu, Zeyuan; Prows, Daniel R.; Lin, Min

    2010-01-01

    Motivation: Although imprinted genes have been ubiquitously observed in nature, statistical methodology still has not been systematically developed for jointly characterizing genomic imprinting effects and patterns. To detect imprinting genes influencing quantitative traits, the least square and maximum likelihood approaches for fitting a single quantitative trait loci (QTL) and Bayesian method for simultaneously modeling multiple QTLs have been adopted in various studies. Results: In a widely used F2 reciprocal mating population for mapping imprinting genes, we herein propose a genomic imprinting model which describes additive, dominance and imprinting effects of multiple imprinted quantitative trait loci (iQTL) for traits of interest. Depending upon the estimates of the above genetic effects, we categorized imprinting patterns into seven types, which provides a complete classification scheme for describing imprinting patterns. Bayesian model selection was employed to identify iQTL along with many genetic parameters in a computationally efficient manner. To make statistical inference on the imprinting types of iQTL detected, a set of Bayes factors were formulated using the posterior probabilities for the genetic effects being compared. We demonstrated the performance of the proposed method by computer simulation experiments and then applied this method to two real datasets. Our approach can be generally used to identify inheritance modes and determine the contribution of major genes for quantitative variations. Contact: annie.lin@duke.edu; runqingyang@sjtu.edu.cn PMID:19880366

  13. Enhancing debris flow modeling parameters integrating Bayesian networks

    NASA Astrophysics Data System (ADS)

    Graf, C.; Stoffel, M.; Grêt-Regamey, A.

    2009-04-01

Applied debris-flow modeling requires suitably constrained input parameter sets. Depending on the model used, a series of parameters must be defined before running it. Normally, the database describing the event, the initiation conditions, the flow behavior, the deposition process and, most importantly, the potential range of possible debris-flow events in a given torrent is limited. There are only a few places in the world where valuable data sets describing the event history of debris-flow channels can be found, delivering information on the spatial and temporal distribution of former flow paths and deposition zones. Tree-ring records in combination with detailed geomorphic mapping, for instance, provide such data sets over a long time span. Considering the significant loss potential associated with debris-flow disasters, it is crucial that decisions made regarding hazard mitigation are based on a consistent assessment of the risks. This in turn necessitates a proper assessment of the uncertainties involved in modeling debris-flow frequencies and intensities, the possible run-out extent, as well as estimates of the damage potential. In this study, we link a Bayesian network to a Geographic Information System in order to assess debris-flow risk. We identify the major sources of uncertainty and show the potential of Bayesian inference techniques to improve the debris-flow model. We model the flow paths and deposition zones of a highly active debris-flow channel in the Swiss Alps using the numerical 2-D model RAMMS. Because uncertainties in run-out areas cause large changes in risk estimations, we use flow-path and deposition-zone information from reconstructed debris-flow events, derived from dendrogeomorphological analysis covering more than 400 years, to update the input parameters of the RAMMS model. The probabilistic model, which consistently incorporates this available information, can serve as a basis for spatial risk

  14. Markov chain Monte Carlo simulation for Bayesian Hidden Markov Models

    NASA Astrophysics Data System (ADS)

    Chan, Lay Guat; Ibrahim, Adriana Irawati Nur Binti

    2016-10-01

A hidden Markov model (HMM) is a mixture model whose mixing distribution is a Markov chain with finitely many states. HMMs have been applied in a variety of fields, such as speech and face recognition. The main purpose of this study is to investigate the Bayesian approach to HMMs. Under this approach, we can simulate from the parameters' posterior distribution using Markov chain Monte Carlo (MCMC) sampling methods. HMMs are useful, but they have some limitations. Therefore, by using the Mixture of Dirichlet processes Hidden Markov Model (MDPHMM) based on Yau et al. (2011), we hope to overcome these limitations. We conduct a simulation study using MCMC methods to investigate the performance of this model.
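    The likelihood that any such sampler repeatedly evaluates is computed by the forward algorithm; a scaled numpy implementation for discrete emissions is sketched below with made-up parameter values. A Bayesian treatment would alternate this with MCMC updates of the initial, transition, and emission probabilities.

```python
# Scaled forward algorithm: log p(obs) for a discrete-emission HMM in O(T K^2).
import numpy as np

def forward_loglik(obs, pi, A, B):
    """pi: K initial probs, A: KxK transition matrix, B: KxM emission matrix."""
    alpha = pi * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha /= alpha.sum()                 # rescale to avoid numerical underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        logp += np.log(alpha.sum())
        alpha /= alpha.sum()
    return logp

# Hypothetical two-state, two-symbol HMM
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.1, 0.9]])
obs = np.array([0, 0, 1, 1, 1, 0])
print("log-likelihood:", forward_loglik(obs, pi, A, B))
```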

  15. Multiple testing on standardized mortality ratios: a Bayesian hierarchical model for FDR estimation.

    PubMed

    Ventrucci, Massimo; Scott, E Marian; Cocchi, Daniela

    2011-01-01

The analysis of large data sets of standardized mortality ratios (SMRs), obtained by collecting observed and expected disease counts in a map of contiguous regions, is a first step in descriptive epidemiology to detect potential environmental risk factors. A common situation arises when counts are collected in small areas, that is, where the expected count is very low, and the disease risks underlying the map are spatially correlated. Traditional p-value-based methods, which control the false discovery rate (FDR) by means of Poisson p-values, may have low sensitivity in identifying risk in small areas. This problem is the focus of the present work, where we propose a Bayesian approach that tests the null hypothesis of no risk for each SMR and controls the posterior FDR. A Bayesian hierarchical model including spatial random effects, to allow for extra-Poisson variability, is implemented, providing estimates of the posterior probabilities that the null hypothesis of absence of risk is true. By means of these posterior probabilities, an estimate of the posterior FDR conditional on the data can be computed. A conservative estimate is needed to achieve the control, which is checked by simulation. The availability of this estimate allows the practitioner to determine non-arbitrary FDR-based selection rules to identify high-risk areas according to a preset FDR level. The sensitivity and specificity of FDR-based rules are studied via simulation, and a comparison with p-value-based rules is also shown. A real data set is analyzed using rules based on several FDR levels.
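    The selection rule the abstract describes reduces to a short computation once the posterior null probabilities are available: sort areas by P(no risk | data) and flag the largest set whose running average stays below the preset level. A sketch with hypothetical posterior probabilities:

```python
# Posterior-FDR selection: the posterior FDR of reporting a set S is the
# average posterior null probability over S, so add areas in increasing
# order of that probability until the preset FDR level is exceeded.
import numpy as np

def select_by_posterior_fdr(p_null, level=0.05):
    order = np.argsort(p_null)
    running_fdr = np.cumsum(p_null[order]) / np.arange(1, p_null.size + 1)
    k = np.searchsorted(running_fdr, level, side="right")
    return order[:k]                     # indices of areas flagged as high-risk

p_null = np.array([0.001, 0.02, 0.04, 0.10, 0.50, 0.91])   # hypothetical
print("flagged areas:", select_by_posterior_fdr(p_null))
```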

  16. Bayesian methods for model choice and propagation of model uncertainty in groundwater transport modeling

    NASA Astrophysics Data System (ADS)

    Mendes, B. S.; Draper, D.

    2008-12-01

The issue of model uncertainty and model choice is central in any groundwater modeling effort [Neuman and Wierenga, 2003]; among the several approaches to the problem, we favour using Bayesian statistics because it integrates, in a natural way, uncertainties arising from any source with experimental data. In this work, we experiment with several Bayesian approaches to model choice, focusing primarily on demonstrating the usefulness of the Reversible Jump Markov Chain Monte Carlo (RJMCMC) simulation method [Green, 1995], an extension of the now-common MCMC methods. Standard MCMC techniques approximate posterior distributions for quantities of interest, often by creating a random walk in parameter space; RJMCMC allows the random walk to take place between parameter spaces with different dimensionalities. This allows us to explore state spaces that are associated with different deterministic models for experimental data. Our work is exploratory in nature; we restrict our study to comparing two simple transport models applied to a data set gathered to estimate the breakthrough curve for a tracer compound in groundwater. One model has a mean surface based on a simple advection-dispersion differential equation; the second model's mean surface is also governed by a differential equation, but in two dimensions. We focus on artificial data sets (in which truth is known) to see if model identification is done correctly, but we also address the issues of over- and under-parameterization, and we compare RJMCMC's performance with other traditional methods for model selection and propagation of model uncertainty, including Bayesian model averaging, BIC and DIC.
References: Neuman and Wierenga (2003). A Comprehensive Strategy of Hydrogeologic Modeling and Uncertainty Analysis for Nuclear Facilities and Sites. NUREG/CR-6805, Division of Systems Analysis and Regulatory Effectiveness, Office of Nuclear Regulatory Research, U.S. Nuclear Regulatory Commission

  17. Development and comparison in uncertainty assessment based Bayesian modularization method in hydrological modeling

    NASA Astrophysics Data System (ADS)

    Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn

    2013-04-01

    With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches handles the impact of high flows in hydrological modeling well. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with the standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provides the best uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian methods.
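
    For readers unfamiliar with the sampler named above, a generic random-walk Metropolis-Hastings loop looks as follows; the log-posterior here is a stand-in, whereas a real calibration would combine the WASMOD likelihood (e.g., AR(1) plus Normal errors) with the parameter priors.

        import numpy as np

        rng = np.random.default_rng(42)

        def log_post(theta):
            # stand-in log-posterior; replace with likelihood + prior
            return -0.5 * np.sum((theta - 1.0) ** 2)

        def metropolis_hastings(theta0, n_iter=5000, step=0.5):
            theta = np.asarray(theta0, dtype=float)
            lp = log_post(theta)
            chain = np.empty((n_iter, theta.size))
            for i in range(n_iter):
                prop = theta + step * rng.standard_normal(theta.size)
                lp_prop = log_post(prop)
                if np.log(rng.random()) < lp_prop - lp:   # accept or reject
                    theta, lp = prop, lp_prop
                chain[i] = theta
            return chain

        chain = metropolis_hastings([0.0, 0.0])
        print(chain[2500:].mean(axis=0))   # posterior means after burn-in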

  18. Bayesian Dose-Response Modeling in Sparse Data

    NASA Astrophysics Data System (ADS)

    Kim, Steven B.

    This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, we discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a

  19. Semiparametric Bayesian local functional models for diffusion tensor tract statistics

    PubMed Central

    Hua, Zhaowei; Dunson, David B.; Gilmore, John H.; Styner, Martin A.; Zhu, Hongtu

    2012-01-01

    We propose a semiparametric Bayesian local functional model (BFM) for the analysis of multiple diffusion properties (e.g., fractional anisotropy) along white matter fiber bundles with a set of covariates of interest, such as age and gender. BFM accounts for heterogeneity in the shape of the fiber bundle diffusion properties among subjects, while allowing the impact of the covariates to vary across subjects. A nonparametric Bayesian LPP2 prior facilitates global and local borrowings of information among subjects, while an infinite factor model flexibly represents low-dimensional structure. Local hypothesis testing and credible bands are developed to identify fiber segments, along which multiple diffusion properties are significantly associated with covariates of interest, while controlling for multiple comparisons. Moreover, BFM naturally groups subjects into more homogeneous clusters. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFM. We apply BFM to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment in newborn infants. PMID:22732565

  20. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    PubMed Central

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  1. Path integration mediated systematic search: a Bayesian model.

    PubMed

    Vickerstaff, Robert J; Merkle, Tobias

    2012-08-21

    The systematic search behaviour is a backup system that increases the chances of desert ants finding their nest entrance after foraging when the path integrator has failed to guide them home accurately enough. Here we present a mathematical model of the systematic search that is based on extensive behavioural studies in North African desert ants Cataglyphis fortis. First, a simple search heuristic utilising Bayesian inference and a probability density function is developed. This model, which optimises the short-term nest detection probability, is then compared to three simpler search heuristics and to recorded search patterns of Cataglyphis ants. To compare the different searches a method to quantify search efficiency is established as well as an estimate of the error rate in the ants' path integrator. We demonstrate that the Bayesian search heuristic is able to automatically adapt to increasing levels of positional uncertainty to produce broader search patterns, just as desert ants do, and that it outperforms the three other search heuristics tested. The searches produced by it are also arguably the most similar in appearance to the ants' searches. Copyright © 2012 Elsevier Ltd. All rights reserved.

  2. Advanced REACH Tool: A Bayesian Model for Occupational Exposure Assessment

    PubMed Central

    McNally, Kevin; Warren, Nicholas; Fransman, Wouter; Entink, Rinke Klein; Schinkel, Jody; van Tongeren, Martie; Cherrie, John W.; Kromhout, Hans; Schneider, Thomas; Tielemans, Erik

    2014-01-01

    This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate sources of information within a Bayesian statistical framework. The information is obtained from expert knowledge expressed in a calibrated mechanistic model of exposure assessment, data on inter- and intra-individual variability in exposures from the literature, and context-specific exposure measurements. The ART provides central estimates and credible intervals for different percentiles of the exposure distribution, for full-shift and long-term average exposures. The ART can produce exposure estimates in the absence of measurements, but the precision of the estimates improves as more data become available. The methodology presented in this paper is able to utilize partially analogous data, a novel approach designed to make efficient use of a sparsely populated measurement database, although some additional research is still required before practical implementation. The methodology is demonstrated using two worked examples: an exposure to copper pyrithione in the spraying of antifouling paints and an exposure to ethyl acetate in shoe repair. PMID:24665110

  3. Bayesian Energy Landscape Tilting: Towards Concordant Models of Molecular Ensembles

    PubMed Central

    Beauchamp, Kyle A.; Pande, Vijay S.; Das, Rhiju

    2014-01-01

    Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and 3J measurements gives convergent values of the peptide’s α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT’s principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. PMID:24655513
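
    The core reweighting move behind such maximum-entropy schemes fits in a few lines: tilt the weight of each simulation frame by exp(lambda * x) and solve for the tilt that makes the reweighted average of a computed observable match the measurement. This toy version handles a single observable and skips BELT's MCMC sampling of hyperensembles; all numbers are synthetic.

        import numpy as np
        from scipy.optimize import brentq

        rng = np.random.default_rng(0)
        x = rng.normal(2.0, 1.0, size=1000)   # observable computed per frame
        target = 2.5                          # experimental value to match

        def tilted_mean(lam):
            w = np.exp(lam * x)
            w /= w.sum()
            return np.sum(w * x)

        # find the exponential tilt that reproduces the measurement
        lam = brentq(lambda l: tilted_mean(l) - target, -10.0, 10.0)
        w = np.exp(lam * x)
        w /= w.sum()
        print(lam, np.sum(w * x))   # reweighted ensemble matches the target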

  4. Perceptual decision making: drift-diffusion model is equivalent to a Bayesian model

    PubMed Central

    Bitzer, Sebastian; Park, Hame; Blankenburg, Felix; Kiebel, Stefan J.

    2014-01-01

    Behavioral data obtained with perceptual decision making experiments are typically analyzed with the drift-diffusion model. This parsimonious model accumulates noisy pieces of evidence toward a decision bound to explain the accuracy and reaction times of subjects. Recently, Bayesian models have been proposed to explain how the brain extracts information from noisy input as typically presented in perceptual decision making tasks. It has long been known that the drift-diffusion model is tightly linked with such functional Bayesian models but the precise relationship of the two mechanisms was never made explicit. Using a Bayesian model, we derived the equations which relate parameter values between these models. In practice we show that this equivalence is useful when fitting multi-subject data. We further show that the Bayesian model suggests different decision variables which all predict equal responses and discuss how these may be discriminated based on neural correlates of accumulated evidence. In addition, we discuss extensions to the Bayesian model which would be difficult to derive for the drift-diffusion model. We suggest that these and other extensions may be highly useful for deriving new experiments which test novel hypotheses. PMID:24616689
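
    The equivalence is easy to demonstrate numerically: accumulating the log-likelihood ratio of noisy samples until a bound is crossed is a discrete-time drift-diffusion process, with the drift set by the signal-to-noise ratio of the Bayesian observer's generative model. The toy two-choice task below is illustrative, not the authors' fitting procedure.

        import numpy as np

        rng = np.random.default_rng(1)

        def decide(mu=0.1, sigma=1.0, bound=3.0):
            """Accumulate log-odds for H1: x ~ N(+mu, sigma) versus
            H2: x ~ N(-mu, sigma); each increment is (2*mu/sigma**2) * x,
            i.e., a drift-diffusion step."""
            evidence, t = 0.0, 0
            while abs(evidence) < bound:
                x = rng.normal(mu, sigma)             # stimulus favours H1
                evidence += 2.0 * mu * x / sigma**2   # log-likelihood ratio
                t += 1
            return evidence > 0, t                    # (choice, decision time)

        choices, rts = zip(*(decide() for _ in range(2000)))
        print(np.mean(choices), np.mean(rts))   # accuracy and mean decision time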

  5. Standardized Mean Differences in Two-Level Cross-Classified Random Effects Models

    ERIC Educational Resources Information Center

    Lai, Mark H. C.; Kwok, Oi-Man

    2014-01-01

    Multilevel modeling techniques are becoming more popular in handling data with multilevel structure in educational and behavioral research. Recently, researchers have paid more attention to cross-classified data structure that naturally arises in educational settings. However, unlike traditional single-level research, methodological studies about…

  7. Longitudinal Analysis with Regressions among Random Effects: A Latent Variable Modeling Approach

    ERIC Educational Resources Information Center

    Raykov, Tenko

    2007-01-01

    A didactic discussion of a latent variable modeling approach is presented that addresses frequent empirical concerns of social, behavioral, and educational researchers involved in longitudinal studies. The method is suitable when the purpose is to analyze repeated measure data along several interrelated dimensions and to explain some of the…

  8. Bayesian inferences for beta semiparametric-mixed models to analyze longitudinal neuroimaging data.

    PubMed

    Wang, Xiao-Feng; Li, Yingxing

    2014-07-01

    Diffusion tensor imaging (DTI) is a quantitative magnetic resonance imaging technique that measures the three-dimensional diffusion of water molecules within tissue through the application of multiple diffusion gradients. This technique is rapidly increasing in popularity for studying white matter properties and structural connectivity in the living human brain. One of the major outcomes derived from the DTI process is known as fractional anisotropy, a continuous measure restricted to the interval (0,1). Motivated by a longitudinal DTI study of multiple sclerosis, we use a beta semiparametric-mixed regression model for the neuroimaging data. This work extends the generalized additive model methodology with the beta distribution family and random effects. We describe two estimation methods with penalized splines, which are formalized under a Bayesian inferential perspective. The first one is carried out by Markov chain Monte Carlo (MCMC) simulations, while the second one uses a relatively new technique called integrated nested Laplace approximation (INLA). Simulations and the neuroimaging data analysis show that the estimates obtained from both approaches are stable and similar, while the INLA method provides an efficient alternative to the computationally expensive MCMC method.

  9. Mixed-effects varying-coefficient model with skewed distribution coupled with cause-specific varying-coefficient hazard model with random-effects for longitudinal-competing risks data analysis.

    PubMed

    Lu, Tao; Wang, Min; Liu, Guangying; Dong, Guang-Hui; Qian, Feng

    2016-01-01

    It is well known that there is a strong relationship between HIV viral load and CD4 cell counts in AIDS studies. However, the relationship between them changes during the course of treatment and may vary among individuals. During treatments, some individuals may experience terminal events such as death. Because the terminal event may be related to the individual's viral load measurements, the terminal mechanism is non-ignorable. Furthermore, there exist competing risks from multiple types of events, such as AIDS-related death and other death. Most joint models for the analysis of longitudinal-survival data developed in the literature have focused on constant coefficients and assume symmetric distributions for the endpoints, which does not meet the needs for investigating the nature of the varying relationship between HIV viral load and CD4 cell counts in practice. We develop a mixed-effects varying-coefficient model with skewed distribution coupled with a cause-specific varying-coefficient hazard model with random-effects to deal with the varying relationship between the two endpoints for longitudinal-competing risks survival data. A fully Bayesian inference procedure is established to estimate parameters in the joint model. The proposed method is applied to a multicenter AIDS cohort study. Various candidate models, each accounting for different features of the data, are compared. Some interesting findings are presented.

  10. Modeling the Climatology of Tornado Occurrence with Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Cheng, Vincent Y. S.

    Our mechanistic understanding of tornadic environments has been significantly improved by recent technological enhancements in the detection of tornadoes as well as advances in numerical weather prediction modeling. Nonetheless, despite decades of active research, prediction of tornado occurrence remains one of the most difficult problems in meteorological and climate science. In our efforts to develop predictive tools for tornado occurrence, there are a number of issues to overcome, such as the treatment of inconsistent tornado records, the consideration of a suitable combination of atmospheric predictors, and the selection of appropriate resolution to accommodate the variability in time and space. In this dissertation, I address each of these topics by undertaking three empirical (statistical) modeling studies, where I examine the signature of different atmospheric factors influencing tornado occurrence, the sampling biases in tornado observations, and the optimal spatiotemporal resolution for studying tornado occurrence. In the first study, I develop a novel Bayesian statistical framework to assess the probability of tornado occurrence in Canada, in which the sampling bias of tornado observations and the linkage between lightning climatology and tornadogenesis are considered. The results produced reasonable probability estimates of tornado occurrence for the under-sampled areas in the model domain. The same study also delineated the geographical variability in the lightning-tornado relationship across Canada. In the second study, I present a novel modeling framework to examine the relative importance of several key atmospheric variables (e.g., convective available potential energy, 0-3 km storm-relative helicity, 0-6 km bulk wind difference, 0-tropopause vertical wind shear) on tornado activity in North America. I found that the variable quantifying the updraft strength is more important during the warm season, whereas the effects of wind

  11. Confidence Intervals for the Between-Study Variance in Random Effects Meta-Analysis Using Generalised Cochran Heterogeneity Statistics

    ERIC Educational Resources Information Center

    Jackson, Dan

    2013-01-01

    Statistical inference is problematic in the common situation in meta-analysis where the random effects model is fitted to just a handful of studies. In particular, the asymptotic theory of maximum likelihood provides a poor approximation, and Bayesian methods are sensitive to the prior specification. Hence, less efficient, but easily computed and…

  12. Tests of Bayesian model selection techniques for gravitational wave astronomy

    SciTech Connect

    Cornish, Neil J.; Littenberg, Tyson B.

    2007-10-15

    The analysis of gravitational wave data involves many model selection problems. The most important example is the detection problem of selecting between the data being consistent with instrument noise alone, or instrument noise and a gravitational wave signal. The analysis of data from ground based gravitational wave detectors is mostly conducted using classical statistics, and methods such as the Neyman-Pearson criterion are used for model selection. Future space based detectors, such as the Laser Interferometer Space Antenna (LISA), are expected to produce rich data streams containing the signals from many millions of sources. Determining the number of sources that are resolvable, and the most appropriate description of each source poses a challenging model selection problem that may best be addressed in a Bayesian framework. An important class of LISA sources are the millions of low-mass binary systems within our own galaxy, tens of thousands of which will be detectable. Not only are the number of sources unknown, but so are the number of parameters required to model the waveforms. For example, a significant subset of the resolvable galactic binaries will exhibit orbital frequency evolution, while a smaller number will have measurable eccentricity. In the Bayesian approach to model selection one needs to compute the Bayes factor between competing models. Here we explore various methods for computing Bayes factors in the context of determining which galactic binaries have measurable frequency evolution. The methods explored include a reversible jump Markov chain Monte Carlo algorithm, Savage-Dickey density ratios, the Schwarz-Bayes information criterion, and the Laplace approximation to the model evidence. We find good agreement between all of the approaches.
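
    For a nested point null, the Savage-Dickey ratio mentioned above reduces to evaluating one posterior and one prior density at the null value. A conjugate normal toy problem (a stand-in for testing whether a binary's frequency derivative is zero; all numbers hypothetical) shows the mechanics.

        import numpy as np
        from scipy.stats import norm

        # H1: parameter fdot ~ N(0, tau^2) a priori; H0: fdot = 0.
        # Data: n noisy measurements with known sigma.
        tau, sigma, n = 1.0, 1.0, 25
        rng = np.random.default_rng(7)
        data = rng.normal(0.3, sigma, size=n)

        # conjugate normal posterior for fdot under H1
        post_var = 1.0 / (1.0 / tau**2 + n / sigma**2)
        post_mean = post_var * data.sum() / sigma**2

        # BF_01 = p(fdot = 0 | data, H1) / p(fdot = 0 | H1)
        bf01 = norm.pdf(0, post_mean, np.sqrt(post_var)) / norm.pdf(0, 0, tau)
        print(bf01)   # < 1 favours frequency evolution, > 1 favours the null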

  13. Deletion diagnostics for the generalised linear mixed model with independent random effects.

    PubMed

    Ganguli, B; Roy, S Sen; Naskar, M; Malloy, E J; Eisen, E A

    2016-04-30

    The generalised linear mixed model (GLMM) is widely used for modelling environmental data. However, such data are prone to influential observations, which can distort the estimated exposure-response curve particularly in regions of high exposure. Deletion diagnostics for iterative estimation schemes commonly derive the deleted estimates based on a single iteration of the full system holding certain pivotal quantities such as the information matrix to be constant. In this paper, we present an approximate formula for the deleted estimates and Cook's distance for the GLMM, which does not assume that the estimates of variance parameters are unaffected by deletion. The procedure allows the user to calculate standardised DFBETAs for mean as well as variance parameters. In certain cases such as when using the GLMM as a device for smoothing, such residuals for the variance parameters are interesting in their own right. In general, the procedure leads to deleted estimates of mean parameters, which are corrected for the effect of deletion on variance components as estimation of the two sets of parameters is interdependent. The probabilistic behaviour of these residuals is investigated and a simulation based procedure suggested for their standardisation. The method is used to identify influential individuals in an occupational cohort exposed to silica. The results show that failure to conduct post model fitting diagnostics for variance components can lead to erroneous conclusions about the fitted curve and unstable confidence intervals. Copyright © 2015 John Wiley & Sons, Ltd.

  14. Fast Bayesian Inference in Dirichlet Process Mixture Models.

    PubMed

    Wang, Lianming; Dunson, David B

    2011-01-01

    There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditional on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
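
    The flavour of a sequential greedy search can be conveyed with a toy DPM of univariate normals with known variance: each observation joins whichever existing cluster (or a new one) has the highest posterior predictive weight. This is a deliberate simplification of the paper's criterion, and the hyperparameters are made up.

        import numpy as np
        from scipy.stats import norm

        alpha, tau, s = 1.0, 3.0, 0.5    # DP concentration, prior sd, noise sd
        rng = np.random.default_rng(3)
        data = np.concatenate([rng.normal(-2, s, 50), rng.normal(2, s, 50)])

        clusters = []                    # each cluster holds [count, sum]
        for x in data:
            weights = []
            for n_k, S_k in clusters:
                v = 1.0 / (1.0 / tau**2 + n_k / s**2)  # posterior var of mean
                m = v * S_k / s**2                     # posterior mean
                weights.append(n_k * norm.pdf(x, m, np.sqrt(v + s**2)))
            weights.append(alpha * norm.pdf(x, 0, np.sqrt(tau**2 + s**2)))
            k = int(np.argmax(weights))                # greedy assignment
            if k == len(clusters):
                clusters.append([1, x])                # open a new cluster
            else:
                clusters[k][0] += 1
                clusters[k][1] += x

        print(len(clusters))             # recovers ~2 clusters for this data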

  15. Bayesian Inference for Duplication–Mutation with Complementarity Network Models

    PubMed Central

    Persing, Adam; Beskos, Alexandros; Heine, Kari; De Iorio, Maria

    2015-01-01

    We observe an undirected graph G without multiple edges and self-loops, which is to represent a protein–protein interaction (PPI) network. We assume that G evolved under the duplication–mutation with complementarity (DMC) model from a seed graph, G0, and we also observe the binary forest Γ that represents the duplication history of G. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis–Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate a high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters. PMID:26355682

  16. Aggregated Residential Load Modeling Using Dynamic Bayesian Networks

    SciTech Connect

    Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai

    2014-09-28

    It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools are necessary for DR management of operations and planning that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planning, due to their unique analytical capabilities. The DBN model's accuracy and flexibility of use are demonstrated by testing the model under different operational scenarios.

  17. Development of a Bayesian Belief Network Runway Incursion Model

    NASA Technical Reports Server (NTRS)

    Green, Lawrence L.

    2014-01-01

    In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.

  18. A generalizable hierarchical Bayesian model for persistent SAR change detection

    NASA Astrophysics Data System (ADS)

    Newstadt, Gregory E.; Zelnio, Edmund G.; Hero, Alfred O., III

    2012-05-01

    This paper proposes a hierarchical Bayesian model for multiple-pass, multiple antenna synthetic aperture radar (SAR) systems with the goal of adaptive change detection. We model the SAR phenomenology directly, including antenna and spatial dependencies, speckle and specular noise, and stationary clutter. We extend previous work by estimating the antenna covariance matrix directly, leading to improved performance in high clutter regions. The proposed SAR model is also shown to be easily generalizable when additional prior information is available, such as locations of roads/intersections or smoothness priors on the target motion. The performance of our posterior inference algorithm is analyzed over a large set of measured SAR imagery. It is shown that the proposed algorithm provides results competitive with or better than common change detection algorithms, with additional benefits such as few tuning parameters and a characterization of the posterior distribution.

  19. A Bayesian two part model applied to analyze risk factors of adult mortality with application to data from Namibia.

    PubMed

    Kazembe, Lawrence N

    2013-01-01

    Despite remarkable gains in life expectancy and declining mortality in the 21st century, in many places, mostly in developing countries, adult mortality has increased, in part due to HIV/AIDS or continued abject poverty levels. Moreover, many factors, including behavioural, socio-economic and demographic variables, work simultaneously to impact the risk of mortality. Understanding risk factors of adult mortality is crucial for designing appropriate public health interventions. In this paper we proposed a structured additive two-part random effects regression model for adult mortality data. Our proposal assumed two processes: (i) whether death occurred in the household (prevalence part), and (ii) number of reported deaths, if death did occur (severity part). The proposed model specification therefore consisted of two generalized linear mixed models (GLMM) with correlated random effects that permitted structured and unstructured spatial components at regional level. Specifically, the first part assumed a GLMM with a logistic link and the second part explored a count model following either a Poisson or negative binomial distribution. The model was used to analyse adult mortality data of 25,793 individuals from the 2006/2007 Namibian DHS data. Inference is based on the Bayesian framework with appropriate priors discussed.

  1. Advances in Bayesian Model Based Clustering Using Particle Learning

    SciTech Connect

    Merl, D M

    2009-11-19

    Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al. was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al. point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning, as it is being called, is especially well suited for the analysis of streaming data due to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data arrive, allowing direct quantification, at any instant during the observation process, of the uncertainty surrounding the underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original

  2. Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model

    NASA Astrophysics Data System (ADS)

    Al Sobhi, Mashail M.

    2015-02-01

    Bayesian estimation for the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Also, Bayesian prediction bounds for future DGOS from exponentiated Weibull model are obtained. The symmetric and asymmetric loss functions are considered for Bayesian computations. The Markov chain Monte Carlo (MCMC) methods are used for computing the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.

  3. Groupwise structural parcellation of the whole cortex: A logistic random effects model based approach.

    PubMed

    Gallardo, Guillermo; Wells, William; Deriche, Rachid; Wassermann, Demian

    2017-02-01

    Current theories hold that brain function is highly related to long-range physical connections through axonal bundles, namely extrinsic connectivity. However, obtaining a groupwise cortical parcellation based on extrinsic connectivity remains challenging. Current parcellation methods are computationally expensive, need tuning of several parameters, or rely on ad-hoc constraints. Furthermore, none of these methods presents a model of the extrinsic connectivity of the cortex. To tackle these problems, we propose a parsimonious model for the extrinsic connectivity and an efficient parcellation technique based on clustering of tractograms. Our technique allows the creation of single subject and groupwise parcellations of the whole cortex. The parcellations obtained with our technique are in agreement with structural and functional parcellations in the literature. In particular, the motor and sensory cortex are subdivided in agreement with the human homunculus of Penfield. We illustrate this by comparing our resulting parcels with the motor strip mapping included in the Human Connectome Project data. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Inversion of hierarchical Bayesian models using Gaussian processes.

    PubMed

    Lomakina, Ekaterina I; Paliwal, Saee; Diaconescu, Andreea O; Brodersen, Kay H; Aponte, Eduardo A; Buhmann, Joachim M; Stephan, Klaas E

    2015-09-01

    Over the past decade, computational approaches to neuroimaging have increasingly made use of hierarchical Bayesian models (HBMs), either for inferring on physiological mechanisms underlying fMRI data (e.g., dynamic causal modelling, DCM) or for deriving computational trajectories (from behavioural data) which serve as regressors in general linear models. However, an unresolved problem is that standard methods for inverting hierarchical Bayesian models are either very slow, e.g., Markov chain Monte Carlo (MCMC) methods, or are vulnerable to local minima in non-convex optimisation problems, such as variational Bayes (VB). This article considers Gaussian process optimisation (GPO) as an alternative approach for global optimisation of sufficiently smooth and efficiently evaluable objective functions. GPO avoids being trapped in local extrema and can be computationally much more efficient than MCMC. Here, we examine the benefits of GPO for inverting HBMs commonly used in neuroimaging, including DCM for fMRI and the Hierarchical Gaussian Filter (HGF). Importantly, to achieve computational efficiency despite high-dimensional optimisation problems, we introduce a novel combination of GPO and local gradient-based search methods. The utility of this GPO implementation for DCM and HGF is evaluated against MCMC and VB, using both synthetic data from simulations and empirical data. Our results demonstrate that GPO provides parameter estimates with equivalent or better accuracy than the other techniques, but at a fraction of the computational cost required for MCMC. We anticipate that GPO will prove useful for robust and efficient inversion of high-dimensional and nonlinear models of neuroimaging data. Copyright © 2015. Published by Elsevier Inc.
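
    A compact illustration of Gaussian process optimisation with an expected-improvement acquisition, on a cheap 1-D stand-in for the expensive objectives discussed above; the kernel, length-scale, and objective are all hypothetical, and the paper's novel GPO-plus-gradient combination is not reproduced here.

        import numpy as np
        from scipy.stats import norm

        def kern(a, b, ls=0.3):
            # squared-exponential kernel
            return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

        def objective(x):
            # stand-in for an expensive, smooth objective (e.g., a log-evidence)
            return -np.sin(3 * x) - x**2 + 0.7 * x

        X = [-1.0, 0.0, 1.5]                    # initial design points
        y = [objective(x) for x in X]
        grid = np.linspace(-2, 2, 400)

        for _ in range(10):
            Xa, ya = np.array(X), np.array(y)
            K = kern(Xa, Xa) + 1e-8 * np.eye(len(Xa))
            Ks = kern(grid, Xa)
            mu = Ks @ np.linalg.solve(K, ya)            # GP posterior mean
            var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
            sd = np.sqrt(np.maximum(var, 1e-12))
            z = (mu - ya.max()) / sd
            ei = (mu - ya.max()) * norm.cdf(z) + sd * norm.pdf(z)
            x_next = grid[np.argmax(ei)]                # most promising point
            X.append(float(x_next))
            y.append(objective(x_next))

        print(max(y), X[int(np.argmax(y))])   # best value found and its input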

  5. Nonparametric Bayesian inference of the microcanonical stochastic block model

    NASA Astrophysics Data System (ADS)

    Peixoto, Tiago P.

    2017-01-01

    A principled approach to characterize the hidden modular structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
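
    As a small companion to the above, the profile log-likelihood of a node partition under a plain Bernoulli SBM can be computed directly; the microcanonical variant described in the paper replaces these estimated block probabilities with hard edge-count constraints. For brevity, this sketch treats the graph as directed with self-loops allowed.

        import numpy as np

        def sbm_loglik(A, labels):
            """Profile log-likelihood of a Bernoulli SBM given a partition."""
            ll = 0.0
            for r in np.unique(labels):
                for s in np.unique(labels):
                    block = A[np.ix_(labels == r, labels == s)]
                    n, e = block.size, block.sum()
                    if 0 < e < n:          # e = 0 or e = n contribute zero
                        p = e / n
                        ll += e * np.log(p) + (n - e) * np.log(1 - p)
            return ll

        A = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 0],
                      [0, 0, 0, 0]])
        print(sbm_loglik(A, np.array([0, 0, 0, 1])))   # one dense block, one isolate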

  6. Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola

    PubMed Central

    Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope

    2010-01-01

    The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under 5, we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person were 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775

  7. A Semiparametric Bayesian Model for Detecting Synchrony Among Multiple Neurons

    PubMed Central

    Shahbaba, Babak; Zhou, Bo; Lan, Shiwei; Ombao, Hernando; Moorman, David; Behseta, Sam

    2015-01-01

    We propose a scalable semiparametric Bayesian model to capture dependencies among multiple neurons by detecting their co-firing (possibly with some lag time) patterns over time. After discretizing time so there is at most one spike in each interval, the resulting sequence of 1's (spike) and 0's (silence) for each neuron is modeled using the logistic function of a continuous latent variable with a Gaussian process prior. For multiple neurons, the corresponding marginal distributions are coupled to their joint probability distribution using a parametric copula model. The advantages of our approach are as follows: the nonparametric component (i.e., the Gaussian process model) provides a flexible framework for modeling the underlying firing rates; the parametric component (i.e., the copula model) allows us to make inference regarding both contemporaneous and lagged relationships among neurons; using the copula model, we construct multivariate probabilistic models by separating the modeling of univariate marginal distributions from the modeling of dependence structure among variables; our method is easy to implement using a computationally efficient sampling algorithm that can be easily extended to high dimensional problems. Using simulated data, we show that our approach could correctly capture temporal dependencies in firing rates and identify synchronous neurons. We also apply our model to spike train data obtained from prefrontal cortical areas. PMID:24922500

  8. Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837

    ERIC Educational Resources Information Center

    Levy, Roy

    2014-01-01

    Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…

  9. Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management

    EPA Science Inventory

    A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...

  11. Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum

    2006-01-01

    A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…

  12. Bayesian Statistical Model Checking with Application to Stateflow/Simulink Verification

    DTIC Science & Technology

    2010-01-13

    Bayesian Statistical Model Checking with Application to Stateflow/Simulink Verification. Paolo Zuliani, André Platzer, Edmund M. Clarke. January 13, 2010. Reference: ... Legay, A. Platzer, and P. Zuliani. A Bayesian approach to Model Checking biological systems. In CMSB, volume 5688 of LNCS, pages 218–234, 2009.

  13. A comparison of Bayesian and frequentist model selection methods for factor analysis models.

    PubMed

    Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric

    2017-06-01

    We compare the performances of well-known frequentist model fit indices (MFIs) and several Bayesian model selection criteria (MCC) as tools for cross-loading selection in factor analysis under low to moderate sample sizes, cross-loading sizes, and possible violations of distributional assumptions. The Bayesian criteria considered include the Bayes factor (BF), Bayesian Information Criterion (BIC), Deviance Information Criterion (DIC), a Bayesian leave-one-out with Pareto smoothed importance sampling (LOO-PSIS), and a Bayesian variable selection method using the spike-and-slab prior (SSP; Lu, Chow, & Loken, 2016). Simulation results indicate that of the Bayesian measures considered, the BF and the BIC showed the best balance between true positive rates and false positive rates, followed closely by the SSP. The LOO-PSIS and the DIC showed the highest true positive rates among all the measures considered, but with elevated false positive rates. In comparison, likelihood ratio tests (LRTs) are still the preferred frequentist model comparison tool, except for their higher false positive detection rates compared to the BF, BIC and SSP under violations of distributional assumptions. The root mean squared error of approximation (RMSEA) and the Tucker-Lewis index (TLI) at the conventional cut-off of approximate fit impose much more stringent "penalties" on model complexity under conditions with low cross-loading size, low sample size, and high model complexity compared with the LRTs and all other Bayesian MCC. Nevertheless, they provided a reasonable alternative to the LRTs in cases where the models cannot be readily constructed as nested within each other. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  14. A Bayesian Measurement Error Model for Misaligned Radiographic Data

    SciTech Connect

    Lennox, Kristin P.; Glascoe, Lee G.

    2013-09-06

    An understanding of the inherent variability in micro-computed tomography (micro-CT) data is essential to tasks such as statistical process control and the validation of radiographic simulation tools. The data present unique challenges to variability analysis due to the relatively low resolution of radiographs, and also due to minor variations from run to run which can result in misalignment or magnification changes between repeated measurements of a sample. Positioning changes artificially inflate the variability of the data in ways that mask true physical phenomena. We present a novel Bayesian nonparametric regression model that incorporates both additive and multiplicative measurement error in addition to heteroscedasticity to address this problem. We also use this model to assess the effects of sample thickness and sample position on measurement variability for an aluminum specimen. Supplementary materials for this article are available online.

  15. A Bayesian model of context-sensitive value attribution.

    PubMed

    Rigoli, Francesco; Friston, Karl J; Martinelli, Cristina; Selaković, Mirjana; Shergill, Sukhwinder S; Dolan, Raymond J

    2016-06-22

    Substantial evidence indicates that incentive value depends on an anticipation of rewards within a given context. However, the computations underlying this context sensitivity remain unknown. To address this question, we introduce a normative (Bayesian) account of how rewards map to incentive values. This assumes that the brain inverts a model of how rewards are generated. Key features of our account include (i) an influence of prior beliefs about the context in which rewards are delivered (weighted by their reliability in a Bayes-optimal fashion), (ii) the notion that incentive values correspond to precision-weighted prediction errors, and (iii) contextual information unfolding at different hierarchical levels. This formulation implies that incentive value is intrinsically context-dependent. We provide empirical support for this model by showing that incentive value is influenced by context variability and by hierarchically nested contexts. The perspective we introduce generates new empirical predictions that might help explain psychopathologies, such as addiction.

  16. Modeling the user preference on broadcasting contents using Bayesian networks

    NASA Astrophysics Data System (ADS)

    Kang, Sanggil; Lim, Jeongyeon; Kim, Munchurl

    2004-01-01

    In this paper, we introduce a new supervised learning method of a Bayesian network for user preference models. Unlike other preference models, our method traces the trend of a user preference as time passes. It allows us to do online learning, so we do not need exhaustive data collection. The tracing of the trend can be done by modifying the frequency of attributes in order to force the old preference to be correlated with the current preference, under the assumption that the current preference is correlated with the near-future preference. The objective of our learning method is to force the mutual information to be reinforced by modifying the frequency of the attributes in the old preference by providing weights to the attributes. Along with the mathematical derivation of our learning method, we present experimental results on the learning and reasoning performance for TV genre preference, using a real set of TV program watching history data.
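
    The frequency-modification idea can be sketched as exponentially decaying the old attribute counts before folding in new observations, so that the estimated preference tracks the drift; the decay weight and genre names below are purely illustrative, not the paper's actual weighting scheme.

        from collections import Counter

        def update_counts(old_counts, new_counts, decay=0.8):
            """Down-weight history so recent viewing dominates the estimate."""
            merged = Counter()
            for genre, c in old_counts.items():
                merged[genre] += decay * c
            for genre, c in new_counts.items():
                merged[genre] += c
            return merged

        def preference(counts):
            total = sum(counts.values())
            return {g: c / total for g, c in counts.items()}

        old = Counter(sports=30, news=10)
        recent = Counter(drama=15, news=5)
        print(preference(update_counts(old, recent)))   # drama gains weight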

  17. GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model

    NASA Astrophysics Data System (ADS)

    Takaishi, Tetsuya

    2015-01-01

    The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve a speedup similar to that of CUDA Fortran.
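
    The structure that makes HMC parallelise well is the leapfrog trajectory: long runs of identical gradient updates that map naturally onto GPU threads when many chains run at once. A minimal single-chain CPU sketch on a toy Gaussian target (not the RSV likelihood) is given below.

        import numpy as np

        rng = np.random.default_rng(5)

        def U(q):                 # potential = negative log posterior (toy)
            return 0.5 * np.sum(q**2)

        def grad_U(q):
            return q

        def hmc_step(q, eps=0.1, L=20):
            p = rng.standard_normal(q.size)          # fresh momentum
            qn, pn = q.copy(), p.copy()
            pn -= 0.5 * eps * grad_U(qn)             # leapfrog integration
            for _ in range(L - 1):
                qn += eps * pn
                pn -= eps * grad_U(qn)
            qn += eps * pn
            pn -= 0.5 * eps * grad_U(qn)
            dH = (U(qn) + 0.5 * pn @ pn) - (U(q) + 0.5 * p @ p)
            return qn if np.log(rng.random()) < -dH else q   # Metropolis test

        q = np.zeros(10)
        samples = []
        for _ in range(2000):
            q = hmc_step(q)
            samples.append(q.copy())
        print(np.mean(samples, axis=0)[:3])   # near zero for this toy target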

  18. Modelling categorical covariates in Bayesian disease mapping by partition structures.

    PubMed

    Giudici, P; Knorr-Held, L; Rasser, G

    We consider the problem of mapping the risk from a disease using a series of regional counts of observed and expected cases, and information on potential risk factors. To analyse this problem from a Bayesian viewpoint, we propose a methodology which extends a spatial partition model by including categorical covariate information. Such an extension allows detection of clusters in the residual variation, reflecting further, possibly unobserved, covariates. The methodology is implemented by means of reversible jump Markov chain Monte Carlo sampling. An application is presented in order to illustrate and compare our proposed extensions with a purely spatial partition model. Here we analyse a well-known data set on lip cancer incidence in Scotland. Copyright 2000 John Wiley & Sons, Ltd.

  20. Bayesian theory of probabilistic forecasting via deterministic hydrologic model

    NASA Astrophysics Data System (ADS)

    Krzysztofowicz, Roman

    1999-09-01

    Rational decision making (for flood warning, navigation, or reservoir systems) requires that the total uncertainty about a hydrologic predictand (such as river stage, discharge, or runoff volume) be quantified in terms of a probability distribution, conditional on all available information and knowledge. Hydrologic knowledge is typically embodied in a deterministic catchment model. Fundamentals are presented of a Bayesian forecasting system (BFS) for producing a probabilistic forecast of a hydrologic predictand via any deterministic catchment model. The BFS decomposes the total uncertainty into input uncertainty and hydrologic uncertainty, which are quantified independently and then integrated into a predictive (Bayes) distribution. This distribution results from a revision of a prior (climatic) distribution, is well calibrated, and has a nonnegative ex ante economic value. The BFS is compared with Monte Carlo simulation and the "ensemble forecasting" technique, neither of which can alone produce a probabilistic forecast that meets requirements of rational decision making, but each can serve as a component of the BFS.

  1. Sensitivity of fluvial sediment source apportionment to mixing model assumptions: A Bayesian model comparison

    PubMed Central

    Cooper, Richard J; Krueger, Tobias; Hiscock, Kevin M; Rawlins, Barry G

    2014-01-01

    Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ∼76%), comparison of apportionment estimates reveals varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations. Key points: (1) an OFAT sensitivity analysis of sediment fingerprinting mixing models is conducted; (2) Bayesian models display high sensitivity to error assumptions and structural choices; (3) source apportionment results differ between Bayesian and frequentist approaches.
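
    To make such structural choices concrete, the following is a deliberately minimal Bayesian mixing model in Python: fixed source signatures, an isotropic Gaussian likelihood, and random-walk Metropolis over softmax-parameterized source proportions. All numbers are invented and the error structure is the simplest variant one could formulate; it is a sketch of the model class, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(1)
S = np.array([[10.0, 2.0], [4.0, 8.0], [1.0, 1.0]])  # 3 sources x 2 tracers (made up)
y = np.array([4.0, 4.5])                             # observed sediment geochemistry
sigma = 0.5                                          # assumed measurement s.d.

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def log_post(z):
    # flat prior on z (note: this induces a non-uniform prior on the proportions)
    mu = softmax(z) @ S                              # mixture-predicted tracer values
    return -0.5 * np.sum((y - mu) ** 2) / sigma**2

z = np.zeros(3)
lp = log_post(z)
samples = []
for _ in range(20_000):
    z_prop = z + 0.3 * rng.standard_normal(3)        # random-walk proposal
    lp_prop = log_post(z_prop)
    if np.log(rng.random()) < lp_prop - lp:
        z, lp = z_prop, lp_prop
    samples.append(softmax(z))

print(np.median(samples, axis=0))  # posterior median source proportions
```

    Swapping the prior on the proportions, adding tracer covariance, or letting the source distributions vary in time are exactly the kinds of one-factor-at-a-time perturbations whose consequences the study measures.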

  2. Bayesian Hierarchical Modeling for Big Data Fusion in Soil Hydrology

    NASA Astrophysics Data System (ADS)

    Mohanty, B.; Kathuria, D.; Katzfuss, M.

    2016-12-01

    Soil moisture datasets from remote sensing (RS) platforms (such as SMOS and SMAP) and reanalysis products from land surface models are typically available on a coarse spatial granularity of several square km. Ground-based sensors, on the other hand, provide observations on a finer spatial scale (meter scale or less) but are sparsely available. Soil moisture is affected by high variability due to complex interactions between geologic, topographic, vegetation and atmospheric variables. Hydrologic processes usually occur at a scale of 1 km or less, and therefore spatially ubiquitous and temporally periodic soil moisture products at this scale are required to aid local decision makers in agriculture, weather prediction and reservoir operations. Past literature has largely focused on downscaling RS soil moisture for the small extent of a field or a watershed, and hence the applicability of such products has been limited. The present study employs a spatial Bayesian Hierarchical Model (BHM) to derive soil moisture products at a spatial scale of 1 km for the state of Oklahoma by fusing point-scale Mesonet data and coarse-scale RS data for soil moisture and its auxiliary covariates such as precipitation, topography, soil texture and vegetation. The BHM handles change-of-support problems easily while accurately quantifying the uncertainty arising from measurement errors and imperfect retrieval algorithms. The computational challenge arising from the large number of measurements is tackled by utilizing basis function approaches and likelihood approximations. The BHM can be considered a complex Bayesian extension of traditional geostatistical prediction methods (such as Kriging) for large datasets in the presence of uncertainties.

  3. A Bayesian hierarchical model for categorical data with nonignorable nonresponse.

    PubMed

    Green, Paul E; Park, Taesung

    2003-12-01

    Log-linear models have been shown to be useful for smoothing contingency tables when categorical outcomes are subject to nonignorable nonresponse. A log-linear model can be fit to an augmented data table that includes an indicator variable designating whether subjects are respondents or nonrespondents. Maximum likelihood estimates calculated from the augmented data table are known to suffer from instability due to boundary solutions. Park and Brown (1994, Journal of the American Statistical Association 89, 44-52) and Park (1998, Biometrics 54, 1579-1590) developed empirical Bayes models that tend to smooth estimates away from the boundary. In those approaches, estimates for nonrespondents were calculated using an EM algorithm by maximizing a posterior distribution. As an extension of their earlier work, we develop a Bayesian hierarchical model that incorporates a log-linear model in the prior specification. In addition, due to uncertainty in the variable selection process associated with just one log-linear model, we simultaneously consider a finite number of models using a stochastic search variable selection (SSVS) procedure due to George and McCulloch (1997, Statistica Sinica 7, 339-373). The integration of the SSVS procedure into a Markov chain Monte Carlo (MCMC) sampler is straightforward, and leads to estimates of cell frequencies for the nonrespondents that are averages resulting from several log-linear models. The methods are demonstrated with a data example involving serum creatinine levels of patients who survived renal transplants. A simulation study is conducted to investigate properties of the model.
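
    The SSVS step at the heart of the procedure is the Gibbs update of the inclusion indicators. Below is a minimal sketch for a generic coefficient vector using a spike-and-slab normal mixture in the spirit of George and McCulloch; the tuning constants and example coefficients are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def update_inclusion(beta, tau=0.01, c=100.0, prior_pi=0.5, rng=None):
    """One SSVS Gibbs draw of indicators gamma_j given coefficients beta_j:
    spike: beta_j ~ N(0, tau^2)        (effectively excluded)
    slab:  beta_j ~ N(0, (c*tau)^2)    (included)
    """
    if rng is None:
        rng = np.random.default_rng()
    slab = prior_pi * norm.pdf(beta, 0.0, c * tau)
    spike = (1.0 - prior_pi) * norm.pdf(beta, 0.0, tau)
    prob_include = slab / (slab + spike)
    return rng.random(beta.shape) < prob_include

beta = np.array([0.02, 1.3, -0.8, 0.005])
print(update_inclusion(beta, rng=np.random.default_rng(2)))
# averaging such draws across the chain gives posterior inclusion probabilities
```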

  4. Addressing model structural uncertainty in PUBs via Bayesian approach

    NASA Astrophysics Data System (ADS)

    Prieto, Cristina; Le-Vine, Nataliya; Vitolo, Claudia; Medina, Raúl

    2017-04-01

    A catchment is a complex system where a multitude of interrelated energy, water and vegetation processes occur at different temporal and spatial scales. A rainfall-runoff model is a simplified representation of the system, and serves as a hypothesis about the catchment's inner workings. In predictions for ungauged basins, a common practice is to use a pre-selected model structure for a catchment, even though there is usually no justification for its suitability (due to the lack of observed flows). This research aims to move beyond this 'one size fits all' problem. First, two metrics are proposed to assess the suitability and adequacy of a selected model based on (a) how well the model reproduces regionalised information, and (b) the knowledge gained from considering the model over what is known from regionalisation alone. Second, dominant hydrological mechanisms (to be included in a model) are identified from the regionalised information via a Bayesian approach. Third, available model structures are ranked and weighted based on their skill in reproducing regionalised information, and then used in a multi-model ensemble to provide probabilistic predictions. The methodology is applied to basins in Northern Spain with varied hydroclimatological regimes. The results show that prediction quality is sensitive to model (or ensemble) error, the quality of regionalised information, and the available information content.
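
    The ranking-and-weighting step amounts to Bayesian model averaging. A minimal sketch, assuming hypothetical log evidences for three candidate structures scored against the regionalised information; all numbers are invented.

```python
import numpy as np

# hypothetical log marginal likelihoods of three candidate model structures
log_evidence = np.array([-120.4, -118.9, -125.2])

w = np.exp(log_evidence - log_evidence.max())
w /= w.sum()                                 # posterior model weights

preds = np.array([2.1, 2.6, 1.8])            # each model's flow prediction (made up)
print(w, w @ preds)                          # weights and the ensemble prediction
```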

  5. Diagnosing Hybrid Systems: a Bayesian Model Selection Approach

    NASA Technical Reports Server (NTRS)

    McIlraith, Sheila A.

    2005-01-01

    In this paper we examine the problem of monitoring and diagnosing noisy complex dynamical systems that are modeled as hybrid systems: models of continuous behavior, interleaved with discrete transitions. In particular, we examine continuous systems with embedded supervisory controllers that experience abrupt, partial or full failure of component devices. Building on our previous work in this area (MBCG99; MBCG00), our specific focus in this paper is on the mathematical formulation of the hybrid monitoring and diagnosis task as a Bayesian model tracking algorithm. The nonlinear dynamics of many hybrid systems present challenges to probabilistic tracking. Further, probabilistic tracking of a system for the purposes of diagnosis is problematic because the models of the system corresponding to failure modes are numerous and generally very unlikely. To focus tracking on these unlikely models and to reduce the number of potential models under consideration, we exploit logic-based techniques for qualitative model-based diagnosis to conjecture a limited initial set of consistent candidate models. In this paper we discuss alternative tracking techniques that are relevant to different classes of hybrid systems, focusing specifically on a method for tracking multiple models of nonlinear behavior simultaneously using factored sampling and conditional density propagation. To illustrate and motivate the approach described in this paper, we examine the problem of monitoring and diagnosing NASA's Sprint AERCam, a small spherical robotic camera unit with 12 thrusters that enable both linear and rotational motion.
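
    The factored-sampling tracker referred to above is, in essence, a bootstrap particle filter. The following minimal sketch tracks a scalar nonlinear system from a handful of made-up sensor readings; the dynamics, noise scales, and observations are placeholders, not the AERCam models.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5_000
particles = rng.normal(0.0, 1.0, N)          # initial state hypotheses

def dynamics(x, rng):
    # stand-in nonlinear transition with process noise
    return 0.9 * x + 0.5 * np.sin(x) + 0.2 * rng.standard_normal(x.shape)

for y in [0.3, 0.5, 1.9, 2.1]:               # made-up sensor readings
    particles = dynamics(particles, rng)                  # predict
    w = np.exp(-0.5 * (y - particles) ** 2 / 0.4 ** 2)    # weight by likelihood
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]     # resample (factored sampling)
    print(particles.mean())                               # tracked state estimate
```

    Running several such filters in parallel, one per candidate failure-mode model, and comparing how well each explains the observations is the multiple-model tracking idea the abstract describes.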

  6. Fast Bayesian parameter estimation for stochastic logistic growth models.

    PubMed

    Heydari, Jonathan; Lawless, Conor; Lydall, David A; Wilkinson, Darren J

    2014-08-01

    The transition density of a stochastic, logistic population growth model with multiplicative intrinsic noise is analytically intractable. Inferring model parameter values by fitting such stochastic differential equation (SDE) models to data therefore requires relatively slow numerical simulation. Where such simulation is prohibitively slow, an alternative is to use model approximations which do have an analytically tractable transition density, enabling fast inference. We introduce two such approximations, with either multiplicative or additive intrinsic noise, each derived from the linear noise approximation (LNA) of a logistic growth SDE. After Bayesian inference we find that our fast LNA models, using Kalman filter recursion for computation of marginal likelihoods, give similar posterior distributions to slow, arbitrarily exact models. We also demonstrate that simulations from our LNA models better describe the characteristics of the stochastic logistic growth models than a related approach. Finally, we demonstrate that our LNA model with additive intrinsic noise and measurement error best describes an example set of longitudinal observations of microbial population size taken from a typical, genome-wide screening experiment. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
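
    The computational payoff of the LNA is that the approximate model is linear-Gaussian, so its marginal likelihood comes exactly from the Kalman filter's prediction-error decomposition. A minimal scalar sketch is shown below, assuming a stand-in AR(1) state observed with additive noise rather than the paper's actual LNA system matrices.

```python
import numpy as np

def kalman_loglik(y, a, q, r, m0=0.0, p0=1.0):
    """Log marginal likelihood of observations y under the linear-Gaussian model
    x_t = a*x_{t-1} + N(0, q),  y_t = x_t + N(0, r)."""
    m, p, ll = m0, p0, 0.0
    for yt in y:
        m, p = a * m, a * a * p + q            # predict
        s = p + r                              # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * s) + (yt - m) ** 2 / s)
        k = p / s                              # Kalman gain
        m, p = m + k * (yt - m), (1 - k) * p   # update
    return ll

print(kalman_loglik([0.2, 0.5, 0.4, 0.9], a=0.8, q=0.1, r=0.2))
# in a Bayesian fit, this is evaluated at each proposed parameter value
```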

  7. Bayesian network models for error detection in radiotherapy plans

    NASA Astrophysics Data System (ADS)

    Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.

    2015-04-01

    The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5-year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts’ performance (AUC of 0.90 ± 0.01) shows that the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.

  8. Bayesian network models for error detection in radiotherapy plans.

    PubMed

    Kalet, Alan M; Gennari, John H; Ford, Eric C; Phillips, Mark H

    2015-04-07

    The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network's conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5-year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts' performance (AUC of 0.90 ± 0.01) shows that the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
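
    In miniature, the flagging logic described in the two records above reduces to evaluating the propagated conditional probability of an observed plan parameter and comparing it with a threshold. A toy sketch with an invented conditional probability table follows; a real network has many interdependent nodes with distributions learned from clinical data, as in the study.

```python
# hypothetical CPT: P(prescribed dose band | treatment site)
cpt = {
    ("lung",  "60Gy"): 0.70, ("lung",  "45Gy"): 0.25, ("lung",  "20Gy"): 0.05,
    ("brain", "60Gy"): 0.30, ("brain", "45Gy"): 0.20, ("brain", "20Gy"): 0.50,
}

def flag(site, dose, threshold=0.10):
    """Flag a plan parameter whose conditional probability is suspiciously low."""
    p = cpt.get((site, dose), 0.0)
    return p < threshold, p

print(flag("lung", "20Gy"))   # (True, 0.05): unusual combination, review the plan
```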

  9. A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution.

    PubMed

    Harrison, Xavier A

    2015-01-01

    Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data
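
    For readers wanting to reproduce the basic setup, Beta-Binomial overdispersed proportion data of the kind simulated here can be generated as below; the mean proportion, correlation, and sample sizes are arbitrary choices, not those of the paper's simulation design.

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials, n_obs = 20, 500
p_mean, rho = 0.3, 0.1                      # mean proportion, intra-class correlation

# Beta-Binomial: per-observation success probabilities drawn from a Beta
a = p_mean * (1 - rho) / rho
b = (1 - p_mean) * (1 - rho) / rho
p_i = rng.beta(a, b, n_obs)
y = rng.binomial(n_trials, p_i)

# variance inflation relative to a plain Binomial with the same mean
binomial_var = n_trials * p_mean * (1 - p_mean)
print(y.var() / binomial_var)               # > 1 signals overdispersion
```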

  10. A Bayesian Attractor Model for Perceptual Decision Making

    PubMed Central

    Bitzer, Sebastian; Bruineberg, Jelle; Kiebel, Stefan J.

    2015-01-01

    Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks. PMID:26267143

  11. Bayesian analysis of a reduced-form air quality model.

    PubMed

    Foley, Kristen M; Reich, Brian J; Napelenok, Sergey L

    2012-07-17

    Numerical air quality models are being used for assessing emission control strategies for improving ambient pollution levels across the globe. This paper applies probabilistic modeling to evaluate the effectiveness of emission reduction scenarios aimed at lowering ground-level ozone concentrations. A Bayesian hierarchical model is used to combine air quality model output and monitoring data in order to characterize the impact of emissions reductions while accounting for different degrees of uncertainty in the modeled emissions inputs. The probabilistic model predictions are weighted based on population density in order to better quantify the societal benefits/disbenefits of four hypothetical emission reduction scenarios in which domain-wide NO_x emissions from various sectors are reduced individually and then simultaneously. Cross validation analysis shows the statistical model performs well compared to observed ozone levels. Accounting for the variability and uncertainty in the emissions and atmospheric systems being modeled is shown to impact how emission reduction scenarios would be ranked, compared to standard methodology.

  12. Assimilating multi-source uncertainties of a parsimonious conceptual hydrological model using hierarchical Bayesian modeling

    Treesearch

    Wei Wu; James Clark; James Vose

    2010-01-01

    Hierarchical Bayesian (HB) modeling allows for multiple sources of uncertainty by factoring complex relationships into conditional distributions that can be used to draw inference and make predictions. We applied an HB model to estimate the parameters and state variables of a parsimonious hydrological model – GR4J – by coherently assimilating the uncertainties from the...

  13. A Bayesian modelling framework for tornado occurrences in North America.

    PubMed

    Cheng, Vincent Y S; Arhonditsis, George B; Sills, David M L; Gough, William A; Auld, Heather

    2015-03-25

    Tornadoes represent one of nature's most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.

  14. iSEDfit: Bayesian spectral energy distribution modeling of galaxies

    NASA Astrophysics Data System (ADS)

    Moustakas, John

    2017-08-01

    iSEDfit uses Bayesian inference to extract the physical properties of galaxies from their observed broadband photometric spectral energy distribution (SED). In its default mode, the inputs to iSEDfit are the measured photometry (fluxes and corresponding inverse variances) and a measurement of the galaxy redshift. Alternatively, iSEDfit can be used to estimate photometric redshifts from the input photometry alone. After the priors have been specified, iSEDfit calculates the marginalized posterior probability distributions for the physical parameters of interest, including the stellar mass, star-formation rate, dust content, star formation history, and stellar metallicity. iSEDfit also optionally computes K-corrections and produces multiple "quality assurance" (QA) plots at each stage of the modeling procedure to aid in the interpretation of the prior parameter choices and subsequent fitting results. The software is distributed as part of the impro IDL suite.

  15. A Bayesian modelling framework for tornado occurrences in North America

    NASA Astrophysics Data System (ADS)

    Cheng, Vincent Y. S.; Arhonditsis, George B.; Sills, David M. L.; Gough, William A.; Auld, Heather

    2015-03-01

    Tornadoes represent one of nature’s most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.

  16. Regional Frequency Analysis Based on Scaling Properties and Bayesian Models

    NASA Astrophysics Data System (ADS)

    Kwon, Hyun-Han; Lee, Jeong-Ju; Moon, Young-Il

    2010-05-01

    A regional frequency analysis based on a Hierarchical Bayesian Network (HBN) and scaling theory was developed. Many recording rain gauges over South Korea were used for the analysis. First, a scaling approach combined with an extreme value distribution was employed to derive a regional formula for frequency analysis. Second, the HBN model was used to represent additional information about the regional structure of the scaling parameters, especially the location and shape parameters. The location and shape parameters of the extreme value distribution were estimated by utilizing scaling properties in a regression framework, and the scaling parameters linking these parameters to the various duration times were estimated simultaneously. It was found that regional frequency analysis combining the HBN and scaling properties shows promising results in terms of establishing regional IDF curves.

  17. Designing and testing inflationary models with Bayesian networks

    SciTech Connect

    Price, Layne C.; Peiris, Hiranya V.; Frazer, Jonathan; Easther, Richard

    2016-02-01

    Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N_f-quadratic inflation as an illustrative example, finding that the number of e-folds N_* between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.

  18. Antibiotic resistance in hospitals: a ward-specific random effect model in a low antibiotic consumption environment.

    PubMed

    Aldrin, Magne; Raastad, Ragnhild; Tvete, Ingunn Fride; Berild, Dag; Frigessi, Arnoldo; Leegaard, Truls; Monnet, Dominique L; Walberg, Mette; Müller, Fredrik

    2013-04-15

    Association between previous antibiotic use and emergence of antibiotic resistance has been reported for several microorganisms. The relationship has been extensively studied, and although the causes of antibiotic resistance are multi-factorial, clear evidence of antibiotic use as a major risk factor exists. Most studies are carried out in countries with high consumption of antibiotics and correspondingly high levels of antibiotic resistance, and little is currently known about whether, and at what level, the associations are detectable in a low antibiotic consumption environment. We conduct an ecological, retrospective study aimed at determining the impact of antibiotic consumption on antibiotic-resistant Pseudomonas aeruginosa in three hospitals in Norway, a country with low levels of antibiotic use. We construct a sophisticated statistical model to capture such weak signals. To reduce noise, we conduct our study at hospital ward level. We propose a random effect Poisson or binomial regression model, with a reparametrisation that allows us to reduce the number of parameters. Inference is likelihood based. Through scenario simulation, we study the potential effects of reduced or increased antibiotic use. Results clearly indicate that the effects of consumption on resistance are present under conditions of relatively low use of antibiotic agents. This strengthens the recommendation on prudent use of antibiotics, even when consumption is relatively low. Copyright © 2012 John Wiley & Sons, Ltd.
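
    A minimal generative sketch of a ward-specific random effect Poisson model of the kind described, with invented effect sizes and consumption data; the authors' actual model adds a reparametrisation and further covariates, and is fitted by likelihood rather than simulated forward.

```python
import numpy as np

rng = np.random.default_rng(6)
n_wards, n_months = 30, 24

ward_effect = rng.normal(0.0, 0.3, n_wards)              # ward-level random effects
consumption = rng.gamma(2.0, 1.0, (n_wards, n_months))   # antibiotic use (made-up units)
beta = 0.05                                              # assumed effect of use on resistance

# expected resistant isolates per ward-month, log link
log_mu = -1.0 + beta * consumption + ward_effect[:, None]
y = rng.poisson(np.exp(log_mu))

print(y.mean(), y.var())  # scenario simulation = rerun with perturbed `consumption`
```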

  19. Bayesian analysis of input uncertainty in hydrological modeling: 2. Application

    NASA Astrophysics Data System (ADS)

    Kavetski, Dmitri; Kuczera, George; Franks, Stewart W.

    2006-03-01

    The Bayesian total error analysis (BATEA) methodology directly addresses both input and output errors in hydrological modeling, requiring the modeler to make explicit, rather than implicit, assumptions about the likely extent of data uncertainty. This study considers a BATEA assessment of two North American catchments: (1) French Broad River and (2) Potomac basins. It assesses the performance of the conceptual Variable Infiltration Capacity (VIC) model with and without accounting for input (precipitation) uncertainty. The results show the considerable effects of precipitation errors on the predicted hydrographs (especially the prediction limits) and on the calibrated parameters. In addition, the performance of BATEA in the presence of severe model errors is analyzed. While BATEA allows a very direct treatment of input uncertainty and yields some limited insight into model errors, it requires the specification of valid error models, which are currently poorly understood and require further work. Moreover, it leads to computationally challenging, high-dimensional problems. For some types of models, including the VIC implemented using robust numerical methods, the computational cost of BATEA can be reduced using Newton-type methods.

  20. Bridging groundwater models and decision support with a Bayesian network

    USGS Publications Warehouse

    Fienen, Michael N.; Masterson, John P.; Plant, Nathaniel G.; Gutierrez, Benjamin T.; Thieler, E. Robert

    2013-01-01

    Resource managers need to make decisions to plan for future environmental conditions, particularly sea level rise, in the face of substantial uncertainty. Many interacting processes factor into the decisions they face. Advances in process models and the quantification of uncertainty have made models a valuable tool for this purpose. Long simulation runtimes and, often, numerical instability make linking process models impractical in many cases. A method for emulating the important connections between model input and forecasts, while propagating uncertainty, has the potential to provide a bridge between complicated numerical process models and the efficiency and stability needed for decision making. We explore this using a Bayesian network (BN) to emulate a groundwater flow model. We expand on previous approaches to validating a BN by calculating forecasting skill using cross validation of a groundwater model of Assateague Island in Virginia and Maryland, USA. This BN emulation was shown to capture the important groundwater-flow characteristics and uncertainty of the groundwater system because of its connection to island morphology and sea level. Forecast power metrics associated with the validation of multiple alternative BN designs guided the selection of an optimal level of BN complexity. Assateague Island is an ideal test case for exploring a forecasting tool based on current conditions because the unique hydrogeomorphological variability of the island includes a range of settings indicative of past, current, and future conditions. The resulting BN is a valuable tool for exploring the response of groundwater conditions to sea level rise in decision support.

  1. Hunting down the best model of inflation with Bayesian evidence

    SciTech Connect

    Martin, Jerome; Ringeval, Christophe; Trotta, Roberto

    2011-03-15

    We present the first calculation of the Bayesian evidence for different prototypical single field inflationary scenarios, including representative classes of small field and large field models. This approach allows us to compare inflationary models in a well-defined statistical way and to determine the current 'best model of inflation'. The calculation is performed numerically by interfacing the inflationary code FieldInf with MultiNest. We find that small field models are currently preferred, while large field models having a self-interacting potential of power p>4 are strongly disfavored. The class of small field models as a whole has posterior odds of approximately 3:1 when compared with the large field class. The methodology and results presented in this article are an additional step toward the construction of a full numerical pipeline to constrain the physics of the early Universe with astrophysical observations. More accurate data (such as the Planck data) and the techniques introduced here should allow us to identify conclusively the best inflationary model.

  2. Toward diagnostic model calibration and evaluation: Approximate Bayesian computation

    NASA Astrophysics Data System (ADS)

    Vrugt, Jasper A.; Sadegh, Mojtaba

    2013-07-01

    The ever-increasing pace of computational power, along with continued advances in measurement technologies and improvements in process understanding, has stimulated the development of increasingly complex hydrologic models that simulate soil moisture flow, groundwater recharge, surface runoff, root water uptake, and river discharge at different spatial and temporal scales. Reconciling these high-order system models with perpetually larger volumes of field data is becoming more and more difficult, particularly because classical likelihood-based fitting methods lack the power to detect and pinpoint deficiencies in the model structure. Gupta et al. (2008) have recently proposed steps (amongst others) toward the development of a more robust and powerful method of model evaluation. Their diagnostic approach uses signature behaviors and patterns observed in the input-output data to illuminate to what degree a representation of the real world has been adequately achieved and how the model should be improved for the purpose of learning and scientific discovery. In this paper, we introduce approximate Bayesian computation (ABC) as a vehicle for diagnostic model evaluation. This statistical methodology relaxes the need for an explicit likelihood function in favor of one or multiple different summary statistics rooted in hydrologic theory that together have a clearer and more compelling diagnostic power than some average measure of the size of the error residuals. Two illustrative case studies are used to demonstrate that ABC is relatively easy to implement, and readily employs signature-based indices to analyze and pinpoint which part of the model is malfunctioning and in need of further improvement.
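
    The core of ABC is easy to state in code. A minimal rejection-ABC sketch follows, using the sample mean and standard deviation as stand-ins for hydrologic signature statistics; the data, prior, tolerance, and "model" are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
y_obs = rng.normal(3.0, 1.0, 100)                  # pretend field observations
s_obs = np.array([y_obs.mean(), y_obs.std()])      # signature statistics

accepted = []
for _ in range(50_000):
    theta = rng.uniform(0.0, 10.0)                 # draw from the prior
    y_sim = rng.normal(theta, 1.0, 100)            # run the "model"
    s_sim = np.array([y_sim.mean(), y_sim.std()])
    if np.linalg.norm(s_sim - s_obs) < 0.2:        # tolerance on the signatures
        accepted.append(theta)

print(np.mean(accepted), np.std(accepted))         # approximate posterior summary
```

    Which signatures fail to match indicates which part of the process representation is deficient, which is the diagnostic point of the paper.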

  3. Bridging groundwater models and decision support with a Bayesian network

    NASA Astrophysics Data System (ADS)

    Fienen, Michael N.; Masterson, John P.; Plant, Nathaniel G.; Gutierrez, Benjamin T.; Thieler, E. Robert

    2013-10-01

    Resource managers need to make decisions to plan for future environmental conditions, particularly sea level rise, in the face of substantial uncertainty. Many interacting processes factor in to the decisions they face. Advances in process models and the quantification of uncertainty have made models a valuable tool for this purpose. Long-simulation runtimes and, often, numerical instability make linking process models impractical in many cases. A method for emulating the important connections between model input and forecasts, while propagating uncertainty, has the potential to provide a bridge between complicated numerical process models and the efficiency and stability needed for decision making. We explore this using a Bayesian network (BN) to emulate a groundwater flow model. We expand on previous approaches to validating a BN by calculating forecasting skill using cross validation of a groundwater model of Assateague Island in Virginia and Maryland, USA. This BN emulation was shown to capture the important groundwater-flow characteristics and uncertainty of the groundwater system because of its connection to island morphology and sea level. Forecast power metrics associated with the validation of multiple alternative BN designs guided the selection of an optimal level of BN complexity. Assateague island is an ideal test case for exploring a forecasting tool based on current conditions because the unique hydrogeomorphological variability of the island includes a range of settings indicative of past, current, and future conditions. The resulting BN is a valuable tool for exploring the response of groundwater conditions to sea level rise in decision support.

  4. EFFICIENT MODEL-FITTING AND MODEL-COMPARISON FOR HIGH-DIMENSIONAL BAYESIAN GEOSTATISTICAL MODELS. (R826887)

    EPA Science Inventory

    Geostatistical models are appropriate for spatially distributed data measured at irregularly spaced locations. We propose an efficient Markov chain Monte Carlo (MCMC) algorithm for fitting Bayesian geostatistical models with substantial numbers of unknown parameters to sizable...

  6. Using hierarchical Bayesian binary probit models to analyze crash injury severity on high speed facilities with real-time traffic data.

    PubMed

    Yu, Rongjie; Abdel-Aty, Mohamed

    2014-01-01

    Severe crashes cause serious social and economic loss, and reducing crash injury severity has therefore become one of the key objectives of high-speed facility (freeway and expressway) management. Traditional crash injury severity analysis mainly utilized data from crash reports concerning crash occurrence information, driver characteristics and roadway geometry related variables. In this study, real-time traffic and weather data were introduced to analyze crash injury severity. The space mean speeds captured by the Automatic Vehicle Identification (AVI) systems on two roadways were used as explanatory variables; data from a mountainous freeway (I-70 in Colorado) and an urban expressway (State Road 408 in Orlando) were used to check the consistency of the analysis results. Binary probit (BP) models were estimated to classify non-severe (property damage only) and severe (injury and fatality) crashes. First, results from Bayesian BP models were compared with those from maximum likelihood estimation BP models, and it was concluded that Bayesian inference was superior, yielding more significant variables. Then, hierarchical Bayesian BP models of different levels were developed, with random effects accounting for unobserved heterogeneity at the segment level and the individual crash level, respectively. Modeling results from both locations demonstrate that large variations in speed prior to crash occurrence increase the likelihood of a severe crash. Moreover, by considering unobserved heterogeneity, the goodness-of-fit of the Bayesian BP models improved substantially. Finally, possible future applications of the model results and the hierarchical Bayesian probit models are discussed.

  7. Analyzing large-scale conservation interventions with Bayesian hierarchical models: a case study of supplementing threatened Pacific salmon

    PubMed Central

    Scheuerell, Mark D; Buhle, Eric R; Semmens, Brice X; Ford, Michael J; Cooney, Tom; Carmichael, Richard W

    2015-01-01

    Myriad human activities increasingly threaten the existence of many species. A variety of conservation interventions such as habitat restoration, protected areas, and captive breeding have been used to prevent extinctions. Evaluating the effectiveness of these interventions requires appropriate statistical methods, given the quantity and quality of available data. Historically, analysis of variance has been used with some form of predetermined before-after control-impact design to estimate the effects of large-scale experiments or conservation interventions. However, ad hoc retrospective study designs or the presence of random effects at multiple scales may preclude the use of these tools. We evaluated the effects of a large-scale supplementation program on the density of adult Chinook salmon Oncorhynchus tshawytscha from the Snake River basin in the northwestern United States, currently listed under the U.S. Endangered Species Act. We analyzed 43 years of data from 22 populations, accounting for random effects across time and space using a form of Bayesian hierarchical time-series model common in analyses of financial markets. We found that varying degrees of supplementation over a period of 25 years increased the density of natural-origin adults, on average, by 0–8% relative to nonsupplementation years. Thirty-nine of the 43 yearly effects were at least two times larger in magnitude than the mean supplementation effect, suggesting that common environmental variables play a more important role in driving interannual variability in adult density. Additional residual variation in density varied considerably across the region, but there was no systematic difference between supplemented and reference populations. Our results demonstrate the power of hierarchical Bayesian models to detect the diffuse effects of management interventions and to quantitatively describe the variability of intervention success. Nevertheless, our study could not address whether ecological

  8. A Bayesian Reformulation of the Extended Drift-Diffusion Model in Perceptual Decision Making

    PubMed Central

    Fard, Pouyan R.; Park, Hame; Warkentin, Andrej; Kiebel, Stefan J.; Bitzer, Sebastian

    2017-01-01

    Perceptual decision making can be described as a process of accumulating evidence to a bound which has been formalized within drift-diffusion models (DDMs). Recently, an equivalent Bayesian model has been proposed. In contrast to standard DDMs, this Bayesian model directly links information in the stimulus to the decision process. Here, we extend this Bayesian model further and allow inter-trial variability of two parameters following the extended version of the DDM. We derive parameter distributions for the Bayesian model and show that they lead to predictions that are qualitatively equivalent to those made by the extended drift-diffusion model (eDDM). Further, we demonstrate the usefulness of the extended Bayesian model (eBM) for the analysis of concrete behavioral data. Specifically, using Bayesian model selection, we find evidence that including additional inter-trial parameter variability provides for a better model, when the model is constrained by trial-wise stimulus features. This result is remarkable because it was derived using just 200 trials per condition, which is typically thought to be insufficient for identifying variability parameters in DDMs. In sum, we present a Bayesian analysis, which provides for a novel and promising analysis of perceptual decision making experiments. PMID:28553219

  9. A Bayesian Reformulation of the Extended Drift-Diffusion Model in Perceptual Decision Making.

    PubMed

    Fard, Pouyan R; Park, Hame; Warkentin, Andrej; Kiebel, Stefan J; Bitzer, Sebastian

    2017-01-01

    Perceptual decision making can be described as a process of accumulating evidence to a bound which has been formalized within drift-diffusion models (DDMs). Recently, an equivalent Bayesian model has been proposed. In contrast to standard DDMs, this Bayesian model directly links information in the stimulus to the decision process. Here, we extend this Bayesian model further and allow inter-trial variability of two parameters following the extended version of the DDM. We derive parameter distributions for the Bayesian model and show that they lead to predictions that are qualitatively equivalent to those made by the extended drift-diffusion model (eDDM). Further, we demonstrate the usefulness of the extended Bayesian model (eBM) for the analysis of concrete behavioral data. Specifically, using Bayesian model selection, we find evidence that including additional inter-trial parameter variability provides for a better model, when the model is constrained by trial-wise stimulus features. This result is remarkable because it was derived using just 200 trials per condition, which is typically thought to be insufficient for identifying variability parameters in DDMs. In sum, we present a Bayesian analysis, which provides for a novel and promising analysis of perceptual decision making experiments.

  10. Hierarchical Bayesian modeling of heterogeneous variances in average daily weight gain of commercial feedlot cattle.

    PubMed

    Cernicchiaro, N; Renter, D G; Xiang, S; White, B J; Bello, N M

    2013-06-01

    Variability in ADG of feedlot cattle can affect profits, thus making overall returns more unstable. Hence, knowledge of the factors that contribute to heterogeneity of variances in animal performance can help feedlot managers evaluate risks and minimize profit volatility when making managerial and economic decisions in commercial feedlots. The objectives of the present study were to evaluate heteroskedasticity, defined as heterogeneity of variances, in ADG of cohorts of commercial feedlot cattle, and to identify cattle demographic factors at feedlot arrival as potential sources of variance heterogeneity, accounting for cohort- and feedlot-level information in the data structure. An operational dataset compiled from 24,050 cohorts from 25 U. S. commercial feedlots in 2005 and 2006 was used for this study. Inference was based on a hierarchical Bayesian model implemented with Markov chain Monte Carlo, whereby cohorts were modeled at the residual level and feedlot-year clusters were modeled as random effects. Forward model selection based on deviance information criteria was used to screen potentially important explanatory variables for heteroskedasticity at cohort- and feedlot-year levels. The Bayesian modeling framework was preferred as it naturally accommodates the inherently hierarchical structure of feedlot data whereby cohorts are nested within feedlot-year clusters. Evidence for heterogeneity of variance components of ADG was substantial and primarily concentrated at the cohort level. Feedlot-year specific effects were, by far, the greatest contributors to ADG heteroskedasticity among cohorts, with an estimated ∼12-fold change in dispersion between most and least extreme feedlot-year clusters. In addition, identifiable demographic factors associated with greater heterogeneity of cohort-level variance included smaller cohort sizes, fewer days on feed, and greater arrival BW, as well as feedlot arrival during summer months. These results support that

  11. Bayesian Multiscale Modeling of Closed Curves in Point Clouds.

    PubMed

    Gu, Kelvin; Pati, Debdeep; Dunson, David B

    2014-10-01

    Modeling object boundaries based on image or point cloud data is frequently necessary in medical and scientific applications ranging from detecting tumor contours for targeted radiation therapy, to the classification of organisms based on their structural information. In low-contrast images or sparse and noisy point clouds, there is often insufficient data to recover local segments of the boundary in isolation. Thus, it becomes critical to model the entire boundary in the form of a closed curve. To achieve this, we develop a Bayesian hierarchical model that expresses highly diverse 2D objects in the form of closed curves. The model is based on a novel multiscale deformation process. By relating multiple objects through a hierarchical formulation, we can successfully recover missing boundaries by borrowing structural information from similar objects at the appropriate scale. Furthermore, the model's latent parameters help interpret the population, indicating dimensions of significant structural variability and also specifying a 'central curve' that summarizes the collection. Theoretical properties of our prior are studied in specific cases and efficient Markov chain Monte Carlo methods are developed, evaluated through simulation examples and applied to panorex teeth images for modeling teeth contours and also to a brain tumor contour detection problem.

  12. Bayesian Network Webserver: a comprehensive tool for biological network modeling.

    PubMed

    Ziebarth, Jesse D; Bhattacharya, Anindya; Cui, Yan

    2013-11-01

    The Bayesian Network Webserver (BNW) is a platform for comprehensive network modeling of systems genetics and other biological datasets. It allows users to quickly and seamlessly upload a dataset, learn the structure of the network model that best explains the data and use the model to understand relationships between network variables. Many datasets, including those used to create genetic network models, contain both discrete (e.g. genotype) and continuous (e.g. gene expression traits) variables, and BNW allows for modeling hybrid datasets. Users of BNW can incorporate prior knowledge during structure learning through an easy-to-use structural constraint interface. After structure learning, users are immediately presented with an interactive network model, which can be used to make testable hypotheses about network relationships. BNW, including a downloadable structure learning package, is available at http://compbio.uthsc.edu/BNW. (The BNW interface for adding structural constraints uses HTML5 features that are not supported by the current version of Internet Explorer. We suggest using other browsers (e.g. Google Chrome or Mozilla Firefox) when accessing BNW). ycui2@uthsc.edu. Supplementary data are available at Bioinformatics online.

  13. A Bayesian model for estimating multi-state disease progression

    PubMed Central

    Shen, Shiwen; Han, Simon X.; Petousis, Panayiotis; Weiss, Robert E.; Meng, Frank; Bui, Alex A.T.; Hsu, William

    2017-01-01

    A growing number of individuals who are considered at high risk of cancer are now routinely undergoing population screening. However, noted harms such as radiation exposure, overdiagnosis, and overtreatment underscore the need for better temporal models that predict who should be screened and at what frequency. The mean sojourn time (MST), the average duration for which a tumor can be detected by imaging but shows no observable clinical symptoms, is a critical variable for formulating screening policy. Estimation of the MST has long been studied using continuous Markov models (CMMs) with maximum likelihood estimation (MLE). However, many traditional methods assume no observation error in the imaging data, which is unrealistic and can bias the estimation of the MST. In addition, the MLE may not be stably estimated when data are sparse. Addressing these shortcomings, we present a probabilistic modeling approach for periodic cancer screening data. We first model the cancer state transition using a three-state CMM, while simultaneously considering observation error. We then jointly estimate the MST and observation error within a Bayesian framework. We also consider the inclusion of covariates to estimate individualized rates of disease progression. Our approach is demonstrated on participants who underwent chest x-ray screening in the National Lung Screening Trial (NLST) and validated using posterior predictive p-values and Pearson’s chi-square test. Our model demonstrates more accurate and sensible estimates of MST in comparison to MLE. PMID:28038345
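
    The CMM machinery behind the MST can be sketched directly: a generator matrix for the three states (healthy, preclinical-but-detectable, clinical), interval transition probabilities via the matrix exponential, and the mean sojourn time as the reciprocal of the preclinical exit rate. The rates below are hypothetical, and the sketch omits the paper's observation-error layer and Bayesian estimation.

```python
import numpy as np
from scipy.linalg import expm

# hypothetical generator: healthy -> preclinical (detectable) -> clinical
Q = np.array([[-0.02,  0.02,  0.00],
              [ 0.00, -0.50,  0.50],
              [ 0.00,  0.00,  0.00]])   # rates per year

P = expm(Q * 2.0)        # transition probabilities over a 2-year screening interval
print(P[0])              # P(healthy -> {healthy, preclinical, clinical} in 2 years)

# mean sojourn time in the preclinical state = -1 / Q[1, 1]
print(-1.0 / Q[1, 1])    # 2.0 years with these made-up rates
```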

  14. A Bayesian model for estimating multi-state disease progression.

    PubMed

    Shen, Shiwen; Han, Simon X; Petousis, Panayiotis; Weiss, Robert E; Meng, Frank; Bui, Alex A T; Hsu, William

    2017-02-01

    A growing number of individuals who are considered at high risk of cancer are now routinely undergoing population screening. However, noted harms such as radiation exposure, overdiagnosis, and overtreatment underscore the need for better temporal models that predict who should be screened and at what frequency. The mean sojourn time (MST), the average duration for which a tumor can be detected by imaging but shows no observable clinical symptoms, is a critical variable for formulating screening policy. Estimation of the MST has long been studied using continuous Markov models (CMMs) with maximum likelihood estimation (MLE). However, many traditional methods assume no observation error in the imaging data, which is unrealistic and can bias the estimation of the MST. In addition, the MLE may not be stably estimated when data are sparse. Addressing these shortcomings, we present a probabilistic modeling approach for periodic cancer screening data. We first model the cancer state transition using a three-state CMM, while simultaneously considering observation error. We then jointly estimate the MST and observation error within a Bayesian framework. We also consider the inclusion of covariates to estimate individualized rates of disease progression. Our approach is demonstrated on participants who underwent chest x-ray screening in the National Lung Screening Trial (NLST) and validated using posterior predictive p-values and Pearson's chi-square test. Our model demonstrates more accurate and sensible estimates of MST in comparison to MLE. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. A Bayesian hierarchical model for wind gust prediction

    NASA Astrophysics Data System (ADS)

    Friederichs, Petra; Oesting, Marco; Schlather, Martin

    2014-05-01

    A postprocessing method for ensemble wind gust forecasts given by a mesoscale limited-area numerical weather prediction (NWP) model is presented, based on extreme value theory. A process layer for the parameters of a generalized extreme value distribution (GEV) is introduced using a Bayesian hierarchical model (BHM). Incorporating the information of the COSMO-DE forecasts, the process parameters model the spatial response surfaces of the GEV parameters as Gaussian random fields. The spatial BHM provides area-wide forecasts of wind gusts in terms of a conditional GEV. It models the marginal distribution of the spatial gust process and provides not only forecasts of the conditional GEV at locations without observations, but also uncertainty information about the estimates. A disadvantage of the BHM is that it assumes conditionally independent observations. In order to incorporate the dependence between gusts at neighboring locations, as well as the spatial random fields of observed and forecasted maximal wind gusts, we propose to model them jointly by a bivariate Brown-Resnick process.
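
    For orientation, a plain (non-spatial, non-Bayesian) GEV fit to synthetic gust maxima is sketched below with scipy; the BHM instead places Gaussian random-field priors on the location, scale, and shape surfaces. All numbers are invented, and note that scipy's genextreme shape parameter has the opposite sign to the usual GEV xi convention.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(7)
gusts = genextreme.rvs(c=-0.1, loc=18.0, scale=4.0, size=200, random_state=rng)

# maximum-likelihood point fit to the synthetic maxima
shape, loc, scale = genextreme.fit(gusts)
print(shape, loc, scale)

# 0.98 quantile, i.e. the 50-event return level under the fitted GEV
print(genextreme.ppf(0.98, shape, loc, scale))
```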

  16. A Bayesian Network Approach to Modeling Learning Progressions and Task Performance. CRESST Report 776

    ERIC Educational Resources Information Center

    West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.

    2010-01-01

    A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…

  17. A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.

    ERIC Educational Resources Information Center

    Glas, Cees A. W.; Meijer, Rob R.

    A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…

  18. Bayesian Methods for Analyzing Structural Equation Models with Covariates, Interaction, and Quadratic Latent Variables

    ERIC Educational Resources Information Center

    Lee, Sik-Yum; Song, Xin-Yuan; Tang, Nian-Sheng

    2007-01-01

    The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates. This approach produces a Bayesian estimate that has the same statistical optimal properties as a…

  19. Bayesian inference for joint modelling of longitudinal continuous, binary and ordinal events.

    PubMed

    Li, Qiuju; Pan, Jianxin; Belcher, John

    2016-12-01

    In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England, involving trivariate responses from each patient: Body Mass Index; Depression (yes/no), ascertained with a cut-off score of at least 8 on the Hospital Anxiety and Depression Scale; and Pain Interference, derived from the Medical Outcomes Study 36-item short-form health survey and recorded on an ordinal 1-5 scale. There are well-established methods for combined continuous and binary, or even continuous and ordinal, responses, but little work has been done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the parameter estimates are smaller and much more stable, leading to more efficient parameter estimation and reliable statistical inference. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analyses, and exhibits other desirable statistical properties. © The Author(s) 2014.
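
    A generative sketch, not the authors' specification, of how a shared subject-level random effect can tie the three outcome types together: an identity link for the continuous response, a logit link for the binary one, and a cumulative-logit latent variable for the ordinal one. All parameter values are illustrative.

        import numpy as np

        rng = np.random.default_rng(4)
        n_subj, n_visits = 200, 4
        u = np.repeat(rng.normal(size=n_subj), n_visits)  # shared random effect
        t = np.tile(np.arange(n_visits), n_subj)          # visit index

        # Continuous outcome (BMI-like), identity link.
        bmi = 27.0 + 0.1 * t + 1.5 * u + rng.normal(scale=2.0, size=u.size)

        # Binary outcome (depression-like), logit link sharing the same u.
        p_dep = 1 / (1 + np.exp(-(-1.0 + 0.8 * u)))
        depressed = (rng.uniform(size=u.size) < p_dep).astype(int)

        # Ordinal outcome (pain interference, 1-5) via a cumulative-logit
        # latent variable with four cutpoints.
        cuts = np.array([-2.0, -0.5, 0.5, 2.0])
        latent = 0.9 * u + rng.logistic(size=u.size)
        pain = 1 + (latent[:, None] > cuts).sum(axis=1)
        print(bmi[:4].round(1), depressed[:4], pain[:4])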

  20. Assessing the consistency of the treatment effect under the discrete random effects model in multiregional clinical trials.

    PubMed

    Liu, Jung-Tzu; Tsou, Hsiao-Hui; Gordon Lan, K K; Chen, Chi-Tian; Lai, Yi-Hsuan; Chang, Wan-Jung; Tzeng, Chyng-Shyan; Hsiao, Chin-Fu

    2016-06-30

    In recent years, developing pharmaceutical products via multiregional clinical trials (MRCTs) has become standard. Traditionally, an MRCT would assume that a treatment effect is uniform across regions. However, heterogeneity among regions may affect the evaluation of a medicine's effect. In this study, we consider a random effects model using a discrete distribution (DREM) to account for heterogeneous treatment effects across regions in the design and evaluation of MRCTs. We derive a power function for demonstrating treatment benefit under the DREM and illustrate determination of the overall sample size in an MRCT. We use the concept of consistency based on Method 2 of the Japanese Ministry of Health, Labour and Welfare's guidance to evaluate the probability of treatment benefit and consistency under the DREM. We further derive an optimal sample size allocation over regions to maximize the power for consistency. Moreover, we provide three algorithms for deriving the sample size at the desired level of power for benefit and consistency. In practice, regional treatment effects are unknown, so we provide some guidelines on the design of MRCTs with consistency when the regional treatment effects are assumed to fall within a specified interval. Numerical examples are given to illustrate applications of the proposed approach. Copyright © 2016 John Wiley & Sons, Ltd.
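
    A simulation sketch of the quantities discussed here, under illustrative effect sizes and design numbers: regional effects are drawn from a discrete distribution (the DREM idea), overall benefit is declared by a one-sided z-test, and consistency is taken, in the spirit of MHLW Method 2, to mean that every observed regional effect is positive.

        import numpy as np

        rng = np.random.default_rng(5)
        n_regions, n_per_arm = 3, 200         # regions, subjects per arm per region
        support = np.array([0.2, 0.3, 0.4])   # discrete support of regional effects
        probs = np.array([0.3, 0.4, 0.3])
        sigma, n_sim = 1.0, 20000

        benefit = consistent = 0
        for _ in range(n_sim):
            delta = rng.choice(support, size=n_regions, p=probs)  # DREM draw
            se = sigma * np.sqrt(2.0 / n_per_arm)                 # SE per region
            d_hat = rng.normal(delta, se)                         # regional estimates
            z = d_hat.mean() / (se / np.sqrt(n_regions))          # overall test
            if z > 1.96:                                          # overall benefit
                benefit += 1
                consistent += np.all(d_hat > 0)                   # Method 2 check
        print("P(declare benefit):", benefit / n_sim)
        print("P(benefit and consistency):", consistent / n_sim)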