Sample records for Bayesian computation approach

  1. Computational Neuropsychology and Bayesian Inference.

    PubMed

    Parr, Thomas; Rees, Geraint; Friston, Karl J

    2018-01-01

    Computational theories of brain function have become very influential in neuroscience. They have facilitated the growth of formal approaches to disease, particularly in psychiatric research. In this paper, we provide a narrative review of the body of computational research addressing neuropsychological syndromes, and focus on those that employ Bayesian frameworks. Bayesian approaches to understanding brain function formulate perception and action as inferential processes. These inferences combine 'prior' beliefs with a generative (predictive) model to explain the causes of sensations. Under this view, neuropsychological deficits can be thought of as false inferences that arise due to aberrant prior beliefs (that are poor fits to the real world). This draws upon the notion of a Bayes optimal pathology - optimal inference with suboptimal priors - and provides a means for computational phenotyping. In principle, any given neuropsychological disorder could be characterized by the set of prior beliefs that would make a patient's behavior appear Bayes optimal. We start with an overview of some key theoretical constructs and use these to motivate a form of computational neuropsychology that relates anatomical structures in the brain to the computations they perform. Throughout, we draw upon computational accounts of neuropsychological syndromes. These are selected to emphasize the key features of a Bayesian approach, and the possible types of pathological prior that may be present. They range from visual neglect through hallucinations to autism. Through these illustrative examples, we review the use of Bayesian approaches to understand the link between biology and computation that is at the heart of neuropsychology.
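
    A minimal sketch of the inference scheme this abstract describes, under toy assumptions (a Gaussian generative model; all names and numbers are illustrative, not the authors' implementation): a 'prior' belief over a hidden cause is combined with a generative model of sensations, and an overly tight, aberrant prior pulls the posterior away from the data.

      import numpy as np

      # Perception as inference: combine a prior belief over a hidden cause
      # with a generative (predictive) model of the observed sensation.
      theta = np.linspace(-5.0, 5.0, 1001)                 # candidate hidden causes
      prior = np.exp(-0.5 * ((theta - 1.0) / 0.5) ** 2)    # tight 'aberrant' prior
      likelihood = np.exp(-0.5 * (-2.0 - theta) ** 2)      # generative model at y = -2

      posterior = prior * likelihood
      posterior /= np.trapz(posterior, theta)              # normalize over the grid
      print("posterior mean:", np.trapz(theta * posterior, theta))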

  2. Computational Neuropsychology and Bayesian Inference

    PubMed Central

    Parr, Thomas; Rees, Geraint; Friston, Karl J.

    2018-01-01

    Computational theories of brain function have become very influential in neuroscience. They have facilitated the growth of formal approaches to disease, particularly in psychiatric research. In this paper, we provide a narrative review of the body of computational research addressing neuropsychological syndromes, and focus on those that employ Bayesian frameworks. Bayesian approaches to understanding brain function formulate perception and action as inferential processes. These inferences combine ‘prior’ beliefs with a generative (predictive) model to explain the causes of sensations. Under this view, neuropsychological deficits can be thought of as false inferences that arise due to aberrant prior beliefs (that are poor fits to the real world). This draws upon the notion of a Bayes optimal pathology – optimal inference with suboptimal priors – and provides a means for computational phenotyping. In principle, any given neuropsychological disorder could be characterized by the set of prior beliefs that would make a patient’s behavior appear Bayes optimal. We start with an overview of some key theoretical constructs and use these to motivate a form of computational neuropsychology that relates anatomical structures in the brain to the computations they perform. Throughout, we draw upon computational accounts of neuropsychological syndromes. These are selected to emphasize the key features of a Bayesian approach, and the possible types of pathological prior that may be present. They range from visual neglect through hallucinations to autism. Through these illustrative examples, we review the use of Bayesian approaches to understand the link between biology and computation that is at the heart of neuropsychology. PMID:29527157

  3. Efficient fuzzy Bayesian inference algorithms for incorporating expert knowledge in parameter estimation

    NASA Astrophysics Data System (ADS)

    Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad

    2016-05-01

    Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert-provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert-provided information, (2) it allows uncertainty and imprecision to be modeled distinctly, and (3) it presents a framework for fusing expert-provided information regarding the various inputs of the Bayesian inference algorithm. However, an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely large and computationally infeasible. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed which uses approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.

  4. Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach

    NASA Technical Reports Server (NTRS)

    Warner, James E.; Hochhalter, Jacob D.

    2016-01-01

    This work presents a computationally-efficient approach for damage determination that quantifies uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modifications are made to the traditional Bayesian inference approach to provide significant computational speedup. First, an efficient surrogate model is constructed using sparse grid interpolation to replace a costly finite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modified using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more efficiently. Numerical examples demonstrate that the proposed framework effectively provides damage estimates with uncertainty quantification and can yield orders of magnitude speedup over standard Bayesian approaches.
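
    The sampling step can be illustrated with a plain random-walk Metropolis loop; this is a hedged sketch only (DRAM adds delayed rejection and proposal adaptation on top of this loop, and the log-posterior below is a stand-in, not the paper's strain-based model).

      import numpy as np

      def metropolis(log_post, theta0, n_steps=5000, step=0.1, seed=0):
          """Random-walk Metropolis: the core loop behind MCMC damage sampling."""
          rng = np.random.default_rng(seed)
          theta = np.asarray(theta0, dtype=float)
          lp = log_post(theta)
          samples = []
          for _ in range(n_steps):
              prop = theta + step * rng.standard_normal(theta.shape)
              lp_prop = log_post(prop)
              if np.log(rng.uniform()) < lp_prop - lp:   # accept w.p. min(1, ratio)
                  theta, lp = prop, lp_prop
              samples.append(theta.copy())
          return np.array(samples)

      # Toy 2-d Gaussian posterior standing in for (damage location, damage size):
      draws = metropolis(lambda t: -0.5 * np.sum(t ** 2), theta0=[0.0, 0.0])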

  5. Two Approaches to Calibration in Metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campanelli, Mark

    2014-04-01

    Inferring mathematical relationships with quantified uncertainty from measurement data is common to computational science and metrology. Sufficient knowledge of measurement process noise enables Bayesian inference. Otherwise, an alternative approach is required, here termed compartmentalized inference, because collection of uncertain data and model inference occur independently. Bayesian parameterized model inference is compared to a Bayesian-compatible compartmentalized approach for ISO-GUM compliant calibration problems in renewable energy metrology. In either approach, model evidence can help reduce model discrepancy.

  6. MUMPS Based Integration of Disparate Computer-Assisted Medical Diagnosis Modules

    DTIC Science & Technology

    1989-12-12

    The Abdominal and Chest Pain modules use a Bayesian approach, while the Ophthalmology module uses a Rule Based approach. In the current effort, MUMPS is used to develop an...

  7. A Bayesian approach for parameter estimation and prediction using a computationally intensive model

    DOE PAGES

    Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...

    2015-02-05

    Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $\eta(\theta)$, where θ denotes the uncertain, best input setting. Hence the statistical model is of the form $y = \eta(\theta) + \epsilon$, where $\epsilon$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $\eta(\cdot)$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $\eta(\cdot)$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.
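
    A compact sketch of the emulator idea under the stated model $y = \eta(\theta) + \epsilon$: fit a cheap response surface to an ensemble of model runs, then evaluate the posterior through the surrogate instead of the expensive code. The cubic-polynomial emulator, flat prior, and toy η are assumptions of this sketch (the paper uses a statistical emulator estimated from the run ensemble).

      import numpy as np

      eta = lambda th: np.sin(th) + 0.5 * th        # stand-in for an expensive model

      # 1) Ensemble of model runs at design points (the costly step, done once).
      design = np.linspace(-2.0, 2.0, 15)
      runs = eta(design)

      # 2) Cheap emulator fitted to the ensemble.
      coef = np.polyfit(design, runs, deg=3)
      emulator = lambda th: np.polyval(coef, th)

      # 3) Posterior over theta evaluated through the emulator, not eta itself.
      y_obs, sigma = 1.2, 0.1                       # measurement and its error s.d.
      theta = np.linspace(-2.0, 2.0, 401)
      log_post = -0.5 * ((y_obs - emulator(theta)) / sigma) ** 2   # flat prior
      post = np.exp(log_post - log_post.max())
      post /= np.trapz(post, theta)                 # normalized posterior on the grid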

  8. Bayesian evidence computation for model selection in non-linear geoacoustic inference problems.

    PubMed

    Dettmer, Jan; Dosso, Stan E; Osler, John C

    2010-12-01

    This paper applies a general Bayesian inference approach, based on Bayesian evidence computation, to geoacoustic inversion of interface-wave dispersion data. Quantitative model selection is carried out by computing the evidence (normalizing constants) for several model parameterizations using annealed importance sampling. The resulting posterior probability density estimate is compared to estimates obtained from Metropolis-Hastings sampling to ensure consistent results. The approach is applied to invert interface-wave dispersion data collected on the Scotian Shelf, off the east coast of Canada for the sediment shear-wave velocity profile. Results are consistent with previous work on these data but extend the analysis to a rigorous approach including model selection and uncertainty analysis. The results are also consistent with core samples and seismic reflection measurements carried out in the area.
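
    For orientation, the evidence (normalizing constant) driving this kind of model selection is Z = ∫ L(θ) p(θ) dθ. The crude estimator below averages the likelihood over prior draws; it is a hedged sketch of the quantity being computed, not the annealed importance sampling scheme the paper uses (which exists precisely because this naive estimator degrades on hard problems).

      import numpy as np

      def log_evidence_prior_mc(log_like, prior_sample, n=100_000, seed=0):
          """Naive evidence estimate: Z ~= mean of L(theta) over prior draws."""
          rng = np.random.default_rng(seed)
          ll = np.array([log_like(prior_sample(rng)) for _ in range(n)])
          m = ll.max()
          return m + np.log(np.mean(np.exp(ll - m)))   # log-sum-exp for stability

      # Model selection: prefer the parameterization with the larger evidence.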

  9. Identification of transmissivity fields using a Bayesian strategy and perturbative approach

    NASA Astrophysics Data System (ADS)

    Zanini, Andrea; Tanda, Maria Giovanna; Woodbury, Allan D.

    2017-10-01

    The paper deals with the crucial problem of groundwater parameter estimation, which is the basis for efficient modeling and reclamation activities. A hierarchical Bayesian approach is developed: it uses Akaike's Bayesian Information Criterion in order to estimate the hyperparameters (related to the chosen covariance model) and to quantify the unknown noise variance. The transmissivity identification proceeds in two steps: the first, called empirical Bayesian interpolation, uses Y* (Y = lnT) observations to interpolate Y values on a specified grid; the second, called empirical Bayesian update, improves the previous Y estimate through the addition of hydraulic head observations. The relationship between the head and lnT has been linearized through a perturbative solution of the flow equation. In order to test the proposed approach, synthetic aquifers from the literature have been considered. The aquifers in question contain a variety of boundary conditions (both Dirichlet and Neumann type) and scales of heterogeneity (σY² = 1.0 and σY² = 5.3). The estimated transmissivity fields were compared to the true ones. The joint use of Y* and head measurements improves the estimation of Y for both degrees of heterogeneity. Even though the variance of the strongly heterogeneous transmissivity field can be considered high for the application of the perturbative approach, the results show the same order of approximation as the non-linear methods proposed in the literature. The procedure allows computation of the posterior probability distribution of the target quantities and quantification of the uncertainty in the model prediction. Bayesian updating has advantages related to both Monte Carlo (MC) and non-MC approaches. In fact, like MC methods, Bayesian updating computes the posterior probability distribution of the target quantities directly, and, like non-MC methods, it has computational times on the order of seconds.

  10. Application of Bayesian Approach in Cancer Clinical Trial

    PubMed Central

    Bhattacharjee, Atanu

    2014-01-01

    The application of the Bayesian approach in clinical trials is becoming more useful than the classical method. It is beneficial from the design phase through to analysis. Straightforward statements about the drug treatment effect can be obtained through Bayesian methods. Complex computational problems are simple to handle with Bayesian techniques. The technique is only feasible in the presence of prior information about the data. Inference can be established through posterior estimates. However, some limitations are present in this method. The objective of this work was to explore the several merits and demerits of the Bayesian approach in cancer research. This review of the technique will be helpful for clinical researchers involved in oncology to explore the limitations and power of Bayesian techniques. PMID:29147387

  11. MapReduce Based Parallel Bayesian Network for Manufacturing Quality Control

    NASA Astrophysics Data System (ADS)

    Zheng, Mao-Kuan; Ming, Xin-Guo; Zhang, Xian-Yu; Li, Guo-Ming

    2017-09-01

    The increasing complexity of industrial products and manufacturing processes has challenged conventional statistics-based quality management approaches under dynamic production conditions. A Bayesian network and big data analytics integrated approach for manufacturing process quality analysis and control is proposed. Based on the Hadoop distributed architecture and the MapReduce parallel computing model, the large volume and variety of quality-related data generated during the manufacturing process can be handled. Artificial intelligence algorithms, including Bayesian network learning, classification and reasoning, are embedded into the Reduce process. Relying on the ability of the Bayesian network to deal with dynamic and uncertain problems and on the parallel computing power of MapReduce, Bayesian networks of factors affecting quality are built from prior probability distributions and modified with posterior probability distributions. A case study on hull segment manufacturing precision management for ship and offshore platform building shows that computing speed accelerates almost in direct proportion to the number of computing nodes. It is also shown that the proposed model is feasible for locating and reasoning about root causes, forecasting manufacturing outcomes, and intelligent decision-making for precision problem solving. The integration of big data analytics and the BN method offers a whole new perspective on manufacturing quality control.

  12. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation

    NASA Astrophysics Data System (ADS)

    Reis, D. S.; Stedinger, J. R.; Martins, E. S.

    2005-10-01

    This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
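
    The GLS estimator at the core of the regional model is standard; a numpy sketch follows, with Lam the covariance combining sampling error and model error (illustrative names). The Bayesian extension described in the abstract then integrates over the posterior of the model error variance rather than plugging in a single value.

      import numpy as np

      def gls(X, y, Lam):
          """Generalized least squares: beta = (X' Lam^-1 X)^-1 X' Lam^-1 y."""
          Li = np.linalg.inv(Lam)
          A = X.T @ Li @ X
          beta = np.linalg.solve(A, X.T @ Li @ y)
          cov_beta = np.linalg.inv(A)        # parameter covariance given Lam
          return beta, cov_beta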

  13. The Bayesian Revolution Approaches Psychological Development

    ERIC Educational Resources Information Center

    Shultz, Thomas R.

    2007-01-01

    This commentary reviews five articles that apply Bayesian ideas to psychological development, some with psychology experiments, some with computational modeling, and some with both experiments and modeling. The reviewed work extends the current Bayesian revolution into tasks often studied in children, such as causal learning and word learning, and…

  14. Computational system identification of continuous-time nonlinear systems using approximate Bayesian computation

    NASA Astrophysics Data System (ADS)

    Krishnanathan, Kirubhakaran; Anderson, Sean R.; Billings, Stephen A.; Kadirkamanathan, Visakan

    2016-11-01

    In this paper, we derive a system identification framework for continuous-time nonlinear systems, for the first time using a simulation-focused computational Bayesian approach. Simulation approaches to nonlinear system identification have been shown to outperform regression methods under certain conditions, such as non-persistently exciting inputs and fast-sampling. We use the approximate Bayesian computation (ABC) algorithm to perform simulation-based inference of model parameters. The framework has the following main advantages: (1) parameter distributions are intrinsically generated, giving the user a clear description of uncertainty, (2) the simulation approach avoids the difficult problem of estimating signal derivatives as is common with other continuous-time methods, and (3) as noted above, the simulation approach improves identification under conditions of non-persistently exciting inputs and fast-sampling. Term selection is performed by judging parameter significance using parameter distributions that are intrinsically generated as part of the ABC procedure. The results from a numerical example demonstrate that the method performs well in noisy scenarios, especially in comparison to competing techniques that rely on signal derivative estimation.
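
    A minimal ABC rejection sketch of the simulation-based inference used here, with illustrative names and a simple Euclidean distance: draw parameters from the prior, simulate the system, and keep draws whose simulated output falls within a tolerance of the data. The paper's ABC procedure is more elaborate, but this is the core idea.

      import numpy as np

      def abc_rejection(simulate, prior_sample, y_obs, eps, n_draws=10_000, seed=0):
          """Keep prior draws whose simulated output is close to the data."""
          rng = np.random.default_rng(seed)
          accepted = []
          for _ in range(n_draws):
              theta = prior_sample(rng)
              y_sim = simulate(theta)
              if np.linalg.norm(y_sim - y_obs) / np.sqrt(y_obs.size) < eps:
                  accepted.append(theta)
          return np.array(accepted)     # empirical approximation of the posterior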

  15. Moving beyond qualitative evaluations of Bayesian models of cognition.

    PubMed

    Hemmer, Pernille; Tauber, Sean; Steyvers, Mark

    2015-06-01

    Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.

  16. Approximate Bayesian computation for spatial SEIR(S) epidemic models.

    PubMed

    Brown, Grant D; Porter, Aaron T; Oleson, Jacob J; Hinman, Jessica A

    2018-02-01

    Approximate Bayesian Computation (ABC) provides an attractive approach to estimation in complex Bayesian inferential problems for which evaluation of the kernel of the posterior distribution is impossible or computationally expensive. These highly parallelizable techniques have been successfully applied to many fields, particularly in cases where more traditional approaches such as Markov chain Monte Carlo (MCMC) are impractical. In this work, we demonstrate the application of approximate Bayesian inference to spatially heterogeneous Susceptible-Exposed-Infectious-Removed (SEIR) stochastic epidemic models. These models have a tractable posterior distribution; nevertheless, MCMC techniques become computationally infeasible for moderately sized problems. We discuss the practical implementation of these techniques via the open source ABSEIR package for R. The performance of ABC relative to traditional MCMC methods in a small problem is explored under simulation, as well as in the spatially heterogeneous context of the 2014 epidemic of Chikungunya in the Americas. Copyright © 2017 Elsevier Ltd. All rights reserved.
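
    As a hedged illustration of what gets simulated inside such an ABC scheme, here is one day of a chain-binomial SEIR model at a single location (the rates and the single-location simplification are assumptions of this sketch; ABSEIR's spatial models add mixing between locations).

      import numpy as np

      def seir_step(S, E, I, R, beta, sigma, gamma, rng):
          """One day of a stochastic SEIR model with binomial transitions."""
          N = S + E + I + R
          new_E = rng.binomial(S, 1.0 - np.exp(-beta * I / N))   # new infections
          new_I = rng.binomial(E, 1.0 - np.exp(-sigma))          # end of latency
          new_R = rng.binomial(I, 1.0 - np.exp(-gamma))          # removals
          return S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R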

  17. Information loss in approximately bayesian data assimilation: a comparison of generative and discriminative approaches to estimating agricultural yield

    USDA-ARS?s Scientific Manuscript database

    Data assimilation and regression are two commonly used methods for predicting agricultural yield from remote sensing observations. Data assimilation is a generative approach because it requires explicit approximations of the Bayesian prior and likelihood to compute the probability density function...

  18. Bayesian models: A statistical primer for ecologists

    USGS Publications Warehouse

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. It presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians; covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more; deemphasizes computer coding in favor of basic principles; and explains how to write out properly factored statistical expressions representing Bayesian models.

  19. A Bayesian Network Approach to Modeling Learning Progressions and Task Performance. CRESST Report 776

    ERIC Educational Resources Information Center

    West, Patti; Rutstein, Daisy Wise; Mislevy, Robert J.; Liu, Junhui; Choi, Younyoung; Levy, Roy; Crawford, Aaron; DiCerbo, Kristen E.; Chappel, Kristina; Behrens, John T.

    2010-01-01

    A major issue in the study of learning progressions (LPs) is linking student performance on assessment tasks to the progressions. This report describes the challenges faced in making this linkage using Bayesian networks to model LPs in the field of computer networking. The ideas are illustrated with exemplar Bayesian networks built on Cisco…

  20. A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression.

    PubMed

    Liu, Fang; Eugenio, Evercita C

    2018-04-01

    Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
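
    A sketch of the zoib likelihood the review discusses: point masses at 0 and 1 mixed with a beta density on (0,1), parameterized by a mean mu and precision phi. The parameterization and function names are illustrative assumptions, not a specific package's API.

      import numpy as np
      from scipy import stats

      def zoib_loglik(y, p0, p1, mu, phi):
          """Zero-or-one-inflated beta: P(y=0)=p0, P(y=1)=p1,
          otherwise (1 - p0 - p1) * Beta(mu*phi, (1-mu)*phi) on (0,1)."""
          y = np.asarray(y, dtype=float)
          a, b = mu * phi, (1.0 - mu) * phi
          ll = np.log(p0) * np.sum(y == 0.0) + np.log(p1) * np.sum(y == 1.0)
          inner = y[(y > 0.0) & (y < 1.0)]
          ll += np.sum(np.log1p(-(p0 + p1)) + stats.beta.logpdf(inner, a, b))
          return ll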

  1. Bayesian design of decision rules for failure detection

    NASA Technical Reports Server (NTRS)

    Chow, E. Y.; Willsky, A. S.

    1984-01-01

    The formulation of the decision making process of a failure detection algorithm as a Bayes sequential decision problem provides a simple conceptualization of the decision rule design problem. As the optimal Bayes rule is not computable, a methodology that is based on the Bayesian approach and aimed at a reduced computational requirement is developed for designing suboptimal rules. A numerical algorithm is constructed to facilitate the design and performance evaluation of these suboptimal rules. The result of applying this design methodology to an example shows that this approach is potentially a useful one.

  2. Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics.

    PubMed

    Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

    2012-09-25

    Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Markov chain Monte Carlo algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point, including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which not only leads to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that the use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.
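
    The 'multiple chains' strategy described here maps naturally onto one process per chain; a hedged sketch with Python's multiprocessing follows (the toy normal posterior stands in for a genomic-selection model, whose single-chain sampler would replace run_chain's loop).

      import numpy as np
      from multiprocessing import Pool

      def run_chain(seed):
          """One independent Metropolis chain on a toy standard-normal posterior."""
          rng = np.random.default_rng(seed)
          theta, out = 0.0, []
          lp = -0.5 * theta ** 2
          for _ in range(10_000):
              prop = theta + 0.5 * rng.standard_normal()
              lp_prop = -0.5 * prop ** 2
              if np.log(rng.uniform()) < lp_prop - lp:
                  theta, lp = prop, lp_prop
              out.append(theta)
          return np.array(out)

      if __name__ == "__main__":
          with Pool(4) as pool:                      # one chain per core
              chains = pool.map(run_chain, [0, 1, 2, 3])
          draws = np.concatenate([c[2000:] for c in chains])   # drop burn-in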

  3. Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics

    PubMed Central

    2012-01-01

    Background Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results Parallel Markov chain Monte Carlo algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point, including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which not only leads to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that the use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363

  4. Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data.

    PubMed

    Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger

    2017-01-01

    Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.

  5. Bayesian Computation for Log-Gaussian Cox Processes: A Comparative Analysis of Methods

    PubMed Central

    Teng, Ming; Nathoo, Farouk S.; Johnson, Timothy D.

    2017-01-01

    The Log-Gaussian Cox Process is a commonly used model for the analysis of spatial point pattern data. Fitting this model is difficult because of its doubly-stochastic property, i.e., it is a hierarchical combination of a Poisson process at the first level and a Gaussian process at the second level. Various methods have been proposed to estimate such a process, including traditional likelihood-based approaches as well as Bayesian methods. We focus here on Bayesian methods and several approaches that have been considered for model fitting within this framework, including Hamiltonian Monte Carlo, the integrated nested Laplace approximation, and variational Bayes. We consider these approaches and make comparisons with respect to statistical and computational efficiency. These comparisons are made through several simulation studies as well as through two applications, the first examining ecological data and the second involving neuroimaging data. PMID:29200537

  6. Understanding the Scalability of Bayesian Network Inference using Clique Tree Growth Curves

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole Jakob

    2009-01-01

    Bayesian networks (BNs) are used to represent and efficiently compute with multi-variate probability distributions in a wide range of disciplines. One of the main approaches to perform computation in BNs is clique tree clustering and propagation. In this approach, BN computation consists of propagation in a clique tree compiled from a Bayesian network. There is a lack of understanding of how clique tree computation time, and BN computation time more generally, depend on variations in BN size and structure. On the one hand, complexity results tell us that many interesting BN queries are NP-hard or worse to answer, and it is not hard to find application BNs where the clique tree approach in practice cannot be used. On the other hand, it is well-known that tree-structured BNs can be used to answer probabilistic queries in polynomial time. In this article, we develop an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN's non-root nodes to the number of root nodes, or (ii) the expected number of moral edges in their moral graphs. Our approach is based on combining analytical and experimental results. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for each set. For the special case of bipartite BNs, we consequently have two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, we systematically increase the degree of the root nodes in bipartite Bayesian networks, and find that root clique growth is well-approximated by Gompertz growth curves. It is believed that this research improves the understanding of the scaling behavior of clique tree clustering, provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms, and presents an aid for analytical trade-off studies of clique tree clustering using growth curves.
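
    For reference, the Gompertz form used to describe root clique growth, with an illustrative least-squares fit to toy data (the variable names and data are assumptions of this sketch, not the article's experiments):

      import numpy as np
      from scipy.optimize import curve_fit

      def gompertz(x, a, b, c):
          """Gompertz growth: slow start, rapid middle, saturation at a."""
          return a * np.exp(-b * np.exp(-c * x))

      # Toy clique-size data versus root-node degree.
      x = np.arange(1.0, 11.0)
      y = gompertz(x, 20.0, 5.0, 0.6) \
          + np.random.default_rng(0).normal(0.0, 0.3, x.size)
      params, _ = curve_fit(gompertz, x, y, p0=[15.0, 3.0, 0.5])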

  7. A Massively Parallel Bayesian Approach to Planetary Protection Trajectory Analysis and Design

    NASA Technical Reports Server (NTRS)

    Wallace, Mark S.

    2015-01-01

    The NASA Planetary Protection Office has levied a requirement that the upper stage of future planetary launches have a less than 10^-4 chance of impacting Mars within 50 years after launch. A brute-force approach requires a decade of computer time to demonstrate compliance. By using a Bayesian approach and taking advantage of the demonstrated reliability of the upper stage, the required number of fifty-year propagations can be massively reduced. By spreading the remaining embarrassingly parallel Monte Carlo simulations across multiple computers, compliance can be demonstrated in a reasonable time frame. The method used is described here.

  8. Dynamic Bayesian network modeling for longitudinal brain morphometry

    PubMed Central

    Chen, Rong; Resnick, Susan M; Davatzikos, Christos; Herskovits, Edward H

    2011-01-01

    Identifying interactions among brain regions from structural magnetic-resonance images presents one of the major challenges in computational neuroanatomy. We propose a Bayesian data-mining approach to the detection of longitudinal morphological changes in the human brain. Our method uses a dynamic Bayesian network to represent evolving inter-regional dependencies. The major advantage of dynamic Bayesian network modeling is that it can represent complicated interactions among temporal processes. We validated our approach by analyzing a simulated atrophy study, and found that this approach requires only a small number of samples to detect the ground-truth temporal model. We further applied dynamic Bayesian network modeling to a longitudinal study of normal aging and mild cognitive impairment — the Baltimore Longitudinal Study of Aging. We found that interactions among regional volume-change rates for the mild cognitive impairment group are different from those for the normal-aging group. PMID:21963916

  9. Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines

    NASA Astrophysics Data System (ADS)

    Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.

    2016-12-01

    Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.

  10. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
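
    The information gain splitting rule mentioned here is easy to state in code; this sketch uses entropy in bits over class counts (a generic Quinlan-style formulation, not Buntine's Bayesian splitting rule itself).

      import numpy as np

      def entropy(counts):
          p = np.asarray(counts, dtype=float)
          p = p[p > 0] / p.sum()
          return -np.sum(p * np.log2(p))

      def information_gain(parent, children):
          """Entropy reduction from a candidate split of a tree node."""
          n = float(np.sum(parent))
          child_term = sum(np.sum(c) / n * entropy(c) for c in children)
          return entropy(parent) - child_term

      # e.g. information_gain([9, 5], [[6, 1], [3, 4]])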

  11. Development and comparison of Bayesian modularization method in uncertainty assessment of hydrological models

    NASA Astrophysics Data System (ADS)

    Li, L.; Xu, C.-Y.; Engeland, K.

    2012-04-01

    With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. The Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, as it incorporates different sources of information into a single analysis through Bayes' theorem. However, none of these applications treats the uncertainty in extreme flows of hydrological model simulations well. This study proposes a Bayesian modularization approach to uncertainty assessment of conceptual hydrological models that considers extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by the new Bayesian modularization approach and traditional Bayesian models using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with the traditional Bayesian approach: the AR(1) plus Normal, time-period-independent model (Model 1); the AR(1) plus Normal, time-period-dependent model (Model 2); and the AR(1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from the Bayesian modularization method are more accurate, with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of application and computational efficiency. The study thus introduces a new approach for reducing the effect of extreme flows on the discharge uncertainty assessment of hydrological models via Bayesian methods. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD

  12. Simultaneous reconstruction of faults and slip fields using a Bayesian approach on a parallel multi-core platform

    NASA Astrophysics Data System (ADS)

    Volkov, D.

    2017-12-01

    We introduce an algorithm for the simultaneous reconstruction of faults and slip fields on those faults. We define a regularized functional to be minimized for the reconstruction. We prove that the minimum of that functional converges to the unique solution of the related fault inverse problem. Due to inherent uncertainties in measurements, rather than seeking a deterministic solution to the fault inverse problem, we consider a Bayesian approach. The advantage of such an approach is that we obtain a way of quantifying uncertainties as part of our final answer. On the downside, this Bayesian approach leads to a very large computation. To contend with the size of this computation we developed an algorithm for the numerical solution to the stochastic minimization problem which can be easily implemented on a parallel multi-core platform and we discuss techniques to save on computational time. After showing how this algorithm performs on simulated data and assessing the effect of noise, we apply it to measured data. The data was recorded during a slow slip event in Guerrero, Mexico.

  13. BELM: Bayesian extreme learning machine.

    PubMed

    Soria-Olivas, Emilio; Gómez-Sanchis, Juan; Martín, José D; Vila-Francés, Joan; Martínez, Marcelino; Magdalena, José R; Serrano, Antonio J

    2011-03-01

    The theory of the extreme learning machine (ELM) has become very popular over the last few years. ELM is a new approach for learning the parameters of the hidden layers of a multilayer neural network (such as the multilayer perceptron or the radial basis function neural network). Its main advantage is the lower computational cost, which is especially relevant when dealing with many patterns defined in a high-dimensional space. This brief proposes a Bayesian approach to ELM, which presents some advantages over other approaches: it allows the introduction of a priori knowledge; it obtains confidence intervals (CIs) without the need to apply computationally intensive methods, e.g., the bootstrap; and it presents high generalization capabilities. Bayesian ELM is benchmarked against classical ELM on several artificial and real datasets that are widely used for the evaluation of machine learning algorithms. The achieved results show that the proposed approach produces competitive accuracy with some additional advantages, namely, automatic production of CIs, reduced probability of model overfitting, and use of a priori knowledge.
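
    A hedged sketch of the construction described here: a random hidden layer followed by Bayesian linear regression on the hidden outputs, which yields closed-form predictive variances (hence CIs without bootstrapping). Fixed prior and noise precisions are an assumption of this sketch; the brief's method would estimate such quantities from data.

      import numpy as np

      def belm_fit(X, y, n_hidden=50, alpha=1.0, beta=25.0, seed=0):
          """Random tanh hidden layer + Gaussian-prior linear readout.
          alpha: prior precision of readout weights; beta: noise precision."""
          rng = np.random.default_rng(seed)
          W = rng.standard_normal((X.shape[1], n_hidden))
          b = rng.standard_normal(n_hidden)
          H = np.tanh(X @ W + b)                               # hidden activations
          S = np.linalg.inv(alpha * np.eye(n_hidden) + beta * H.T @ H)
          m = beta * S @ H.T @ y                               # posterior mean weights
          return W, b, m, S, beta

      def belm_predict(model, Xnew):
          W, b, m, S, beta = model
          H = np.tanh(Xnew @ W + b)
          mean = H @ m
          var = 1.0 / beta + np.sum((H @ S) * H, axis=1)       # predictive variance
          return mean, var                                     # var gives the CIs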

  14. Near Real-Time Probabilistic Damage Diagnosis Using Surrogate Modeling and High Performance Computing

    NASA Technical Reports Server (NTRS)

    Warner, James E.; Zubair, Mohammad; Ranjan, Desh

    2017-01-01

    This work investigates novel approaches to probabilistic damage diagnosis that utilize surrogate modeling and high performance computing (HPC) to achieve substantial computational speedup. Motivated by Digital Twin, a structural health management (SHM) paradigm that integrates vehicle-specific characteristics with continual in-situ damage diagnosis and prognosis, the methods studied herein yield near real-time damage assessments that could enable monitoring of a vehicle's health while it is operating (i.e. online SHM). High-fidelity modeling and uncertainty quantification (UQ), both critical to Digital Twin, are incorporated using finite element method simulations and Bayesian inference, respectively. The crux of the proposed Bayesian diagnosis methods, however, is the reformulation of the numerical sampling algorithms (e.g. Markov chain Monte Carlo) used to generate the resulting probabilistic damage estimates. To this end, three distinct methods are demonstrated for rapid sampling that utilize surrogate modeling and exploit various degrees of parallelism for leveraging HPC. The accuracy and computational efficiency of the methods are compared on the problem of strain-based crack identification in thin plates. While each approach has inherent problem-specific strengths and weaknesses, all approaches are shown to provide accurate probabilistic damage diagnoses and several orders of magnitude computational speedup relative to a baseline Bayesian diagnosis implementation.

  15. Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marzouk, Youssef

    Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in the support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computational expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale sequential data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.

  16. Validation of the thermal challenge problem using Bayesian Belief Networks.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McFarland, John; Swiler, Laura Painton

    The thermal challenge problem has been developed at Sandia National Laboratories as a testbed for demonstrating various types of validation approaches and prediction methods. This report discusses one particular methodology to assess the validity of a computational model given experimental data. This methodology is based on Bayesian Belief Networks (BBNs) and can incorporate uncertainty in experimental measurements, in physical quantities, and model uncertainties. The approach uses the prior and posterior distributions of model output to compute a validation metric based on Bayesian hypothesis testing (a Bayes' factor). This report discusses various aspects of the BBN, specifically in the context of the thermal challenge problem. A BBN is developed for a given set of experimental data in a particular experimental configuration. The development of the BBN and the method for 'solving' the BBN to develop the posterior distribution of model output through Markov chain Monte Carlo sampling is discussed in detail. The use of the BBN to compute a Bayes' factor is demonstrated.

  17. Quantum state estimation when qubits are lost: a no-data-left-behind approach

    DOE PAGES

    Williams, Brian P.; Lougovski, Pavel

    2017-04-06

    We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.

  18. A Bayesian framework for adaptive selection, calibration, and validation of coarse-grained models of atomistic systems

    NASA Astrophysics Data System (ADS)

    Farrell, Kathryn; Oden, J. Tinsley; Faghihi, Danial

    2015-08-01

    A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.

  19. BUMPER: the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction

    NASA Astrophysics Data System (ADS)

    Holden, Phil; Birks, John; Brooks, Steve; Bush, Mark; Hwang, Grace; Matthews-Bird, Frazer; Valencia, Bryan; van Woesik, Robert

    2017-04-01

    We describe the Bayesian User-friendly Model for Palaeo-Environmental Reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. The principal motivation for a Bayesian approach is that the palaeoenvironment is treated probabilistically, and can be updated as additional data become available. Bayesian approaches therefore provide a reconstruction-specific quantification of the uncertainty in the data and in the model parameters. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring 2 seconds to build a 100-taxon model from a 100-site training-set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training-sets under ideal assumptions. We then use these to demonstrate both the general applicability of the model and the sensitivity of reconstructions to the characteristics of the training-set, considering assemblage richness, taxon tolerances, and the number of training sites. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. In all of these applications an identically configured model is used, the only change being the input files that provide the training-set environment and taxon-count data.

  20. An adaptive Gaussian process-based method for efficient Bayesian experimental design in groundwater contaminant source identification problems: ADAPTIVE GAUSSIAN PROCESS-BASED INVERSION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jiangjiang; Li, Weixuan; Zeng, Lingzao

    Surrogate models are commonly used in Bayesian approaches such as Markov Chain Monte Carlo (MCMC) to avoid repetitive CPU-demanding model evaluations. However, the approximation error of a surrogate may lead to biased estimations of the posterior distribution. This bias can be corrected by constructing a very accurate surrogate or implementing MCMC in a two-stage manner. Since the two-stage MCMC requires extra original model evaluations, the computational cost is still high. If the measurement information is incorporated, a locally accurate approximation of the original model can be adaptively constructed with low computational cost. Based on this idea, we propose a Gaussian process (GP) surrogate-based Bayesian experimental design and parameter estimation approach for groundwater contaminant source identification problems. A major advantage of the GP surrogate is that it provides a convenient estimation of the approximation error, which can be incorporated in the Bayesian formula to avoid over-confident estimation of the posterior distribution. The proposed approach is tested with a numerical case study. Without sacrificing the estimation accuracy, the new approach achieves about a 200-fold speedup compared to our previous work using two-stage MCMC.

  1. Bayesian flood forecasting methods: A review

    NASA Astrophysics Data System (ADS)

    Han, Shasha; Coulibaly, Paulin

    2017-08-01

    Over the past few decades, floods have been among the most common and widely distributed natural disasters in the world. If floods could be accurately forecasted in advance, their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of the uncertainty associated with hydrologic forecasts is of great importance for flood estimation and rational decision making. The Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides a suitable theoretical structure, empirically validated models and a reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review of Bayesian forecasting approaches applied to flood forecasting from 1999 until now. The review starts with an overview of the fundamentals of BFS and recent advances in BFS, followed by BFS applications in river stage forecasting and real-time flood forecasting; it then moves to a critical analysis evaluating the advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses future research directions in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way to perform flood estimation: it considers all sources of uncertainty and produces a predictive distribution of the river stage, river discharge or runoff, and thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. the ensemble Bayesian forecasting system and Bayesian multi-model combination) were shown to overcome the limitations of a single model or fixed model weights and to effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvement. Future research in the context of Bayesian flood forecasting should focus on the assimilation of various sources of newly available information and on the improvement of predictive performance assessment methods.

  2. Reconstructing Constructivism: Causal Models, Bayesian Learning Mechanisms, and the Theory Theory

    ERIC Educational Resources Information Center

    Gopnik, Alison; Wellman, Henry M.

    2012-01-01

    We propose a new version of the "theory theory" grounded in the computational framework of probabilistic causal models and Bayesian learning. Probabilistic models allow a constructivist but rigorous and detailed approach to cognitive development. They also explain the learning of both more specific causal hypotheses and more abstract framework…

  3. Autonomic Closure for Turbulent Flows Using Approximate Bayesian Computation

    NASA Astrophysics Data System (ADS)

    Doronina, Olga; Christopher, Jason; Hamlington, Peter; Dahm, Werner

    2017-11-01

    Autonomic closure is a new technique for achieving fully adaptive and physically accurate closure of coarse-grained turbulent flow governing equations, such as those solved in large eddy simulations (LES). Although autonomic closure has been shown in recent a priori tests to represent unclosed terms more accurately than dynamic versions of traditional LES models do, the computational cost of the approach makes it challenging to implement for simulations of practical turbulent flows at realistically high Reynolds numbers. The optimization step used in the approach introduces large matrices that must be inverted and is highly memory intensive. In order to reduce memory requirements, here we propose to use approximate Bayesian computation (ABC) in place of the optimization step, thereby yielding a computationally efficient implementation of autonomic closure that trades memory-intensive for processor-intensive computations. The latter challenge can be overcome as co-processors such as general-purpose graphics processing units become increasingly available on current-generation petascale and exascale supercomputers. In this work, we outline the formulation of ABC-enabled autonomic closure and present initial results demonstrating the accuracy and computational cost of the approach.
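
    The rejection flavour of ABC underlying this strategy fits in a few lines: draw candidates from a prior, simulate summary statistics, and keep draws whose summaries land within a tolerance of the reference values. The sketch below is generic; the summary function, prior range, and tolerance are illustrative assumptions, not the closure formulation itself.

        import numpy as np

        rng = np.random.default_rng(1)
        stats_ref = np.array([0.0, 1.0])        # "observed" summary statistics

        def simulate_summaries(c):              # hypothetical coefficient -> summaries
            x = c * rng.standard_normal(1000)
            return np.array([x.mean(), x.std()])

        accepted, eps = [], 0.05
        for _ in range(20000):
            c = rng.uniform(0.5, 2.0)           # draw from a uniform prior
            if np.linalg.norm(simulate_summaries(c) - stats_ref) < eps:
                accepted.append(c)              # keep near-matching draws
        if accepted:
            print(len(accepted), np.mean(accepted))   # posterior sample for c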

  4. A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers

    PubMed Central

    Puig, X; Ginebra, J; Graffelman, J

    2017-01-01

    The X chromosome is a relatively large chromosome, harboring a large amount of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males have only one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male-to-female allele frequency ratio are computed and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used. PMID:28900292

  5. The choice of sample size: a mixed Bayesian / frequentist approach.

    PubMed

    Pezeshk, Hamid; Nematollahi, Nader; Maroufy, Vahed; Gittins, John

    2009-04-01

    Sample size computations are largely based on frequentist or classical methods. In the Bayesian approach the prior information on the unknown parameters is taken into account. In this work we consider a fully Bayesian approach to the sample size determination problem which was introduced by Grundy et al. and developed by Lindley. This approach treats the problem as a decision problem and employs a utility function to find the optimal sample size of a trial. Furthermore, we assume that a regulatory authority, which is deciding on whether or not to grant a licence to a new treatment, uses a frequentist approach. We then find the optimal sample size for the trial by maximising the expected net benefit, which is the expected benefit of subsequent use of the new treatment minus the cost of the trial.
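
    A minimal sketch of the net-benefit idea, assuming a hypothetical two-arm trial with a normal prior on the treatment effect and a frequentist one-sided test at the licensing stage; the benefit and cost figures are placeholders rather than the paper's utility function.

        import numpy as np
        from scipy import stats

        benefit, cost_per_patient, fixed_cost = 1e6, 1e3, 5e4
        prior_mean, prior_sd, sigma = 0.2, 0.1, 1.0     # effect prior, outcome sd
        alpha = 0.025                                   # regulator's one-sided level
        rng = np.random.default_rng(2)

        def expected_net_benefit(n, n_mc=20000):
            delta = rng.normal(prior_mean, prior_sd, n_mc)   # draws of the true effect
            se = sigma * np.sqrt(2.0 / n)                    # two-arm standard error
            power = 1 - stats.norm.cdf(stats.norm.ppf(1 - alpha) - delta / se)
            return benefit * power.mean() - cost_per_patient * 2 * n - fixed_cost

        ns = np.arange(50, 2001, 50)
        print("optimal n per arm:", max(ns, key=expected_net_benefit))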

  6. Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.

    2010-01-01

    One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN's non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
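
    For readers unfamiliar with Gompertz curves, f(x) = a*exp(-b*exp(-c*x)), the sketch below fits one to synthetic clique-size data with scipy; the data are fabricated purely for illustration.

        import numpy as np
        from scipy.optimize import curve_fit

        def gompertz(x, a, b, c):
            return a * np.exp(-b * np.exp(-c * x))

        x = np.arange(1, 21)                    # e.g. number of leaf nodes
        noise = np.random.default_rng(3).normal(1.0, 0.02, x.size)
        y = gompertz(x, 100, 5, 0.3) * noise    # synthetic "root clique size" data
        params, _ = curve_fit(gompertz, x, y, p0=(y.max(), 1.0, 0.1))
        print("fitted a, b, c:", params)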

  7. Discriminative Bayesian Dictionary Learning for Classification.

    PubMed

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

    We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and object and scene-category classification, using five public datasets, and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  8. Improved inference in Bayesian segmentation using Monte Carlo sampling: application to hippocampal subfield volumetry.

    PubMed

    Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen

    2013-10-01

    Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Improved Inference in Bayesian Segmentation Using Monte Carlo Sampling: Application to Hippocampal Subfield Volumetry

    PubMed Central

    Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Leemput, Koen Van

    2013-01-01

    Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures. PMID:23773521

  10. Artificial Intelligence (AI) Center of Excellence at the University of Pennsylvania

    DTIC Science & Technology

    1995-07-01

    Excerpts from the report index include: "Robust Location Estimation for MLR and Non-MLR Distributions" (Dissertation Proposal), Gerda L. Kamberova, MS-CIS-92-28; "Bayesian Approach To Computer Vision Problems", Gerda L. Kamberova, MS-CIS-92-29, GRASP LAB 310, which studies the Bayesian approach in computer vision; and "Robust Location Estimation for MLR and Non-MLR Distributions" (Dissertation), Gerda L. Kamberova, MS-CIS-92-93, GRASP LAB 340, which studies the problem of estimating an unknown parameter.

  11. Invited commentary: Lost in estimation--searching for alternatives to Markov chains to fit complex Bayesian models.

    PubMed

    Molitor, John

    2012-03-01

    Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.

  12. Estimation of parameter uncertainty for an activated sludge model using Bayesian inference: a comparison with the frequentist method.

    PubMed

    Zonta, Zivko J; Flotats, Xavier; Magrí, Albert

    2014-08-01

    The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference) may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to highlight the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered the more general methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.

  13. A fast combination method in DSmT and its application to recommender system

    PubMed Central

    Liu, Yihai

    2018-01-01

    In many applications involving epistemic uncertainties usually modeled by belief functions, it is often necessary to approximate general (non-Bayesian) basic belief assignments (BBAs) by subjective probabilities (called Bayesian BBAs). This necessity occurs if one needs to embed the fusion result in a system based on the probabilistic framework and Bayesian inference (e.g. tracking systems), or if one needs to make a decision in decision-making problems. In this paper, we present a new fast combination method, called modified rigid coarsening (MRC), to obtain the final Bayesian BBAs based on hierarchical decomposition (coarsening) of the frame of discernment. In this method, focal elements with probabilities are coarsened efficiently to reduce the computational complexity of combination by using a disagreement vector and a simple dichotomous approach. In order to prove the practicality of our approach, the new method is applied to combine users' soft preferences in recommender systems (RSs). Additionally, in order to make a comprehensive performance comparison, the proportional conflict redistribution rule #6 (PCR6) is regarded as a baseline in a range of experiments. According to the results of the experiments, MRC achieves better recommendation accuracy than the original rigid coarsening (RC) method and is comparable in computational time. PMID:29351297
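
    For comparison, the classical pignistic transformation below coarsens a general BBA into a Bayesian BBA in a few lines; it is a well-known stand-in shown for orientation, not the MRC method proposed in the paper.

        def pignistic(bba):
            """bba: dict mapping frozenset of hypotheses -> mass; returns BetP."""
            betp = {}
            for focal, mass in bba.items():
                for hypothesis in focal:        # split each focal mass evenly
                    betp[hypothesis] = betp.get(hypothesis, 0.0) + mass / len(focal)
            return betp

        m = {frozenset({"a"}): 0.4,
             frozenset({"a", "b"}): 0.35,
             frozenset({"b", "c"}): 0.25}
        print(pignistic(m))                     # {'a': 0.575, 'b': 0.3, 'c': 0.125}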

  14. A Computationally-Efficient Inverse Approach to Probabilistic Strain-Based Damage Diagnosis

    NASA Technical Reports Server (NTRS)

    Warner, James E.; Hochhalter, Jacob D.; Leser, William P.; Leser, Patrick E.; Newman, John A.

    2016-01-01

    This work presents a computationally-efficient inverse approach to probabilistic damage diagnosis. Given strain data at a limited number of measurement locations, Bayesian inference and Markov Chain Monte Carlo (MCMC) sampling are used to estimate probability distributions of the unknown location, size, and orientation of damage. Substantial computational speedup is obtained by replacing a three-dimensional finite element (FE) model with an efficient surrogate model. The approach is experimentally validated on cracked test specimens where full field strains are determined using digital image correlation (DIC). Access to full field DIC data allows for testing of different hypothetical sensor arrangements, facilitating the study of strain-based diagnosis effectiveness as the distance between damage and measurement locations increases. The ability of the framework to effectively perform both probabilistic damage localization and characterization in cracked plates is demonstrated and the impact of measurement location on uncertainty in the predictions is shown. Furthermore, the analysis time to produce these predictions is orders of magnitude less than a baseline Bayesian approach with the FE method by utilizing surrogate modeling and effective numerical sampling approaches.

  15. A local approach for focussed Bayesian fusion

    NASA Astrophysics Data System (ADS)

    Sander, Jennifer; Heizmann, Michael; Goussev, Igor; Beyerer, Jürgen

    2009-04-01

    Local Bayesian fusion approaches aim to reduce the high storage and computational costs of Bayesian fusion which is separated from fixed modeling assumptions. Using the small world formalism, we argue why this procedure conforms to Bayesian theory. Then, we concentrate on the realization of local Bayesian fusion by focussing the fusion process solely on local regions that are task relevant with a high probability. The resulting local models then correspond to restricted versions of the original one. In a previous publication, we used bounds for the probability of misleading evidence to show the validity of the pre-evaluation of task-specific knowledge and prior information which we perform to build local models. In this paper, we prove the validity of this procedure using information-theoretic arguments. For additional efficiency, local Bayesian fusion can be realized in a distributed manner, where several local Bayesian fusion tasks are evaluated and unified after the actual fusion process. For the practical realization of distributed local Bayesian fusion, software agents are well suited. There is a natural analogy between the resulting agent-based architecture and criminal investigations in real life. We show how this analogy can be used to further improve the efficiency of distributed local Bayesian fusion. Using a landscape model, we present an experimental study of distributed local Bayesian fusion in the field of reconnaissance, which highlights its high potential.

  16. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.

    PubMed

    Duarte, Belmiro P M; Wong, Weng Kee

    2015-08-01

    This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
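
    A sketch of the quadrature ingredient only: evaluating a Bayesian D-optimality criterion for a two-parameter logistic model by Gauss-Hermite quadrature over a normal prior on the slope. The design points, weights, and prior are invented for illustration, and the SDP optimization itself is not shown.

        import numpy as np

        nodes, weights = np.polynomial.hermite_e.hermegauss(10)  # probabilists' rule
        prior_mean, prior_sd = 1.0, 0.3          # normal prior on the slope b
        a = 0.0                                  # intercept held fixed

        def info_matrix(xs, ws, b):              # logistic-model Fisher information
            M = np.zeros((2, 2))
            for x, w in zip(xs, ws):
                p = 1.0 / (1.0 + np.exp(-(a + b * x)))
                f = np.array([1.0, x])
                M += w * p * (1 - p) * np.outer(f, f)
            return M

        def bayesian_d_criterion(xs, ws):        # E_b[log det M] by quadrature
            total = sum(wq * np.log(np.linalg.det(info_matrix(xs, ws,
                        prior_mean + prior_sd * z))) for z, wq in zip(nodes, weights))
            return total / np.sqrt(2 * np.pi)    # normalize the GH weights

        print(bayesian_d_criterion(xs=[-1.0, 0.0, 1.0], ws=[1/3, 1/3, 1/3]))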

  17. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach

    PubMed Central

    Duarte, Belmiro P. M.; Wong, Weng Kee

    2014-01-01

    Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted. PMID:26512159

  18. Bayesian inference of a historical bottleneck in a heavily exploited marine mammal.

    PubMed

    Hoffman, J I; Grant, S M; Forcada, J; Phillips, C D

    2011-10-01

    Emerging Bayesian analytical approaches offer increasingly sophisticated means of reconstructing historical population dynamics from genetic data, but have been little applied to scenarios involving demographic bottlenecks. Consequently, we analysed a large mitochondrial and microsatellite dataset from the Antarctic fur seal Arctocephalus gazella, a species subjected to one of the most extreme examples of uncontrolled exploitation in history when it was reduced to the brink of extinction by the sealing industry during the late eighteenth and nineteenth centuries. Classical bottleneck tests, which exploit the fact that rare alleles are rapidly lost during demographic reduction, yielded ambiguous results. In contrast, a strong signal of recent demographic decline was detected using both Bayesian skyline plots and Approximate Bayesian Computation, the latter also allowing derivation of posterior parameter estimates that were remarkably consistent with historical observations. This was achieved using only contemporary samples, further emphasizing the potential of Bayesian approaches to address important problems in conservation and evolutionary biology. © 2011 Blackwell Publishing Ltd.

  19. [Bayesian approach for the cost-effectiveness evaluation of healthcare technologies].

    PubMed

    Berchialla, Paola; Gregori, Dario; Brunello, Franco; Veltri, Andrea; Petrinco, Michele; Pagano, Eva

    2009-01-01

    The development of Bayesian statistical methods for the assessment of the cost-effectiveness of healthcare technologies is reviewed. Although many studies adopt a frequentist approach, several authors have advocated the use of Bayesian methods in health economics. Emphasis has been placed on the advantages of the Bayesian approach, which include: (i) the ability to make more intuitive and meaningful inferences; (ii) the ability to tackle complex problems, such as allowing for the inclusion of patients who generate no cost, thanks to the availability of powerful computational algorithms; (iii) the importance of a full use of quantitative and structural prior information to produce realistic inferences. Much of the literature comparing the cost-effectiveness of two treatments is based on the incremental cost-effectiveness ratio. However, new methods are arising with the purpose of decision making. These methods are based on a net-benefits approach. In the present context, the cost-effectiveness acceptability curves have been pointed out to be intrinsically Bayesian in their formulation. They plot the probability of a positive net benefit against the threshold cost of a unit increase in efficacy. A case study is presented in order to illustrate Bayesian statistics in cost-effectiveness analysis. Emphasis is placed on the cost-effectiveness acceptability curves. Advantages and disadvantages of the method described in this paper are compared to frequentist methods and discussed.
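
    Once posterior draws of incremental cost and incremental effectiveness are available, the acceptability curve reduces to one probability per threshold, as in this sketch with synthetic draws.

        import numpy as np

        rng = np.random.default_rng(4)
        d_effect = rng.normal(0.5, 0.3, 5000)   # incremental effectiveness draws
        d_cost = rng.normal(2000, 800, 5000)    # incremental cost draws

        for lam in (1000, 3000, 5000, 10000):   # threshold cost per unit of efficacy
            net_benefit = lam * d_effect - d_cost
            print(lam, (net_benefit > 0).mean())    # P(positive net benefit)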

  20. A new method for E-government procurement using collaborative filtering and Bayesian approach.

    PubMed

    Zhang, Shuai; Xi, Chengyu; Wang, Yan; Zhang, Wenyu; Chen, Yanhong

    2013-01-01

    Nowadays, as Internet services increase faster than ever before, government systems are being reinvented as E-government services. Therefore, government procurement sectors have to face challenges brought by the explosion of service information. This paper presents a novel method for E-government procurement (eGP) to search for the optimal procurement scheme (OPS). Item-based collaborative filtering and a Bayesian approach are used to evaluate and select the candidate services to get the top-M recommendations, such that the involved computation load can be alleviated. A trapezoidal fuzzy number similarity algorithm is applied to support the item-based collaborative filtering and Bayesian approach, since some of the services' attributes can hardly be expressed as certain and static values but are easily represented as fuzzy values. A prototype system is built and validated with an illustrative example from eGP to confirm the feasibility of our approach.

  1. A New Method for E-Government Procurement Using Collaborative Filtering and Bayesian Approach

    PubMed Central

    Wang, Yan

    2013-01-01

    Nowadays, as Internet services increase faster than ever before, government systems are being reinvented as E-government services. Therefore, government procurement sectors have to face challenges brought by the explosion of service information. This paper presents a novel method for E-government procurement (eGP) to search for the optimal procurement scheme (OPS). Item-based collaborative filtering and a Bayesian approach are used to evaluate and select the candidate services to get the top-M recommendations, such that the involved computation load can be alleviated. A trapezoidal fuzzy number similarity algorithm is applied to support the item-based collaborative filtering and Bayesian approach, since some of the services' attributes can hardly be expressed as certain and static values but are easily represented as fuzzy values. A prototype system is built and validated with an illustrative example from eGP to confirm the feasibility of our approach. PMID:24385869

  2. Posterior Predictive Bayesian Phylogenetic Model Selection

    PubMed Central

    Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn

    2014-01-01

    We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
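
    Given a matrix of per-site likelihoods evaluated at posterior samples, CPO and LPML follow directly, since each CPO_i is the harmonic mean of the site-i likelihoods over the posterior draws; the likelihood matrix below is synthetic.

        import numpy as np

        rng = np.random.default_rng(5)
        lik = rng.uniform(0.1, 1.0, size=(2000, 50))   # [posterior draw, site]

        cpo = 1.0 / np.mean(1.0 / lik, axis=0)         # site-wise harmonic means
        lpml = np.sum(np.log(cpo))                     # overall model-fit score
        print("LPML =", lpml)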

  3. Bayesian randomized clinical trials: From fixed to adaptive design.

    PubMed

    Yin, Guosheng; Lam, Chi Kin; Shi, Haolun

    2017-08-01

    Randomized controlled studies are the gold standard for phase III clinical trials. Using α-spending functions to control the overall type I error rate, group sequential methods are well established and have been dominating phase III studies. Bayesian randomized design, on the other hand, can be viewed as a complement to, rather than a competitor of, the frequentist methods. For the fixed Bayesian design, hypothesis testing can be cast in the posterior probability or Bayes factor framework, which has a direct link to the frequentist type I error rate. Bayesian group sequential design relies upon Bayesian decision-theoretic approaches based on backward induction, which is often computationally intensive. Compared with the frequentist approaches, Bayesian methods have several advantages. The posterior predictive probability serves as a useful and convenient tool for trial monitoring, and can be updated at any time as the data accrue during the trial. The Bayesian decision-theoretic framework possesses a direct link to decision making in the practical setting, and can be modeled more realistically to reflect the actual cost-benefit analysis during the drug development process. Other merits include the possibility of hierarchical modeling and the use of informative priors, which lead to a more comprehensive utilization of information from both historical and longitudinal data. From fixed to adaptive design, we focus on Bayesian randomized controlled clinical trials and make extensive comparisons with frequentist counterparts through numerical studies. Copyright © 2017 Elsevier Inc. All rights reserved.
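
    The posterior predictive monitoring tool mentioned above can be sketched with a Beta-Binomial model: given interim successes, the probability that the completed trial clears a response threshold follows from the beta-binomial predictive distribution. The prior, interim data, and thresholds below are illustrative assumptions.

        from scipy import stats

        a0, b0 = 1, 1                      # Beta(1, 1) prior on the response rate
        n_seen, x_seen = 40, 14            # interim sample size and responders
        n_total, x_needed = 100, 35        # final size and success threshold

        n_rem = n_total - n_seen
        x_rem = x_needed - x_seen
        # Remaining responders follow a Beta-Binomial(n_rem, a0 + x, b0 + n - x).
        pp = 1 - stats.betabinom.cdf(x_rem - 1, n_rem, a0 + x_seen, b0 + n_seen - x_seen)
        print("predictive probability of success:", pp)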

  4. A Gaussian Approximation Approach for Value of Information Analysis.

    PubMed

    Jalal, Hawre; Alarid-Escudero, Fernando

    2018-02-01

    Most decisions are associated with uncertainty. Value of information (VOI) analysis quantifies the opportunity loss associated with choosing a suboptimal intervention based on current imperfect information. VOI can inform the value of collecting additional information, resource allocation, research prioritization, and future research designs. However, in practice, VOI remains underused due to many conceptual and computational challenges associated with its application. Expected value of sample information (EVSI) is rooted in Bayesian statistical decision theory and measures the value of information from a finite sample. The past few years have witnessed a dramatic growth in computationally efficient methods to calculate EVSI, including metamodeling. However, little research has been done to simplify the experimental data collection step inherent to all EVSI computations, especially for correlated model parameters. This article proposes a general Gaussian approximation (GA) of the traditional Bayesian updating approach based on the original work by Raiffa and Schlaifer to compute EVSI. The proposed approach uses a single probabilistic sensitivity analysis (PSA) data set and involves 2 steps: 1) a linear metamodel step to compute the EVSI on the preposterior distributions and 2) a GA step to compute the preposterior distribution of the parameters of interest. The proposed approach is efficient and can be applied for a wide range of data collection designs involving multiple non-Gaussian parameters and unbalanced study designs. Our approach is particularly useful when the parameters of an economic evaluation are correlated or interact.

  5. Bayesian statistics in radionuclide metrology: measurement of a decaying source

    NASA Astrophysics Data System (ADS)

    Bochud, François O.; Bailat, Claude J.; Laedermann, Jean-Pascal

    2007-08-01

    The most intuitive way of defining a probability is perhaps through the frequency at which an event appears when a large number of trials are realized in identical conditions. The probability derived from the obtained histogram characterizes the so-called frequentist or conventional statistical approach. In this sense, probability is defined as a physical property of the observed system. By contrast, in Bayesian statistics, a probability is not a physical property or a directly observable quantity, but a degree of belief or an element of inference. The goal of this paper is to show how Bayesian statistics can be used in radionuclide metrology and what its advantages and disadvantages are compared with conventional statistics. This is performed through the example of a yttrium-90 source typically encountered in environmental surveillance measurements. Because of the very low activity of this kind of source and the short half-life of the radionuclide, this measurement takes several days, during which the source decays significantly. Several methods are proposed to compute simultaneously the number of unstable nuclei at a given reference time, the decay constant and the background. Asymptotically, all approaches give the same result. However, Bayesian statistics produces coherent estimates and confidence intervals from a much smaller number of measurements. Apart from the conceptual understanding of statistics, the main difficulty that could deter radionuclide metrologists from using Bayesian statistics is the complexity of the computation.
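
    A grid-posterior sketch of this measurement problem, assuming Poisson counts with rate N0*lam*exp(-lam*t) plus background; the data are synthetic, and for brevity the decay constant is held fixed here whereas the paper also infers it.

        import numpy as np
        from scipy import stats

        lam = np.log(2) / 64.1              # yttrium-90 decay constant (1/h)
        t = np.arange(0.0, 120.0, 1.0)      # hourly one-hour counting bins
        rng = np.random.default_rng(6)
        counts = rng.poisson(3000.0 * lam * np.exp(-lam * t) + 2.0)  # synthetic data

        n0_grid = np.linspace(1000, 6000, 200)
        bkg_grid = np.linspace(0.1, 5.0, 100)
        logpost = np.empty((n0_grid.size, bkg_grid.size))
        for i, n0 in enumerate(n0_grid):                 # flat priors on the grid
            for j, bkg in enumerate(bkg_grid):
                mu = n0 * lam * np.exp(-lam * t) + bkg
                logpost[i, j] = stats.poisson.logpmf(counts, mu).sum()

        post = np.exp(logpost - logpost.max())
        post /= post.sum()
        print("posterior mean N0:", (post.sum(axis=1) * n0_grid).sum())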

  6. Diagnosis and Reconfiguration using Bayesian Networks: An Electrical Power System Case Study

    NASA Technical Reports Server (NTRS)

    Knox, W. Bradley; Mengshoel, Ole

    2009-01-01

    Automated diagnosis and reconfiguration are important computational techniques that aim to minimize human intervention in autonomous systems. In this paper, we develop novel techniques and models in the context of diagnosis and reconfiguration reasoning using causal Bayesian networks (BNs). We take as starting point a successful diagnostic approach, using a static BN developed for a real-world electrical power system. We discuss in this paper the extension of this diagnostic approach along two dimensions, namely: (i) from a static BN to a dynamic BN; and (ii) from a diagnostic task to a reconfiguration task. More specifically, we discuss the auto-generation of a dynamic Bayesian network from a static Bayesian network. In addition, we discuss subtle, but important, differences between Bayesian networks when used for diagnosis versus reconfiguration. We discuss a novel reconfiguration agent, which models a system causally, including effects of actions through time, using a dynamic Bayesian network. Though the techniques we discuss are general, we demonstrate them in the context of electrical power systems (EPSs) for aircraft and spacecraft. EPSs are vital subsystems on-board aircraft and spacecraft, and many incidents and accidents of these vehicles have been attributed to EPS failures. We discuss a case study that provides initial but promising results for our approach in the setting of electrical power systems.

  7. Revealing Neurocomputational Mechanisms of Reinforcement Learning and Decision-Making With the hBayesDM Package

    PubMed Central

    Ahn, Woo-Young; Haines, Nathaniel; Zhang, Lei

    2017-01-01

    Reinforcement learning and decision-making (RLDM) provide a quantitative framework and computational theories with which we can disentangle psychiatric conditions into the basic dimensions of neurocognitive functioning. RLDM offer a novel approach to assessing and potentially diagnosing psychiatric patients, and there is growing enthusiasm for both RLDM and computational psychiatry among clinical researchers. Such a framework can also provide insights into the brain substrates of particular RLDM processes, as exemplified by model-based analysis of data from functional magnetic resonance imaging (fMRI) or electroencephalography (EEG). However, researchers often find the approach too technical and have difficulty adopting it for their research. Thus, a critical need remains to develop a user-friendly tool for the wide dissemination of computational psychiatric methods. We introduce an R package called hBayesDM (hierarchical Bayesian modeling of Decision-Making tasks), which offers computational modeling of an array of RLDM tasks and social exchange games. The hBayesDM package offers state-of-the-art hierarchical Bayesian modeling, in which both individual and group parameters (i.e., posterior distributions) are estimated simultaneously in a mutually constraining fashion. At the same time, the package is extremely user-friendly: users can perform computational modeling, output visualization, and Bayesian model comparisons, each with a single line of coding. Users can also extract the trial-by-trial latent variables (e.g., prediction errors) required for model-based fMRI/EEG. With the hBayesDM package, we anticipate that anyone with minimal knowledge of programming can take advantage of cutting-edge computational-modeling approaches to investigate the underlying processes of and interactions between multiple decision-making (e.g., goal-directed, habitual, and Pavlovian) systems. In this way, we expect that the hBayesDM package will contribute to the dissemination of advanced modeling approaches and enable a wide range of researchers to easily perform computational psychiatric research within different populations. PMID:29601060

  8. A sub-space greedy search method for efficient Bayesian Network inference.

    PubMed

    Zhang, Qing; Cao, Yong; Li, Yong; Zhu, Yanming; Sun, Samuel S M; Guo, Dianjing

    2011-09-01

    Bayesian network (BN) has been successfully used to infer the regulatory relationships of genes from microarray dataset. However, one major limitation of BN approach is the computational cost because the calculation time grows more than exponentially with the dimension of the dataset. In this paper, we propose a sub-space greedy search method for efficient Bayesian Network inference. Particularly, this method limits the greedy search space by only selecting gene pairs with higher partial correlation coefficients. Using both synthetic and real data, we demonstrate that the proposed method achieved comparable results with standard greedy search method yet saved ∼50% of the computational time. We believe that sub-space search method can be widely used for efficient BN inference in systems biology. Copyright © 2011 Elsevier Ltd. All rights reserved.
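
    The pair-selection step can be sketched via the precision matrix, whose normalized off-diagonal entries are the (full-order) partial correlations; the data, injected dependency, and threshold below are synthetic.

        import numpy as np

        rng = np.random.default_rng(7)
        X = rng.standard_normal((200, 10))      # 200 samples of 10 genes
        X[:, 1] += 0.8 * X[:, 0]                # inject one dependency

        prec = np.linalg.inv(np.cov(X, rowvar=False))
        d = np.sqrt(np.diag(prec))
        pcorr = -prec / np.outer(d, d)          # partial correlation matrix

        threshold = 0.2                         # keep only strongly related pairs
        pairs = [(i, j) for i in range(10) for j in range(i + 1, 10)
                 if abs(pcorr[i, j]) > threshold]
        print("candidate edges for the greedy search:", pairs)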

  9. Bayesian data analysis in population ecology: motivations, methods, and benefits

    USGS Publications Warehouse

    Dorazio, Robert

    2016-01-01

    During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.

  10. Crystalline nucleation in undercooled liquids: a Bayesian data-analysis approach for a nonhomogeneous Poisson process.

    PubMed

    Filipponi, A; Di Cicco, A; Principi, E

    2012-12-01

    A Bayesian data-analysis approach to data sets of maximum undercooling temperatures recorded in repeated melting-cooling cycles of high-purity samples is proposed. The crystallization phenomenon is described in terms of a nonhomogeneous Poisson process driven by a temperature-dependent sample nucleation rate J(T). The method was extensively tested by computer simulations and applied to real data for undercooled liquid Ge. It proved to be particularly useful in the case of scarce data sets where the usage of binned data would degrade the available experimental information.
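
    For this model class the maximum-undercooling likelihood can be written down directly: during cooling at rate q, the nucleation-temperature pdf is f(T) = (J(T)/q) * exp(-(1/q) * integral of J from T to Tm). The rate form J(T) = A*exp(-B/(Tm-T)^2) and all numbers in the sketch are illustrative assumptions, not the authors' exact parameterization.

        import numpy as np

        Tm, q = 1211.0, 0.5                 # melting temperature (K), cooling rate (K/s)

        def J(T, A, B):                     # illustrative nucleation rate
            dT = np.maximum(Tm - T, 1e-9)
            return A * np.exp(-B / dT**2)

        def log_likelihood(T_obs, A, B, n_grid=2000):
            ll = 0.0
            for T in T_obs:                 # one term per melting-cooling cycle
                u = np.linspace(T, Tm, n_grid)
                integral = J(u, A, B).sum() * (u[1] - u[0])   # Riemann sum
                ll += np.log(J(T, A, B) / q) - integral / q
            return ll

        T_obs = np.array([1050.0, 1062.0, 1045.0, 1058.0])    # synthetic cycles
        print(log_likelihood(T_obs, A=1e-3, B=5e4))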

  11. New insights into faster computation of uncertainties

    NASA Astrophysics Data System (ADS)

    Bhattacharya, Atreyee

    2012-11-01

    Heavy computation power, lengthy simulations, and an exhaustive number of model runs—often these seem like the only statistical tools that scientists have at their disposal when computing uncertainties associated with predictions, particularly in cases of environmental processes such as groundwater movement. However, calculation of uncertainties need not be as lengthy, a new study shows. Comparing two approaches—the classical Bayesian “credible interval” and a less commonly used regression-based “confidence interval” method—Lu et al. show that for many practical purposes both methods provide similar estimates of uncertainties. The advantage of the regression method is that it demands 10-1000 model runs, whereas the classical Bayesian approach requires 10,000 to millions of model runs.

  12. Embedding the results of focussed Bayesian fusion into a global context

    NASA Astrophysics Data System (ADS)

    Sander, Jennifer; Heizmann, Michael

    2014-05-01

    Bayesian statistics offers a well-founded and powerful fusion methodology, including for the fusion of heterogeneous information sources. However, except in special cases, the needed posterior distribution is not analytically derivable. As a consequence, Bayesian fusion may cause unacceptably high computational and storage costs in practice. Local Bayesian fusion approaches aim to reduce the complexity of the Bayesian fusion methodology significantly. This is done by concentrating the actual Bayesian fusion on the parts of the domain of the Properties of Interest that are most likely task-relevant. Our research on these approaches is motivated by an analogy to criminal investigations, where criminalists also pursue clues only locally. This publication follows previous publications on a special local Bayesian fusion technique called focussed Bayesian fusion, in which the actual calculation of the posterior distribution is completely restricted to a suitably chosen local context. As a result, the global posterior distribution is not completely determined, and strategies for using the results of a focussed Bayesian analysis appropriately are needed. In this publication, we primarily contrast different ways of embedding the results of focussed Bayesian fusion explicitly into a global context. To obtain a unique global posterior distribution, we analyze the application of the Maximum Entropy Principle, which has been shown to be successfully applicable in metrology and in different other areas. To address the special need for making further decisions subsequent to the actual fusion task, we further analyze criteria for decision making under partial information.

  13. A combined Fuzzy and Naive Bayesian strategy can be used to assign event codes to injury narratives.

    PubMed

    Marucci-Wellman, H; Lehto, M; Corns, H

    2011-12-01

    Bayesian methods show promise for classifying injury narratives from large administrative datasets into cause groups. This study examined a combined approach where two Bayesian models (Fuzzy and Naïve) were used either to classify a narrative or to select it for manual review. Injury narratives were extracted from claims filed with a workers' compensation insurance provider between January 2002 and December 2004. Narratives were separated into a training set (n=11,000) and a prediction set (n=3,000). Expert coders assigned two-digit Bureau of Labor Statistics Occupational Injury and Illness Classification event codes to each narrative. Fuzzy and Naïve Bayesian models were developed using manually classified cases in the training set. Two semi-automatic machine coding strategies were evaluated. The first strategy assigned cases for manual review if the Fuzzy and Naïve models disagreed on the classification. The second strategy selected additional cases for manual review from the Agree dataset using prediction strength, to reach a level of 50% computer coding and 50% manual coding. When agreement alone was used as the filtering strategy, the majority were coded by the computer (n=1,928, 64%), leaving 36% for manual review. The overall combined (human plus computer) sensitivity was 0.90 and the positive predictive value (PPV) was >0.90 for 11 of 18 two-digit event categories. Implementing the second strategy improved results, with an overall sensitivity of 0.95 and PPV >0.90 for 17 of 18 categories. A combined Naïve-Fuzzy Bayesian approach can classify some narratives with high accuracy and identify others most beneficial for manual review, reducing the burden on human coders.
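
    The routing logic is easy to prototype: classify each narrative with two models and send disagreements to manual review. The sketch below substitutes multinomial naive Bayes plus logistic regression for the Fuzzy/Naïve pair, with placeholder narratives; it shows the filtering idea, not the paper's models.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB

        train_text = ["fell from ladder", "cut hand on blade",
                      "slipped on wet floor", "burned arm on stove"]
        train_labels = ["fall", "cut", "fall", "burn"]
        new_text = ["fell off roof", "knife cut finger", "hot pipe burn"]

        vec = CountVectorizer().fit(train_text)
        nb = MultinomialNB().fit(vec.transform(train_text), train_labels)
        lr = LogisticRegression(max_iter=1000).fit(vec.transform(train_text), train_labels)

        for doc in new_text:                       # agree -> auto-code, else review
            x = vec.transform([doc])
            a, b = nb.predict(x)[0], lr.predict(x)[0]
            print(doc, "->", a if a == b else "MANUAL REVIEW")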

  14. Computational statistics using the Bayesian Inference Engine

    NASA Astrophysics Data System (ADS)

    Weinberg, Martin D.

    2013-09-01

    This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.

  15. On-line Bayesian model updating for structural health monitoring

    NASA Astrophysics Data System (ADS)

    Rocchetta, Roberto; Broggi, Matteo; Huchet, Quentin; Patelli, Edoardo

    2018-03-01

    Fatigue-induced cracking is a dangerous failure mechanism which affects mechanical components subject to alternating load cycles. System health monitoring should be adopted to identify cracks which can jeopardise the structure. Real-time damage detection may fail to identify cracks due to different sources of uncertainty which have been poorly assessed or even fully neglected. In this paper, a novel efficient and robust procedure is used for the detection of crack locations and lengths in mechanical components. A Bayesian model updating framework is employed, which allows accounting for relevant sources of uncertainty. The idea underpinning the approach is to identify the most probable crack consistent with the experimental measurements. To tackle the computational cost of the Bayesian approach, an emulator is adopted to replace the computationally costly finite element model. To improve the overall robustness of the procedure, different numerical likelihoods, measurement noises and imprecisions in the values of model parameters are analysed and their effects quantified. The accuracy of the stochastic updating and the efficiency of the numerical procedure are discussed. An experimental aluminium frame and a numerical model of a typical car suspension arm are used to demonstrate the applicability of the approach.

  16. On the limitations of standard statistical modeling in biological systems: a full Bayesian approach for biology.

    PubMed

    Gomez-Ramirez, Jaime; Sanz, Ricardo

    2013-09-01

    One of the most important scientific challenges today is the quantitative and predictive understanding of biological function. Classical mathematical and computational approaches have been enormously successful in modeling inert matter, but they may be inadequate to address inherent features of biological systems. We address the conceptual and methodological obstacles that lie in the inverse problem in biological systems modeling. We introduce a full Bayesian approach (FBA), a theoretical framework to study biological function, in which probability distributions are conditional on biophysical information that physically resides in the biological system that is studied by the scientist. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Bayesian reconstruction and use of anatomical a priori information for emission tomography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowsher, J.E.; Johnson, V.E.; Turkington, T.G.

    1996-10-01

    A Bayesian method is presented for simultaneously segmenting and reconstructing emission computed tomography (ECT) images and for incorporating high-resolution, anatomical information into those reconstructions. The anatomical information is often available from other imaging modalities such as computed tomography (CT) or magnetic resonance imaging (MRI). The Bayesian procedure models the ECT radiopharmaceutical distribution as consisting of regions, such that radiopharmaceutical activity is similar throughout each region. It estimates the number of regions, the mean activity of each region, and the region classification and mean activity of each voxel. Anatomical information is incorporated by assigning higher prior probabilities to ECT segmentations in which each ECT region stays within a single anatomical region. This approach is effective because anatomical tissue type often strongly influences radiopharmaceutical uptake. The Bayesian procedure is evaluated using physically acquired single-photon emission computed tomography (SPECT) projection data and MRI for the three-dimensional (3-D) Hoffman brain phantom. A clinically realistic count level is used. A cold lesion within the brain phantom is created during the SPECT scan but not during the MRI to demonstrate that the estimation procedure can detect ECT structure that is not present anatomically.

  18. Advances in computer simulation of genome evolution: toward more realistic evolutionary genomics analysis by approximate bayesian computation.

    PubMed

    Arenas, Miguel

    2015-04-01

    NGS technologies enable the fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not so straightforward, due to complex evolutionary processes acting on this material such as inversions, translocations, and other genome rearrangements that, in addition to their implicit complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events are emerging. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, which may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.

  19. Development of dynamic Bayesian models for web application test management

    NASA Astrophysics Data System (ADS)

    Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.

    2018-03-01

    The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.

  20. Unification of field theory and maximum entropy methods for learning probability densities

    NASA Astrophysics Data System (ADS)

    Kinney, Justin B.

    2015-09-01

    The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.

  1. Unification of field theory and maximum entropy methods for learning probability densities.

    PubMed

    Kinney, Justin B

    2015-09-01

    The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.
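
    A small numerical sketch of the maximum entropy side: constrain the first two moments on a grid and solve for the Lagrange multipliers. With only these two constraints the solution is Gaussian, which provides a built-in correctness check; the data are synthetic, and this is not the paper's software.

        import numpy as np
        from scipy.optimize import fsolve

        rng = np.random.default_rng(8)
        data = rng.normal(1.0, 2.0, 500)
        targets = np.array([data.mean(), (data**2).mean()])   # moment constraints

        x = np.linspace(-10, 12, 2000)
        dx = x[1] - x[0]
        feats = np.vstack([x, x**2])                 # constraint features on a grid

        def moment_gap(lams):
            p = np.exp(lams @ feats)
            p /= p.sum() * dx                        # normalize the density
            return feats @ p * dx - targets          # moment mismatch

        lams = fsolve(moment_gap, x0=[0.0, -0.1])
        print("multipliers:", lams)                  # approx [mu/s^2, -1/(2 s^2)]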

  2. Bayesian hypothesis testing for human threat conditioning research: an introduction and the condir R package

    PubMed Central

    Krypotos, Angelos-Miltiadis; Klugkist, Irene; Engelhard, Iris M.

    2017-01-01

    Threat conditioning procedures have allowed the experimental investigation of the pathogenesis of Post-Traumatic Stress Disorder. The findings of these procedures have also provided stable foundations for the development of relevant intervention programs (e.g. exposure therapy). Statistical inference in threat conditioning research is commonly based on p-values and Null Hypothesis Significance Testing (NHST). Nowadays, however, there is growing concern about this statistical approach, as many scientists point to the various limitations of p-values and NHST. As an alternative, the use of Bayes factors and Bayesian hypothesis testing has been suggested. In this article, we apply this statistical approach to threat conditioning data. In order to enable the easy computation of Bayes factors for threat conditioning data, we present a new R package named condir, which can be used either via the R console or via a Shiny application. This article provides both a non-technical introduction to Bayesian analysis for researchers using the threat conditioning paradigm and the necessary tools for computing Bayes factors easily. PMID:29038683
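
    To show the shape of such a computation, here is a minimal Python sketch of a default Bayes factor for a one-sample effect, assuming a Cauchy(0, 0.707) prior on the standardized effect size; it is illustrative only and is not the condir implementation, which is written in R:

        import numpy as np
        from scipy import stats, integrate

        def bf10_one_sample(data, r=0.707):
            # BF10 = P(t | H1) / P(t | H0), averaging the noncentral-t density
            # of the observed t statistic over the Cauchy prior on delta.
            n = len(data)
            t = stats.ttest_1samp(data, 0.0).statistic
            df = n - 1
            integrand = lambda d: stats.nct.pdf(t, df, d * np.sqrt(n)) \
                                  * stats.cauchy.pdf(d, 0.0, r)
            m1, _ = integrate.quad(integrand, -np.inf, np.inf)
            m0 = stats.t.pdf(t, df)    # likelihood of t under H0: delta = 0
            return m1 / m0

        rng = np.random.default_rng(1)
        print(bf10_one_sample(rng.normal(0.4, 1.0, size=30)))  # BF10 > 1 favours H1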

  3. Specifying and Refining a Measurement Model for a Computer-Based Interactive Assessment

    ERIC Educational Resources Information Center

    Levy, Roy; Mislevy, Robert J.

    2004-01-01

    The challenges of modeling students' performance in computer-based interactive assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance. This article describes a Bayesian approach to modeling and estimating cognitive models…

  4. A Monte Carlo–Based Bayesian Approach for Measuring Agreement in a Qualitative Scale

    PubMed Central

    Pérez Sánchez, Carlos Javier

    2014-01-01

    Agreement analysis has been an active research area whose techniques have been widely applied in psychology and other fields. However, statistical agreement among raters has been mainly considered from a classical statistics point of view. Bayesian methodology is a viable alternative that allows the inclusion of subjective initial information coming from expert opinions, personal judgments, or historical data. A Bayesian approach is proposed by providing a unified Monte Carlo–based framework to estimate all types of measures of agreement in a qualitative scale of response. The approach is conceptually simple and it has a low computational cost. Both informative and non-informative scenarios are considered. In case no initial information is available, the results are in line with the classical methodology, but provide more information on the measures of agreement. For the informative case, some guidelines are presented to elicit the prior distribution. The approach has been applied to two applications related to schizophrenia diagnosis and sensory analysis. PMID:29881002

  5. A Fast Surrogate-facilitated Data-driven Bayesian Approach to Uncertainty Quantification of a Regional Groundwater Flow Model with Structural Error

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Ye, M.; Liang, F.

    2016-12-01

    Due to simplification and/or misrepresentation of the real aquifer system, numerical groundwater flow and solute transport models are usually subject to model structural error. During model calibration, the hydrogeological parameters may be overly adjusted to compensate for unknown structural error. This may result in biased predictions when models are used to forecast aquifer response to new forcing. In this study, we extend a fully Bayesian method [Xu and Valocchi, 2015] to calibrate a real-world, regional groundwater flow model. The method uses a data-driven error model to describe model structural error and jointly infers model parameters and structural error. In this study, Bayesian inference is facilitated using high performance computing and fast surrogate models. The surrogate models are constructed using machine learning techniques to emulate the response simulated by the computationally expensive groundwater model. We demonstrate in the real-world case study that explicitly accounting for model structural error yields parameter posterior distributions that are substantially different from those derived by the classical Bayesian calibration that does not account for model structural error. In addition, the Bayesian method with an error model gives significantly more accurate predictions along with reasonable credible intervals.
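
    The overall pattern, training a cheap emulator on a handful of expensive model runs and then letting the MCMC sampler evaluate the likelihood on the emulator, can be sketched in a few lines. The following is a toy Python stand-in using a scikit-learn Gaussian process; the paper's surrogates, error model and groundwater simulator are far richer:

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor

        def expensive_model(theta):          # stand-in for a costly simulator
            return np.sin(3 * theta) + 0.5 * theta

        rng = np.random.default_rng(2)
        design = np.linspace(0, 2, 15)[:, None]        # a few training runs
        gp = GaussianProcessRegressor().fit(design, expensive_model(design.ravel()))

        obs, sigma = expensive_model(1.3) + 0.05, 0.1  # one noisy observation

        def log_post(theta):
            if not 0.0 <= theta <= 2.0:                # uniform prior bounds
                return -np.inf
            pred = gp.predict(np.array([[theta]]))[0]  # surrogate, not the model
            return -0.5 * ((obs - pred) / sigma) ** 2

        samples, theta = [], 1.0
        for _ in range(5000):                          # random-walk Metropolis
            prop = theta + 0.1 * rng.standard_normal()
            if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
                theta = prop
            samples.append(theta)
        print(np.mean(samples[1000:]), np.std(samples[1000:]))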

  6. Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items

    ERIC Educational Resources Information Center

    Penfield, Randall D.

    2006-01-01

    This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…

  7. Bayesian just-so stories in psychology and neuroscience.

    PubMed

    Bowers, Jeffrey S; Davis, Colin J

    2012-05-01

    According to Bayesian theories in psychology and neuroscience, minds and brains are (near) optimal in solving a wide range of tasks. We challenge this view and argue that more traditional, non-Bayesian approaches are more promising. We make 3 main arguments. First, we show that the empirical evidence for Bayesian theories in psychology is weak. This weakness relates to the many arbitrary ways that priors, likelihoods, and utility functions can be altered in order to account for the data that are obtained, making the models unfalsifiable. It further relates to the fact that Bayesian theories are rarely better at predicting data compared with alternative (and simpler) non-Bayesian theories. Second, we show that the empirical evidence for Bayesian theories in neuroscience is weaker still. There are impressive mathematical analyses showing how populations of neurons could compute in a Bayesian manner but little or no evidence that they do. Third, we challenge the general scientific approach that characterizes Bayesian theorizing in cognitive science. A common premise is that theories in psychology should largely be constrained by a rational analysis of what the mind ought to do. We question this claim and argue that many of the important constraints come from biological, evolutionary, and processing (algorithmic) considerations that have no adaptive relevance to the problem per se. In our view, these factors have contributed to the development of many Bayesian "just so" stories in psychology and neuroscience; that is, mathematical analyses of cognition that can be used to explain almost any behavior as optimal. © 2012 APA, all rights reserved.

  8. Optimal speech motor control and token-to-token variability: a Bayesian modeling approach.

    PubMed

    Patri, Jean-François; Diard, Julien; Perrier, Pascal

    2015-12-01

    The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which makes it possible to produce similar acoustical properties with different sets of control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the observed experimental intra-speaker token-to-token variability. The present paper proposes an alternative approach by formulating feedforward optimal control in a probabilistic Bayesian modeling framework. This is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first. From these elements, the Bayesian model is constructed progressively. The performance of the Bayesian model is evaluated in computer simulations and compared to that of the optimal control model. This approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way.

  9. Bayesian Approach to the Joint Inversion of Gravity and Magnetic Data, with Application to the Ismenius Area of Mars

    NASA Technical Reports Server (NTRS)

    Jewell, Jeffrey B.; Raymond, C.; Smrekar, S.; Millbury, C.

    2004-01-01

    This viewgraph presentation reviews a Bayesian approach to the inversion of gravity and magnetic data with specific application to the Ismenius Area of Mars. Many inverse problems encountered in geophysics and planetary science are well known to be non-unique (e.g. inversion of gravity data for the density structure of a body). In hopes of reducing the non-uniqueness of solutions, there has been interest in the joint analysis of data. An example is the joint inversion of gravity and magnetic data, with the assumption that the same physical anomalies generate both the observed magnetic and gravitational anomalies. In this talk, we formulate the joint analysis of different types of data in a Bayesian framework and apply the formalism to the inference of the density and remanent magnetization structure for a local region in the Ismenius area of Mars. The Bayesian approach allows prior information or constraints in the solutions to be incorporated in the inversion, with the "best" solutions those whose forward predictions most closely match the data while remaining consistent with assumed constraints. The application of this framework to the inversion of gravity and magnetic data on Mars reveals two typical challenges - the forward predictions of the data have a linear dependence on some of the quantities of interest, and non-linear dependence on others (termed the "linear" and "non-linear" variables, respectively). For observations with Gaussian noise, a Bayesian approach to inversion for "linear" variables reduces to a linear filtering problem, with an explicitly computable "error" matrix. However, for models whose forward predictions have non-linear dependencies, inference is no longer given by such a simple linear problem, and moreover, the uncertainty in the solution is no longer completely specified by a computable "error matrix". It is therefore important to develop methods for sampling from the full Bayesian posterior to provide a complete and statistically consistent picture of model uncertainty, and what has been learned from observations. We will discuss advanced numerical techniques, including Monte Carlo Markov Chain (MCMC) sampling.
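
    For the "linear" variables, the Bayesian update described above is an explicit linear filtering computation. A generic numpy sketch with toy dimensions (not the Mars data): with forward model y = A theta + noise, Gaussian prior N(0, P) and noise covariance S, the posterior is Gaussian with a computable "error matrix":

        import numpy as np

        rng = np.random.default_rng(3)
        A = rng.normal(size=(20, 5))            # linearised forward operator
        P = np.eye(5)                           # prior covariance of the parameters
        S = 0.1 * np.eye(20)                    # observation noise covariance
        theta_true = rng.normal(size=5)
        y = A @ theta_true + rng.multivariate_normal(np.zeros(20), S)

        # Closed-form Gaussian posterior: the covariance is the "error matrix".
        post_cov = np.linalg.inv(A.T @ np.linalg.inv(S) @ A + np.linalg.inv(P))
        post_mean = post_cov @ A.T @ np.linalg.inv(S) @ y
        print(post_mean, np.sqrt(np.diag(post_cov)))  # estimates with 1-sigma errors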

  10. Improved Accuracy Using Recursive Bayesian Estimation Based Language Model Fusion in ERP-Based BCI Typing Systems

    PubMed Central

    Orhan, U.; Erdogmus, D.; Roark, B.; Oken, B.; Purwar, S.; Hild, K. E.; Fowler, A.; Fried-Oken, M.

    2013-01-01

    RSVP Keyboard™ is an electroencephalography (EEG) based brain computer interface (BCI) typing system, designed as an assistive technology for the communication needs of people with locked-in syndrome (LIS). It relies on rapid serial visual presentation (RSVP) and does not require precise eye gaze control. Existing BCI typing systems that use event-related potentials (ERPs) in EEG suffer from low accuracy due to the low signal-to-noise ratio. RSVP Keyboard™ therefore incorporates a language model for context-based decision making, to improve the accuracy of letter decisions. To further improve the contribution of the language model, we propose recursive Bayesian estimation, which relies on non-committing string decisions, and conduct an offline analysis comparing it with the existing naïve Bayesian fusion approach. The results indicate the superiority of the recursive Bayesian fusion, and in the next generation of RSVP Keyboard™ we plan to incorporate this new approach. PMID:23366432
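
    The idea of non-committing recursive fusion reduces to a small calculation: start from the language-model prior over candidate letters and fold in each ERP likelihood as it arrives, deferring the letter decision until the posterior is confident. A toy Python illustration (probabilities invented; this is not the RSVP Keyboard™ code):

        import numpy as np

        letters = np.array(list("abc"))
        lm_prior = np.array([0.5, 0.3, 0.2])        # P(letter | typed context)
        posterior = lm_prior.copy()
        erp_likelihoods = [np.array([0.2, 0.6, 0.2]),   # P(EEG epoch | letter),
                           np.array([0.3, 0.5, 0.2])]   # one per presentation

        for lik in erp_likelihoods:                 # recursive Bayesian update
            posterior = posterior * lik
            posterior /= posterior.sum()
            print(dict(zip(letters, np.round(posterior, 3))))

        if posterior.max() > 0.9:                   # commit only when confident,
            print("type:", letters[posterior.argmax()])  # else present again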

  11. Application of Bayesian networks to real-time flood risk estimation

    NASA Astrophysics Data System (ADS)

    Garrote, L.; Molina, M.; Blasco, G.

    2003-04-01

    This paper presents the application of a computational paradigm taken from the field of artificial intelligence - the Bayesian network - to model the behaviour of hydrologic basins during floods. The final goal of this research is to develop representation techniques for hydrologic simulation models in order to define, develop and validate a mechanism, supported by a software environment, oriented to build decision models for the prediction and management of river floods in real time. The emphasis is placed on providing decision makers with tools to incorporate their knowledge of basin behaviour, usually formulated in terms of rainfall-runoff models, in the process of real-time decision making during floods. A rainfall-runoff model is only a step in the process of decision making. If a reliable rainfall forecast is available and the rainfall-runoff model is well calibrated, decisions can be based mainly on model results. However, in most practical situations, uncertainties in rainfall forecasts or model performance have to be incorporated in the decision process. The computational paradigm adopted for the simulation of hydrologic processes is the Bayesian network. A Bayesian network is a directed acyclic graph that represents causal influences between linked variables. Under this representation, uncertain qualitative variables are related through causal relations quantified with conditional probabilities. The solution algorithm allows the computation of the expected probability distribution of unknown variables conditioned on the observations. An approach to representing hydrologic processes by Bayesian networks with temporal and spatial extensions is presented in this paper, together with a methodology for the development of Bayesian models using results produced by deterministic hydrologic simulation models.

  12. Initialization and Restart in Stochastic Local Search: Computing a Most Probable Explanation in Bayesian Networks

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Wilkins, David C.; Roth, Dan

    2010-01-01

    For hard computational problems, stochastic local search has proven to be a competitive approach to finding optimal or approximately optimal problem solutions. Two key research questions for stochastic local search algorithms are: Which algorithms are effective for initialization? When should the search process be restarted? In the present work we investigate these research questions in the context of approximate computation of most probable explanations (MPEs) in Bayesian networks (BNs). We introduce a novel approach, based on the Viterbi algorithm, to explanation initialization in BNs. While the Viterbi algorithm works on sequences and trees, our approach works on BNs with arbitrary topologies. We also give a novel formalization of stochastic local search, with focus on initialization and restart, using probability theory and mixture models. Experimentally, we apply our methods to the problem of MPE computation, using a stochastic local search algorithm known as Stochastic Greedy Search. By carefully optimizing both initialization and restart, we reduce the MPE search time for application BNs by several orders of magnitude compared to using uniform at random initialization without restart. On several BNs from applications, the performance of Stochastic Greedy Search is competitive with clique tree clustering, a state-of-the-art exact algorithm used for MPE computation in BNs.

  13. Parameter Estimation for a Turbulent Buoyant Jet Using Approximate Bayesian Computation

    NASA Astrophysics Data System (ADS)

    Christopher, Jason D.; Wimer, Nicholas T.; Hayden, Torrey R. S.; Lapointe, Caelan; Grooms, Ian; Rieker, Gregory B.; Hamlington, Peter E.

    2016-11-01

    Approximate Bayesian Computation (ABC) is a powerful tool that allows sparse experimental or other "truth" data to be used for the prediction of unknown model parameters in numerical simulations of real-world engineering systems. In this presentation, we introduce the ABC approach and then use ABC to predict unknown inflow conditions in simulations of a two-dimensional (2D) turbulent, high-temperature buoyant jet. For this test case, truth data are obtained from a simulation with known boundary conditions and problem parameters. Using spatially-sparse temperature statistics from the 2D buoyant jet truth simulation, we show that the ABC method provides accurate predictions of the true jet inflow temperature. The success of the ABC approach in the present test suggests that ABC is a useful and versatile tool for engineering fluid dynamics research.
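
    The ABC rejection step itself is simple enough to show in full. The sketch below recovers a single "inflow" parameter of a toy forward model from noisy truth data; it is an illustrative Python stand-in, not the buoyant-jet simulations:

        import numpy as np

        rng = np.random.default_rng(4)

        def simulate(inflow_temp):                  # cheap stand-in forward model
            return inflow_temp * np.exp(-np.linspace(0, 1, 8)) \
                   + 0.05 * rng.standard_normal(8)

        truth = simulate(2.0)                       # "truth" data, known parameter

        accepted, tol = [], 0.3
        for _ in range(20000):
            theta = rng.uniform(0.0, 5.0)           # draw from the (uniform) prior
            dist = np.linalg.norm(simulate(theta) - truth)  # summary discrepancy
            if dist < tol:                          # keep near-matching draws
                accepted.append(theta)

        print(len(accepted), np.mean(accepted))     # approximate posterior sample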

  14. Model selection and parameter estimation in structural dynamics using approximate Bayesian computation

    NASA Astrophysics Data System (ADS)

    Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith

    2018-01-01

    This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.

  15. Practical Bayesian tomography

    NASA Astrophysics Data System (ADS)

    Granade, Christopher; Combes, Joshua; Cory, D. G.

    2016-03-01

    In recent years, Bayesian methods have been proposed as a solution to a wide range of issues in quantum state and process tomography. State-of-the-art Bayesian tomography solutions suffer from three problems: numerical intractability, a lack of informative prior distributions, and an inability to track time-dependent processes. Here, we address all three problems. First, we use modern statistical methods, as pioneered by Huszár and Houlsby (2012 Phys. Rev. A 85 052120) and by Ferrie (2014 New J. Phys. 16 093035), to make Bayesian tomography numerically tractable. Our approach allows for practical computation of Bayesian point and region estimators for quantum states and channels. Second, we propose the first priors on quantum states and channels that allow for including useful experimental insight. Finally, we develop a method that allows tracking of time-dependent states and estimates the drift and diffusion processes affecting a state. We provide source code and animated visual examples for our methods.

  16. Bayes in biological anthropology.

    PubMed

    Konigsberg, Lyle W; Frankenberg, Susan R

    2013-12-01

    In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as a primer on Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end, it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.

  17. Detangling complex relationships in forensic data: principles and use of causal networks and their application to clinical forensic science.

    PubMed

    Lefèvre, Thomas; Lepresle, Aude; Chariot, Patrick

    2015-09-01

    The search for complex, nonlinear relationships and causality in data is hindered by the limited availability of suitable techniques in many domains, including forensic science. Linear multivariable techniques are useful but present some shortcomings. In the past decade, Bayesian approaches have been introduced in forensic science. To date, authors have mainly focused on providing an alternative to classical techniques for quantifying effects and dealing with uncertainty. Causal networks, including Bayesian networks, can help detangle complex relationships in data. A Bayesian network estimates the joint probability distribution of the data and graphically displays the dependencies between variables and the circulation of information between these variables. In this study, we illustrate the value of Bayesian networks for dealing with complex data through an application in clinical forensic science. Evaluating the functional impairment of assault survivors is a complex task for which few determinants are known. As routinely estimated in France, the duration of this impairment can be quantified by days of 'Total Incapacity to Work' ('Incapacité totale de travail', ITT). In this study, we used a Bayesian network approach to identify the injury type, victim category and time to evaluation as the main determinants of the 'Total Incapacity to Work' (TIW). We computed the conditional probabilities associated with the TIW node and its parents. We compared this approach with a multivariable analysis, and the results of the two techniques were convergent. Thus, Bayesian networks should be considered a reliable means of detangling complex relationships in data.

  18. Bayes and the Law

    PubMed Central

    Fenton, Norman; Neil, Martin; Berger, Daniel

    2016-01-01

    Although the last forty years has seen considerable growth in the use of statistics in legal proceedings, it is primarily classical statistical methods rather than Bayesian methods that have been used. Yet the Bayesian approach avoids many of the problems of classical statistics and is also well suited to a broader range of problems. This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes’ theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. We argue that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law. PMID:27398389

  19. Bayes and the Law.

    PubMed

    Fenton, Norman; Neil, Martin; Berger, Daniel

    2016-06-01

    Although the last forty years has seen considerable growth in the use of statistics in legal proceedings, it is primarily classical statistical methods rather than Bayesian methods that have been used. Yet the Bayesian approach avoids many of the problems of classical statistics and is also well suited to a broader range of problems. This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes' theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. We argue that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law.

  20. Bayesian Model Selection under Time Constraints

    NASA Astrophysics Data System (ADS)

    Hoege, M.; Nowak, W.; Illman, W. A.

    2017-12-01

    Bayesian model selection (BMS) provides a consistent framework for rating and comparing models in multi-model inference. In cases where models of vastly different complexity compete with each other, we also face vastly different computational runtimes of such models. For instance, time series of a quantity of interest can be simulated by an autoregressive process model that takes less than a second for one run, or by a partial differential equations-based model with runtimes up to several hours or even days. Classical BMS is based on a quantity called Bayesian model evidence (BME). It determines the model weights in the selection process and resembles a trade-off between the bias of a model and its complexity. In practice, however, the runtime of models is another relevant weighting factor for model selection. Hence, we believe that it should be included, leading to an overall trade-off problem between bias, variance and computing effort. We approach this triple trade-off from the viewpoint of our ability to generate realizations of the models under a given computational budget. One way to obtain BME values is through sampling-based integration techniques. We start from the fact that, under time constraints, more expensive models can be sampled far less often than faster models (in inverse proportion to their runtime). The evidence computed in favor of a more expensive model is therefore statistically less significant than the evidence computed in favor of a faster model, since sampling-based strategies are always subject to statistical sampling error. We present a straightforward way to include this imbalance in the model weights that are the basis for model selection. Our approach follows directly from the idea of insufficient significance. It is based on a computationally cheap bootstrapping error estimate of the model evidence and is easy to implement. The approach is illustrated in a small synthetic modeling study.

  1. Bayesian experimental design for models with intractable likelihoods.

    PubMed

    Drovandi, Christopher C; Pettitt, Anthony N

    2013-12-01

    In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.

  2. Fast Bayesian experimental design: Laplace-based importance sampling for the expected information gain

    NASA Astrophysics Data System (ADS)

    Beck, Joakim; Dia, Ben Mansour; Espath, Luis F. R.; Long, Quan; Tempone, Raúl

    2018-06-01

    In calculating expected information gain in optimal Bayesian experimental design, the computation of the inner loop in the classical double-loop Monte Carlo requires a large number of samples and suffers from underflow if the number of samples is small. These drawbacks can be avoided by using an importance sampling approach. We present a computationally efficient method for optimal Bayesian experimental design that introduces importance sampling based on the Laplace method to the inner loop. We derive the optimal values for the method parameters in which the average computational cost is minimized according to the desired error tolerance. We use three numerical examples to demonstrate the computational efficiency of our method compared with the classical double-loop Monte Carlo, and a more recent single-loop Monte Carlo method that uses the Laplace method as an approximation of the return value of the inner loop. The first example is a scalar problem that is linear in the uncertain parameter. The second example is a nonlinear scalar problem. The third example deals with the optimal sensor placement for an electrical impedance tomography experiment to recover the fiber orientation in laminate composites.
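
    For reference, the classical double-loop (nested) Monte Carlo estimator that the paper accelerates looks as follows in a toy linear-Gaussian design problem; the design variable xi and all constants are invented for illustration:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(5)

        def likelihood(y, theta, xi):
            # p(y | theta, design xi) for a toy linear-Gaussian experiment
            return stats.norm.pdf(y, loc=xi * theta, scale=0.2)

        def eig(xi, n_outer=500, n_inner=500):
            thetas = rng.standard_normal(n_outer)       # draws from the prior
            ys = xi * thetas + 0.2 * rng.standard_normal(n_outer)
            total = 0.0
            for theta, y in zip(thetas, ys):
                inner = rng.standard_normal(n_inner)    # fresh prior draws
                evidence = likelihood(y, inner, xi).mean()  # inner loop: p(y)
                total += np.log(likelihood(y, theta, xi)) - np.log(evidence)
            return total / n_outer                  # E[log p(y|theta) - log p(y)]

        for xi in [0.1, 0.5, 1.0]:                  # compare candidate designs
            print(xi, eig(xi))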

  3. Valence-Dependent Belief Updating: Computational Validation

    PubMed Central

    Kuzmanovic, Bojana; Rigoux, Lionel

    2017-01-01

    People tend to update beliefs about their future outcomes in a valence-dependent way: they are likely to incorporate good news and to neglect bad news. However, belief formation is a complex process which depends not only on motivational factors such as the desire for favorable conclusions, but also on multiple cognitive variables such as prior beliefs, knowledge about personal vulnerabilities and resources, and the size of the probabilities and estimation errors. Thus, we applied computational modeling in order to test for valence-induced biases in updating while formally controlling for relevant cognitive factors. We compared biased and unbiased Bayesian models of belief updating, and specified alternative models based on reinforcement learning. The experiment consisted of 80 trials with 80 different adverse future life events. In each trial, participants estimated the base rate of one of these events and estimated their own risk of experiencing the event before and after being confronted with the actual base rate. Belief updates corresponded to the difference between the two self-risk estimates. Valence-dependent updating was assessed by comparing trials with good news (better-than-expected base rates) with trials with bad news (worse-than-expected base rates). After receiving bad relative to good news, participants' updates were smaller and deviated more strongly from rational Bayesian predictions, indicating a valence-induced bias. Model comparison revealed that the biased (i.e., optimistic) Bayesian model of belief updating better accounted for data than the unbiased (i.e., rational) Bayesian model, confirming that the valence of the new information influenced the amount of updating. Moreover, alternative computational modeling based on reinforcement learning demonstrated higher learning rates for good than for bad news, as well as a moderating role of personal knowledge. Finally, in this specific experimental context, the approach based on reinforcement learning was superior to the Bayesian approach. The computational validation of valence-dependent belief updating represents a novel support for a genuine optimism bias in human belief formation. Moreover, the precise control of relevant cognitive variables justifies the conclusion that the motivation to adopt the most favorable self-referential conclusions biases human judgments. PMID:28706499

  4. Valence-Dependent Belief Updating: Computational Validation.

    PubMed

    Kuzmanovic, Bojana; Rigoux, Lionel

    2017-01-01

    People tend to update beliefs about their future outcomes in a valence-dependent way: they are likely to incorporate good news and to neglect bad news. However, belief formation is a complex process which depends not only on motivational factors such as the desire for favorable conclusions, but also on multiple cognitive variables such as prior beliefs, knowledge about personal vulnerabilities and resources, and the size of the probabilities and estimation errors. Thus, we applied computational modeling in order to test for valence-induced biases in updating while formally controlling for relevant cognitive factors. We compared biased and unbiased Bayesian models of belief updating, and specified alternative models based on reinforcement learning. The experiment consisted of 80 trials with 80 different adverse future life events. In each trial, participants estimated the base rate of one of these events and estimated their own risk of experiencing the event before and after being confronted with the actual base rate. Belief updates corresponded to the difference between the two self-risk estimates. Valence-dependent updating was assessed by comparing trials with good news (better-than-expected base rates) with trials with bad news (worse-than-expected base rates). After receiving bad relative to good news, participants' updates were smaller and deviated more strongly from rational Bayesian predictions, indicating a valence-induced bias. Model comparison revealed that the biased (i.e., optimistic) Bayesian model of belief updating better accounted for data than the unbiased (i.e., rational) Bayesian model, confirming that the valence of the new information influenced the amount of updating. Moreover, alternative computational modeling based on reinforcement learning demonstrated higher learning rates for good than for bad news, as well as a moderating role of personal knowledge. Finally, in this specific experimental context, the approach based on reinforcement learning was superior to the Bayesian approach. The computational validation of valence-dependent belief updating represents a novel support for a genuine optimism bias in human belief formation. Moreover, the precise control of relevant cognitive variables justifies the conclusion that the motivation to adopt the most favorable self-referential conclusions biases human judgments.

  5. In-situ resource utilization for the human exploration of Mars : a Bayesian approach to valuation of precursor missions

    NASA Technical Reports Server (NTRS)

    Smith, Jeffrey H.

    2006-01-01

    The need for sufficient quantities of oxygen, water, and fuel resources to support a crew on the surface of Mars presents a critical logistical issue of whether to transport such resources from Earth or manufacture them on Mars. An approach based on the classical Wildcat Drilling Problem of Bayesian decision theory was applied to the problem of finding water in order to compute the expected value of precursor mission sample information. An implicit (required) probability of finding water on Mars was derived from the value of sample information using the expected mass savings of alternative precursor missions.
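
    The underlying preposterior analysis can be made concrete with a toy version of the decision problem; all probabilities and costs below are illustrative placeholders, not mission figures:

        import numpy as np

        # Actions: transport water from Earth, or rely on in-situ resource
        # utilization (ISRU), which is cheap if water exists and dire if not.
        p_water = 0.4                 # prior probability of accessible water
        cost = {("earth", True): 100, ("earth", False): 100,
                ("isru", True): 20,  ("isru", False): 250}

        def best_expected_cost(p):
            ev = {a: p * cost[(a, True)] + (1 - p) * cost[(a, False)]
                  for a in ("earth", "isru")}
            return min(ev.values())

        # A precursor mission reports "wet"/"dry" with imperfect accuracy.
        sens, spec = 0.9, 0.8
        p_wet = sens * p_water + (1 - spec) * (1 - p_water)   # P(report = wet)
        post_wet = sens * p_water / p_wet                     # Bayes updates
        post_dry = (1 - sens) * p_water / (1 - p_wet)

        evsi = best_expected_cost(p_water) - (
            p_wet * best_expected_cost(post_wet)
            + (1 - p_wet) * best_expected_cost(post_dry))
        print(f"expected cost saving from the precursor: {evsi:.1f}")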

  6. Comparison of Co-Temporal Modeling Algorithms on Sparse Experimental Time Series Data Sets.

    PubMed

    Allen, Edward E; Norris, James L; John, David J; Thomas, Stan J; Turkett, William H; Fetrow, Jacquelyn S

    2010-01-01

    Multiple approaches for reverse-engineering biological networks from time-series data have been proposed in the computational biology literature. These approaches can be classified by their underlying mathematical algorithms, such as Bayesian or algebraic techniques, as well as by their time paradigm, which includes next-state and co-temporal modeling. The types of biological relationships, such as parent-child or siblings, discovered by these algorithms are quite varied. It is important to understand the strengths and weaknesses of the various algorithms and time paradigms on actual experimental data. We assess how well the co-temporal implementations of three algorithms, continuous Bayesian, discrete Bayesian, and computational algebraic, can 1) identify two types of entity relationships, parent and sibling, between biological entities, 2) deal with experimental sparse time course data, and 3) handle experimental noise seen in replicate data sets. These algorithms are evaluated, using the shuffle index metric, for how well the resulting models match literature models in terms of siblings and parent relationships. Results indicate that all three co-temporal algorithms perform well, at a statistically significant level, at finding sibling relationships, but perform relatively poorly in finding parent relationships.

  7. Bayesian Estimation of the Spatially Varying Completeness Magnitude of Earthquake Catalogs

    NASA Astrophysics Data System (ADS)

    Mignan, A.; Werner, M.; Wiemer, S.; Chen, C.; Wu, Y.

    2010-12-01

    Assessing the completeness magnitude Mc of earthquake catalogs is an essential prerequisite for any seismicity analysis. We employ a simple model to compute Mc in space, based on the proximity to seismic stations in a network. We show that a relationship of the form Mc_pred(d) = a d^b + c, with d the distance to the 5th nearest seismic station, fits the observations well. We then propose a new Mc mapping approach, the Bayesian Magnitude of Completeness (BMC) method, based on a two-step procedure: (1) a spatial resolution optimization to minimize spatial heterogeneities and uncertainties in Mc estimates and (2) a Bayesian approach that merges prior information about Mc, based on the proximity to seismic stations, with locally observed values weighted by their respective uncertainties. This new methodology eliminates most weaknesses associated with current Mc mapping procedures: the radius that defines which earthquakes to include in the local magnitude distribution is chosen according to an objective criterion, and there are no gaps in the spatial estimation of Mc. The method solely requires the coordinates of seismic stations. Here, we investigate the Taiwan Central Weather Bureau (CWB) earthquake catalog by computing an Mc map for the period 1994-2010.
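
    Both ingredients of the BMC procedure, the fitted distance relationship and the precision-weighted merge of the prior with the locally observed Mc, are compact enough to sketch. The Python below uses invented numbers and is not the authors' code:

        import numpy as np
        from scipy.optimize import curve_fit

        d_obs = np.array([5, 10, 20, 40, 80.0])      # km to 5th-nearest station
        mc_obs = np.array([1.1, 1.4, 1.9, 2.5, 3.2]) # observed completeness magnitudes

        f = lambda d, a, b, c: a * d**b + c          # Mc_pred(d) = a*d^b + c
        (a, b, c), _ = curve_fit(f, d_obs, mc_obs, p0=[0.1, 1.0, 1.0])

        # Bayesian merge at one grid node: Gaussian prior from the station
        # model, Gaussian "likelihood" from the local catalogue estimate.
        mc_prior, sig_prior = f(30.0, a, b, c), 0.3
        mc_local, sig_local = 2.4, 0.2
        w_prior, w_local = 1 / sig_prior**2, 1 / sig_local**2
        mc_post = (w_prior * mc_prior + w_local * mc_local) / (w_prior + w_local)
        sig_post = (w_prior + w_local) ** -0.5
        print(f"Mc = {mc_post:.2f} +/- {sig_post:.2f}")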

  8. MDTS: automatic complex materials design using Monte Carlo tree search.

    PubMed

    M Dieb, Thaer; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji

    2017-01-01

    Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms, which require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously on various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.

  9. MDTS: automatic complex materials design using Monte Carlo tree search

    NASA Astrophysics Data System (ADS)

    Dieb, Thaer M.; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji

    2017-12-01

    Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms, which require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously on various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.

  10. Bayesian X-ray computed tomography using a three-level hierarchical prior model

    NASA Astrophysics Data System (ADS)

    Wang, Li; Mohammad-Djafari, Ali; Gac, Nicolas

    2017-06-01

    In recent decades, X-ray Computed Tomography (CT) image reconstruction has been extensively developed in both the medical and industrial domains. In this paper, we propose using the Bayesian inference approach with a new hierarchical prior model. In the proposed model, a generalised Student-t distribution is used to enforce sparsity in the Haar transform of the images. Comparisons with some state-of-the-art methods are presented. It is shown that by using the proposed model, the sparsity of the sparse representation of images is enforced, so that the edges of images are preserved. Simulation results are also provided to demonstrate the effectiveness of the new hierarchical model for reconstruction with fewer projections.

  11. Experience With Bayesian Image Based Surface Modeling

    NASA Technical Reports Server (NTRS)

    Stutz, John C.

    2005-01-01

    Bayesian surface modeling from images requires modeling both the surface and the image generation process, in order to optimize the models by comparing actual and generated images. Thus it differs greatly, both conceptually and in computational difficulty, from conventional stereo surface recovery techniques. But it offers the possibility of using any number of images, taken under quite different conditions, and by different instruments that provide independent and often complementary information, to generate a single surface model that fuses all available information. I describe an implemented system, with a brief introduction to the underlying mathematical models and the compromises made for computational efficiency. I describe successes and failures achieved on actual imagery, where we went wrong and what we did right, and how our approach could be improved. Lastly I discuss how the same approach can be extended to distinct types of instruments, to achieve true sensor fusion.

  12. The Importance of Proving the Null

    PubMed Central

    Gallistel, C. R.

    2010-01-01

    Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive? PMID:19348549
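
    The proposed sensitivity analysis is straightforward to reproduce. The toy Python sketch below computes the odds favouring a point null H0: mu = 0 against alternatives mu ~ Uniform(-L, L) of increasing vagueness L, for data simulated under the null; the normal model with known standard error is an assumption made for illustration, not the article's own examples:

        import numpy as np
        from scipy import stats, integrate

        rng = np.random.default_rng(6)
        x = rng.normal(0.0, 1.0, size=50)            # data generated under H0
        m, se = x.mean(), x.std(ddof=1) / np.sqrt(len(x))

        def bf01(L):
            # H0: mu = 0 versus H1: mu ~ Uniform(-L, L); larger L = vaguer H1
            null_lik = stats.norm.pdf(m, 0.0, se)
            alt_lik, _ = integrate.quad(
                lambda mu: stats.norm.pdf(m, mu, se) / (2 * L), -L, L)
            return null_lik / alt_lik

        for L in [0.1, 0.5, 1.0, 2.0]:               # the vaguer the alternative,
            print(f"L = {L}: odds for the null = {bf01(L):.1f}")  # the higher the odds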

  13. Assessing system reliability and allocating resources: a bayesian approach that integrates multi-level data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Graves, Todd L; Hamada, Michael S

    2008-01-01

    Good estimates of the reliability of a system make use of test data and expert knowledge at all available levels. Furthermore, by integrating all these information sources, one can determine how best to allocate scarce testing resources to reduce uncertainty. Both of these goals are facilitated by modern Bayesian computational methods. We apply these tools to examples that were previously solvable only through the use of ingenious approximations, and use genetic algorithms to guide resource allocation.

  14. Efficient Mean Field Variational Algorithm for Data Assimilation (Invited)

    NASA Astrophysics Data System (ADS)

    Vrettas, M. D.; Cornford, D.; Opper, M.

    2013-12-01

    Data assimilation algorithms combine available observations of physical systems with the assumed model dynamics in a systematic manner, to produce better estimates of initial conditions for prediction. Broadly, they can be categorized into three main approaches: (a) sequential algorithms, (b) sampling methods and (c) variational algorithms, which transform the density estimation problem into an optimization problem. However, given finite computational resources, only a handful of ensemble Kalman filters and 4DVar algorithms have been applied operationally to very high dimensional geophysical applications, such as weather forecasting. In this paper we present a recent extension to our variational Bayesian algorithm which seeks the 'optimal' posterior distribution over the continuous time states, within a family of non-stationary Gaussian processes. Our initial work on variational Bayesian approaches to data assimilation, unlike the well-known 4DVar method which seeks only the most probable solution, computes the best time-varying Gaussian process approximation to the posterior smoothing distribution for dynamical systems that can be represented by stochastic differential equations. This approach was based on minimising the Kullback-Leibler divergence, over paths, between the true posterior and our Gaussian process approximation. Whilst the observations were informative enough to keep the posterior smoothing density close to Gaussian, the algorithm proved very effective on low dimensional systems (e.g. O(10)D). However, for higher dimensional systems, the high computational demands make the algorithm prohibitively expensive. To overcome the difficulties presented in the original framework and make our approach more efficient in higher dimensional systems, we have been developing a new mean field version of the algorithm which treats the state variables at any given time as being independent in the posterior approximation, while still accounting for their relationships in the mean solution arising from the original system dynamics. Here we present this new mean field approach, illustrating its performance on a range of benchmark data assimilation problems whose dimensionality varies from O(10) to O(10^3)D. We emphasise that the variational Bayesian approach we adopt, unlike other variational approaches, provides a natural bound on the marginal likelihood of the observations given the model parameters, which also allows for inference of (hyper-)parameters such as observational errors, parameters in the dynamical model and the model error representation. We also stress that since our approach is intrinsically parallel, it can be implemented very efficiently to address very long data assimilation time windows. Moreover, like most traditional variational approaches, our Bayesian variational method has the benefit of being posed as an optimisation problem, so its complexity can be tuned to the available computational resources. We finish with a sketch of possible future directions.

  15. Efficient Bayesian experimental design for contaminant source identification

    NASA Astrophysics Data System (ADS)

    Zhang, Jiangjiang; Zeng, Lingzao; Chen, Cheng; Chen, Dingjiang; Wu, Laosheng

    2015-01-01

    In this study, an efficient full Bayesian approach is developed for the optimal sampling well location design and source parameters identification of groundwater contaminants. An information measure, i.e., the relative entropy, is employed to quantify the information gain from concentration measurements in identifying unknown parameters. In this approach, the sampling locations that give the maximum expected relative entropy are selected as the optimal design. After the sampling locations are determined, a Bayesian approach based on Markov Chain Monte Carlo (MCMC) is used to estimate unknown parameters. In both the design and estimation, the contaminant transport equation is required to be solved many times to evaluate the likelihood. To reduce the computational burden, an interpolation method based on the adaptive sparse grid is utilized to construct a surrogate for the contaminant transport equation. The approximated likelihood can be evaluated directly from the surrogate, which greatly accelerates the design and estimation process. The accuracy and efficiency of our approach are demonstrated through numerical case studies. It is shown that the methods can be used to assist in both single sampling location and monitoring network design for contaminant source identifications in groundwater.

  16. Efficient Probabilistic Diagnostics for Electrical Power Systems

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Chavira, Mark; Cascio, Keith; Poll, Scott; Darwiche, Adnan; Uckun, Serdar

    2008-01-01

    We consider in this work the probabilistic approach to model-based diagnosis when applied to electrical power systems (EPSs). Our probabilistic approach is formally well-founded, as it is based on Bayesian networks and arithmetic circuits. We investigate the diagnostic task known as fault isolation, and pay special attention to meeting two of the main challenges - model development and real-time reasoning - often associated with real-world application of model-based diagnosis technologies. To address the challenge of model development, we develop a systematic approach to representing electrical power systems as Bayesian networks, supported by an easy-to-use specification language. To address the real-time reasoning challenge, we compile Bayesian networks into arithmetic circuits. Arithmetic circuit evaluation supports real-time diagnosis by being predictable and fast. In essence, we introduce a high-level EPS specification language from which Bayesian networks that can diagnose multiple simultaneous failures are auto-generated, and we illustrate the feasibility of using arithmetic circuits, compiled from Bayesian networks, for real-time diagnosis on real-world EPSs of interest to NASA. The experimental system is a real-world EPS, namely the Advanced Diagnostic and Prognostic Testbed (ADAPT) located at the NASA Ames Research Center. In experiments with the ADAPT Bayesian network, which currently contains 503 discrete nodes and 579 edges, we find high diagnostic accuracy in scenarios where one to three faults, both in components and sensors, were inserted. The time taken to compute the most probable explanation using arithmetic circuits has a small mean of 0.2625 milliseconds and a standard deviation of 0.2028 milliseconds. In experiments with data from ADAPT we also show that arithmetic circuit evaluation substantially outperforms joint tree propagation and variable elimination, two alternative algorithms for diagnosis using Bayesian network inference.

  17. Bayesian component separation: The Planck experience

    NASA Astrophysics Data System (ADS)

    Wehus, Ingunn Kathrine; Eriksen, Hans Kristian

    2018-05-01

    Bayesian component separation techniques have played a central role in the data reduction process of Planck. The most important strength of this approach is its global nature, in which a parametric and physical model is fitted to the data. Such physical modeling allows the user to constrain very general data models, and jointly probe cosmological, astrophysical and instrumental parameters. This approach also supports statistically robust goodness-of-fit tests in terms of data-minus-model residual maps, which are essential for identifying residual systematic effects in the data. The main challenges are high code complexity and computational cost. Whether or not these costs are justified for a given experiment depends on its final uncertainty budget. We therefore predict that the importance of Bayesian component separation techniques is likely to increase with time for intensity mapping experiments, similar to what has happened in the CMB field, as observational techniques mature, and their overall sensitivity improves.

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brewer, Brendon J.; Foreman-Mackey, Daniel; Hogg, David W., E-mail: bj.brewer@auckland.ac.nz

    We present and implement a probabilistic (Bayesian) method for producing catalogs from images of stellar fields. The method is capable of inferring the number of sources N in the image and can also handle the challenges introduced by noise, overlapping sources, and an unknown point-spread function. The luminosity function of the stars can also be inferred, even when the precise luminosity of each star is uncertain, via the use of a hierarchical Bayesian model. The computational feasibility of the method is demonstrated on two simulated images with different numbers of stars. We find that our method successfully recovers the input parameter values along with principled uncertainties even when the field is crowded. We also compare our results with those obtained from the SExtractor software. While the two approaches largely agree about the fluxes of the bright stars, the Bayesian approach provides more accurate inferences about the faint stars and the number of stars, particularly in the crowded case.

  19. Bayesian Latent Class Analysis Tutorial.

    PubMed

    Li, Yuelin; Lord-Bessen, Jennifer; Shiyko, Mariya; Loeb, Rebecca

    2018-01-01

    This article is a how-to guide on Bayesian computation using Gibbs sampling, demonstrated in the context of Latent Class Analysis (LCA). It is written for students in quantitative psychology or related fields who have a working knowledge of Bayes Theorem and conditional probability and have experience in writing computer programs in the statistical language R. The overall goals are to provide an accessible and self-contained tutorial, along with a practical computation tool. We begin with how Bayesian computation is typically described in academic articles. Technical difficulties are addressed by a hypothetical, worked-out example. We show how Bayesian computation can be broken down into a series of simpler calculations, which can then be assembled together to complete a computationally more complex model. The details are described much more explicitly than what is typically available in elementary introductions to Bayesian modeling so that readers are not overwhelmed by the mathematics. Moreover, the provided computer program shows how Bayesian LCA can be implemented with relative ease. The computer program is then applied in a large, real-world data set and explained line-by-line. We outline the general steps in how to extend these considerations to other methodological applications. We conclude with suggestions for further readings.
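
    To give a flavour of the series-of-simpler-calculations structure the tutorial describes, here is a minimal two-class LCA Gibbs sampler. It is an illustrative Python translation of the idea, with invented data; the article's own program is written in R:

        import numpy as np

        rng = np.random.default_rng(7)
        N, J = 300, 4
        true_theta = np.array([[0.9, 0.8, 0.9, 0.7],    # item-endorsement probs,
                               [0.2, 0.3, 0.1, 0.3]])   # one row per latent class
        z_true = rng.integers(0, 2, N)
        X = (rng.uniform(size=(N, J)) < true_theta[z_true]).astype(int)

        pi = np.array([0.5, 0.5])                   # initial class weights
        theta = rng.uniform(0.3, 0.7, size=(2, J))  # initial item probabilities

        for it in range(2000):
            # 1. Sample each class membership z_i from its full conditional.
            loglik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T  # N x 2
            p = np.exp(loglik) * pi
            p /= p.sum(axis=1, keepdims=True)
            z = (rng.uniform(size=N) < p[:, 1]).astype(int)
            # 2. Sample class weights (Beta, the two-class Dirichlet).
            n1 = z.sum()
            pi = np.array([0.0, rng.beta(1 + n1, 1 + N - n1)])
            pi[0] = 1 - pi[1]
            # 3. Sample item probabilities from conjugate Beta conditionals.
            for c in (0, 1):
                s = X[z == c].sum(axis=0)
                n_c = (z == c).sum()
                theta[c] = rng.beta(1 + s, 1 + n_c - s)

        print(np.round(theta, 2))  # compare with true_theta (up to label switching)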

  20. Bayesian analyses of time-interval data for environmental radiation monitoring.

    PubMed

    Luo, Peng; Sharp, Julia L; DeVol, Timothy A

    2013-01-01

    Time-interval (time difference between two consecutive pulses) analysis based on the principles of Bayesian inference was investigated for online radiation monitoring. Using experimental and simulated data, Bayesian analysis of time-interval data [Bayesian (ti)] was compared with Bayesian and conventional frequentist analyses of counts in a fixed count time [Bayesian (cnt) and single interval test (SIT), respectively]. The performances of the three methods were compared in terms of average run length (ARL) and detection probability for several simulated detection scenarios. Experimental data were acquired with a DGF-4C system in list mode. Simulated data were obtained using Monte Carlo techniques to obtain a random sampling of the Poisson distribution. All statistical algorithms were developed using the R Project for statistical computing. Bayesian analysis of time-interval information provided a detection probability similar to that of Bayesian analysis of count information, but allowed a decision to be made with fewer pulses at relatively higher radiation levels. In addition, for cases with a very short presence of the source (< count time), time-interval information is more sensitive for detecting a change than count information, since the source data are averaged with the background data over the entire count time. The relationships of the source time, change points, and modifications to the Bayesian approach for increasing detection probability are presented.
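
    The conjugate core of time-interval analysis is easy to sketch: for a Poisson source, inter-pulse times are exponentially distributed, so a Gamma prior on the count rate updates in closed form after every single pulse. A toy Python illustration (rates, prior and threshold are invented; this is not the authors' algorithm):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(8)
        background, source = 10.0, 25.0                # counts per second
        intervals = rng.exponential(1 / source, 40)    # observed inter-pulse times

        alpha, beta = 10.0, 1.0                        # Gamma prior: ~10 cps from 1 s
        for i, dt in enumerate(intervals, 1):
            alpha, beta = alpha + 1, beta + dt         # conjugate update per pulse
            # Posterior probability that the rate exceeds the background level
            p_elevated = 1 - stats.gamma.cdf(background, alpha, scale=1 / beta)
            if p_elevated > 0.99:
                print(f"alarm after {i} pulses (P = {p_elevated:.3f})")
                break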

  1. Validation of Bayesian analysis of compartmental kinetic models in medical imaging.

    PubMed

    Sitek, Arkadiusz; Li, Quanzheng; El Fakhri, Georges; Alpert, Nathaniel M

    2016-10-01

    Kinetic compartmental analysis is frequently used to compute physiologically relevant quantitative values from time series of images. In this paper, a new approach based on Bayesian analysis to obtain information about these parameters is presented and validated. The closed form of the posterior distribution of kinetic parameters is derived with a hierarchical prior to model the standard deviation of normally distributed noise. Markov chain Monte Carlo methods are used for numerical estimation of the posterior distribution. Computer simulations of the kinetics of F18-fluorodeoxyglucose (FDG) are used to demonstrate drawing statistical inferences about kinetic parameters and to validate the theory and implementation. Additionally, point estimates of kinetic parameters and the covariance of those estimates are determined using the classical non-linear least squares approach. Posteriors obtained using the proposed methods are accurate, as no significant deviation from the expected shape of the posterior was found (one-sided P>0.08). It is demonstrated that the results obtained by the standard non-linear least-squares methods fail to provide accurate estimation of uncertainty for the same data set (P<0.0001). The results of this work validate the new methods on computer simulations of FDG kinetics. The results show that, in situations where the classical approach fails to estimate uncertainty accurately, Bayesian estimation provides accurate information about the uncertainties in the parameters. Although a particular example of FDG kinetics was used in the paper, the methods can be extended to different pharmaceuticals and imaging modalities. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  2. Bayesian ionospheric multi-instrument 3D tomography

    NASA Astrophysics Data System (ADS)

    Norberg, Johannes; Vierinen, Juha; Roininen, Lassi

    2017-04-01

    The tomographic reconstruction of ionospheric electron densities is an inverse problem that cannot be solved without relatively strong regularising additional information. The vertical electron density profile, in particular, is determined predominantly by the regularisation. Despite its crucial role, the regularisation is often hidden in the algorithm as a numerical procedure without physical understanding. The Bayesian methodology provides an interpretative approach to the problem, as the regularisation can be given as a physically meaningful and quantifiable prior probability distribution. The prior distribution can be based on ionospheric physics and on other available ionospheric measurements and their statistics. Updating the prior with measurements results in the posterior distribution, which carries all the available information combined. From the posterior distribution, the most probable state of the ionosphere can then be solved together with the corresponding probability intervals. Altogether, the Bayesian methodology provides an understanding of how strong the given regularisation is, what information is gained from the measurements, and how reliable the final result is. In addition, the combination of different measurements and the temporal development can be taken into account in a very intuitive way. However, a direct implementation of the Bayesian approach requires the inversion of large covariance matrices, resulting in computational infeasibility. In the presented method, Gaussian Markov random fields are used to form sparse matrix approximations of the covariances. The approach makes the problem computationally feasible while retaining the probabilistic and physical interpretation. Here, the Bayesian method with Gaussian Markov random fields is applied to ionospheric 3D tomography over Northern Europe. Multi-instrument measurements are utilised from the TomoScand receiver network for Low Earth orbit beacon satellite signals, from GNSS receiver networks, and from EISCAT ionosondes and incoherent scatter radars.

  3. An extended Kalman filter approach to non-stationary Bayesian estimation of reduced-order vocal fold model parameters.

    PubMed

    Hadwin, Paul J; Peterson, Sean D

    2017-04-01

    The Bayesian framework for parameter inference provides a basis from which subject-specific reduced-order vocal fold models can be generated. Previously, it has been shown that a particle filter technique is capable of producing estimates and associated credibility intervals of time-varying reduced-order vocal fold model parameters. However, the particle filter approach is difficult to implement and has a high computational cost, which can be barriers to clinical adoption. This work presents an alternative estimation strategy based upon Kalman filtering aimed at reducing the computational cost of subject-specific model development. The robustness of this approach to Gaussian and non-Gaussian noise is discussed. The extended Kalman filter (EKF) approach is found to perform very well in comparison with the particle filter technique at dramatically lower computational cost. Based upon the test cases explored, the EKF is comparable in accuracy to the particle filter technique when more than 6000 particles are employed; if fewer particles are employed, the EKF actually performs better. For comparable levels of accuracy, the solution time is reduced by two orders of magnitude when employing the EKF. By virtue of the approximations used in the EKF, however, the credibility intervals tend to be slightly underpredicted.

  4. Probabilistic Assessment of Planet Habitability and Biosignatures

    NASA Astrophysics Data System (ADS)

    Bixel, A.; Apai, D.

    2017-11-01

    We have computed probabilistic constraints on the bulk properties of Proxima Cen b informed by priors from Kepler and RV follow-up. We will extend this approach into a Bayesian framework to assess the habitability of directly imaged planets.

  5. Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology

    PubMed Central

    Murakami, Yohei

    2014-01-01

    Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework known as approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of a parameter with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with the non-identifiability of representative parameter values, we proposed running the simulations with a parameter ensemble sampled from the posterior distribution, named the "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm for generating a posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection based on the Bayes factor. PMID:25089832
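
    The following sketch conveys the annealing flavor in an ABC setting on a toy Gaussian-mean problem: a particle population is pushed through a decreasing tolerance schedule, with resampling and jittering at each level. It is a simplified stand-in rather than the paper's algorithm; the proper importance weights and the systems-biology models are omitted.

        # Population-annealing-style ABC on a toy problem: infer a Gaussian mean
        # using only forward simulation and a shrinking tolerance schedule.
        import numpy as np

        rng = np.random.default_rng(5)
        obs = rng.normal(2.0, 1.0, 200).mean()        # observed summary statistic

        def distance(theta):
            """ABC distance: re-simulate and compare summary statistics."""
            return abs(rng.normal(theta, 1.0, 200).mean() - obs)

        n = 1000
        thetas = rng.uniform(-10, 10, n)              # initial population from the prior
        for eps in [1.0, 0.5, 0.2, 0.1, 0.05]:        # annealing schedule of tolerances
            # Keep only particles that meet the tighter tolerance.
            ok = np.array([distance(t) < eps for t in thetas])
            if ok.sum() == 0:
                break
            # Resample survivors, then jitter with an accept/reject move
            # to restore diversity in the population.
            thetas = rng.choice(thetas[ok], size=n, replace=True)
            proposals = thetas + rng.normal(0, 0.1, n)
            accept = np.array([distance(p) < eps for p in proposals])
            thetas = np.where(accept, proposals, thetas)

        print(f"approximate posterior mean ~ {thetas.mean():.2f}")

    The resampling step is what keeps the whole population usable as the "posterior parameter ensemble" once the final tolerance level is reached.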

  6. The Approximate Bayesian Computation methods in the localization of the atmospheric contamination source

    NASA Astrophysics Data System (ADS)

    Kopka, P.; Wawrzynczak, A.; Borysiewicz, M.

    2015-09-01

    In many areas of application, a central task is the solution of an inverse problem, especially the estimation of the unknown model parameters that govern the underlying dynamics of a physical system. In this situation, Bayesian inference is a powerful tool for combining observed data with prior knowledge to gain the probability distribution of the searched parameters. We have applied the modern methodology named Sequential Approximate Bayesian Computation (S-ABC) to the problem of tracing the atmospheric contaminant source. ABC is a technique commonly used in the Bayesian analysis of complex models and dynamical systems. Sequential methods can significantly increase the efficiency of the ABC. In the presented algorithm, the input data are the online-arriving concentrations of the released substance registered by a distributed sensor network in the OVER-LAND ATMOSPHERIC DISPERSION (OLAD) experiment. The algorithm outputs are the probability distributions of the contamination source parameters, i.e., its location, release rate, speed and direction of movement, start time, and duration. The stochastic approach presented in this paper is completely general and can be used in other fields where the parameters of a model best fitted to the observable data are to be found.

  7. Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics.

    PubMed

    Fisher, Charles K; Mehta, Pankaj

    2015-06-01

    Feature selection, identifying a subset of variables that are relevant for predicting a response, is an important and challenging component of many methods in statistics and machine learning. Feature selection is especially difficult and computationally intensive when the number of variables approaches or exceeds the number of samples, as is often the case for many genomic datasets. Here, we introduce a new approach, the Bayesian Ising Approximation (BIA), to rapidly calculate posterior probabilities for feature relevance in L2 penalized linear regression. In the regime where the regression problem is strongly regularized by the prior, we show that computing the marginal posterior probabilities for features is equivalent to computing the magnetizations of an Ising model with weak couplings. Using a mean field approximation, we show it is possible to rapidly compute the feature selection path described by the posterior probabilities as a function of the L2 penalty. We present simulations and analytical results illustrating the accuracy of the BIA on some simple regression problems. Finally, we demonstrate the applicability of the BIA to high-dimensional regression by analyzing a gene expression dataset with nearly 30 000 features. These results also highlight the impact of correlations between features on Bayesian feature selection. An implementation of the BIA in C++, along with data for reproducing our gene expression analyses, is freely available at http://physics.bu.edu/∼pankajm/BIACode. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  8. An Intelligent Tutoring System for Classifying Students into Instructional Treatments with Mastery Scores. Research Report 94-15.

    ERIC Educational Resources Information Center

    Vos, Hans J.

    As part of a project formulating optimal rules for decision making in computer assisted instructional systems in which the computer is used as a decision support tool, an approach that simultaneously optimizes classification of students into two treatments, each followed by a mastery decision, is presented using the framework of Bayesian decision…

  9. Inference of epidemiological parameters from household stratified data

    PubMed Central

    Walker, James N.; Ross, Joshua V.

    2017-01-01

    We consider a continuous-time Markov chain model of SIR disease dynamics with two levels of mixing. For this so-called stochastic households model, we provide two methods for inferring the model parameters—governing within-household transmission, recovery, and between-household transmission—from data of the day upon which each individual became infectious and the household in which each infection occurred, as might be available from First Few Hundred studies. Each method is a form of Bayesian Markov Chain Monte Carlo that allows us to calculate a joint posterior distribution for all parameters and hence the household reproduction number and the early growth rate of the epidemic. The first method performs exact Bayesian inference using a standard data-augmentation approach; the second performs approximate Bayesian inference based on a likelihood approximation derived from branching processes. These methods are compared for computational efficiency and posteriors from each are compared. The branching process is shown to be a good approximation and remains computationally efficient as the amount of data is increased. PMID:29045456

  10. A full-spectral Bayesian reconstruction approach based on the material decomposition model applied in dual-energy computed tomography.

    PubMed

    Cai, C; Rodet, T; Legoupil, S; Mohammad-Djafari, A

    2013-11-01

    Dual-energy computed tomography (DECT) makes it possible to obtain two fractions of basis materials without segmentation. One is the soft-tissue-equivalent water fraction and the other is the hard-matter-equivalent bone fraction. Practical DECT measurements are usually obtained with polychromatic x-ray beams. Existing reconstruction approaches based on linear forward models that do not account for the beam polychromaticity fail to estimate the correct decomposition fractions and result in beam-hardening artifacts (BHA). The existing BHA correction approaches either need to refer to calibration measurements or suffer from the noise amplification caused by the negative-log preprocessing and the ill-conditioned water and bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on nonlinear forward models that account for the beam polychromaticity show great potential for giving accurate fraction images. This work proposes a full-spectral Bayesian reconstruction approach which allows the reconstruction of high-quality fraction images from ordinary polychromatic measurements. This approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking the negative log. Following Bayesian inference, the decomposition fractions and observation variance are estimated by using the joint maximum a posteriori (MAP) estimation method. Subject to an adaptive prior model assigned to the variance, the joint estimation problem is then simplified into a single estimation problem, transforming the joint MAP estimation problem into a minimization problem with a nonquadratic cost function. To solve it, the use of a monotone conjugate gradient algorithm with suboptimal descent steps is proposed. The performance of the proposed approach is analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and materials. Accurate spectrum information about the source-detector system is, however, also necessary; when dealing with experimental data, the spectrum can be predicted by a Monte Carlo simulator. For materials between water and bone, less than 5% separation errors are observed on the estimated decomposition fractions. The proposed approach is a statistical reconstruction approach based on a nonlinear forward model accounting for the full beam polychromaticity and applied directly to the projections without taking the negative log. Compared to the approaches based on linear forward models and the BHA correction approaches, it has advantages in noise robustness and reconstruction accuracy.

  11. Bayesian hierarchical functional data analysis via contaminated informative priors.

    PubMed

    Scarpa, Bruno; Dunson, David B

    2009-09-01

    A variety of flexible approaches have been proposed for functional data analysis, allowing both the mean curve and the distribution about the mean to be unknown. Such methods are most useful when there is limited prior information. Motivated by applications to modeling of temperature curves in the menstrual cycle, this article proposes a flexible approach for incorporating prior information in semiparametric Bayesian analyses of hierarchical functional data. The proposed approach is based on specifying the distribution of functions as a mixture of a parametric hierarchical model and a nonparametric contamination. The parametric component is chosen based on prior knowledge, while the contamination is characterized as a functional Dirichlet process. In the motivating application, the contamination component allows unanticipated curve shapes in unhealthy menstrual cycles. Methods are developed for posterior computation, and the approach is applied to data from a European fecundability study.

  12. Taking error into account when fitting models using Approximate Bayesian Computation.

    PubMed

    van der Vaart, Elske; Prangle, Dennis; Sibly, Richard M

    2018-03-01

    Stochastic computer simulations are often the only practical way of answering questions relating to ecological management. However, due to their complexity, such models are difficult to calibrate and evaluate. Approximate Bayesian Computation (ABC) offers an increasingly popular approach to this problem, widely applied across a variety of fields. However, ensuring the accuracy of ABC's estimates has been difficult. Here, we obtain more accurate estimates by incorporating estimation of error into the ABC protocol. We show how this can be done where the data consist of repeated measures of the same quantity and errors may be assumed to be normally distributed and independent. We then derive the correct acceptance probabilities for a probabilistic ABC algorithm, and update the coverage test with which accuracy is assessed. We apply this method, which we call error-calibrated ABC, to a toy example and a realistic 14-parameter simulation model of earthworms that is used in environmental risk assessment. A comparison with exact methods and the diagnostic coverage test show that our approach improves estimation of parameter values and their credible intervals for both models. © 2017 by the Ecological Society of America.

  13. Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.

    PubMed

    Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai

    2017-11-01

    For big data analysis, the high computational cost of Bayesian methods often limits their application in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo method, namely Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in the parameter space of the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.
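
    For orientation, here is a bare-bones HMC sampler for a two-dimensional standard normal target. The surrogate idea described above would replace the expensive gradient call with a cheap fitted approximation; that part is not implemented in this illustrative sketch.

        # Bare-bones Hamiltonian Monte Carlo for a 2-D standard normal target.
        # In the surrogate approach, grad_logp would be approximated cheaply.
        import numpy as np

        rng = np.random.default_rng(3)

        def logp(q):          # log target density (up to a constant)
            return -0.5 * q @ q

        def grad_logp(q):     # its gradient; the expensive part in real models
            return -q

        def hmc_step(q, step=0.2, n_leap=20):
            p = rng.normal(size=q.shape)              # resample momentum
            q_new, p_new = q.copy(), p.copy()
            p_new += 0.5 * step * grad_logp(q_new)    # leapfrog integration
            for _ in range(n_leap - 1):
                q_new += step * p_new
                p_new += step * grad_logp(q_new)
            q_new += step * p_new
            p_new += 0.5 * step * grad_logp(q_new)
            # Metropolis accept/reject based on the change in the Hamiltonian.
            h_old = -logp(q) + 0.5 * p @ p
            h_new = -logp(q_new) + 0.5 * p_new @ p_new
            return q_new if np.log(rng.random()) < h_old - h_new else q

        q, samples = np.zeros(2), []
        for _ in range(5000):
            q = hmc_step(q)
            samples.append(q)
        print("sample mean:", np.mean(samples, axis=0))  # should be near (0, 0)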

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Farrell, Kathryn, E-mail: kfarrell@ices.utexas.edu; Oden, J. Tinsley, E-mail: oden@ices.utexas.edu; Faghihi, Danial, E-mail: danial@ices.utexas.edu

    A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.

  15. Statistical comparison of a hybrid approach with approximate and exact inference models for Fusion 2+

    NASA Astrophysics Data System (ADS)

    Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew

    2007-04-01

    One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. In previous research, the theoretical basis and benefits of the hybrid approach have been developed. However, what is lacking is a concrete experimental comparison of the hybrid framework with traditional fusion methods, to demonstrate and quantify this benefit. The goal of this research, therefore, is to provide a statistical analysis comparing the accuracy and performance of hybrid network theory with pure Bayesian and Fuzzy systems and with an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain-specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo simulation, in comparison to situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed to quantify the benefit of hybrid inference over other fusion tools.

  16. Bayesian correction for covariate measurement error: A frequentist evaluation and comparison with regression calibration.

    PubMed

    Bartlett, Jonathan W; Keogh, Ruth H

    2018-06-01

    Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.

  17. Integrating probabilistic models of perception and interactive neural networks: a historical and tutorial review

    PubMed Central

    McClelland, James L.

    2013-01-01

    This article seeks to establish a rapprochement between explicitly Bayesian models of contextual effects in perception and neural network models of such effects, particularly the connectionist interactive activation (IA) model of perception. The article is in part an historical review and in part a tutorial, reviewing the probabilistic Bayesian approach to understanding perception and how it may be shaped by context, and also reviewing ideas about how such probabilistic computations may be carried out in neural networks, focusing on the role of context in interactive neural networks, in which both bottom-up and top-down signals affect the interpretation of sensory inputs. It is pointed out that connectionist units that use the logistic or softmax activation functions can exactly compute Bayesian posterior probabilities when the bias terms and connection weights affecting such units are set to the logarithms of appropriate probabilistic quantities. Bayesian concepts such as the prior, likelihood, (joint and marginal) posterior, probability matching and maximizing, and calculating vs. sampling from the posterior are all reviewed and linked to neural network computations. Probabilistic and neural network models are explicitly linked to the concept of a probabilistic generative model that describes the relationship between the underlying target of perception (e.g., the word intended by a speaker or other source of sensory stimuli) and the sensory input that reaches the perceiver for use in inferring the underlying target. It is shown how a new version of the IA model called the multinomial interactive activation (MIA) model can sample correctly from the joint posterior of a proposed generative model for perception of letters in words, indicating that interactive processing is fully consistent with principled probabilistic computation. Ways in which these computations might be realized in real neural systems are also considered. PMID:23970868

  18. Integrating probabilistic models of perception and interactive neural networks: a historical and tutorial review.

    PubMed

    McClelland, James L

    2013-01-01

    This article seeks to establish a rapprochement between explicitly Bayesian models of contextual effects in perception and neural network models of such effects, particularly the connectionist interactive activation (IA) model of perception. The article is in part an historical review and in part a tutorial, reviewing the probabilistic Bayesian approach to understanding perception and how it may be shaped by context, and also reviewing ideas about how such probabilistic computations may be carried out in neural networks, focusing on the role of context in interactive neural networks, in which both bottom-up and top-down signals affect the interpretation of sensory inputs. It is pointed out that connectionist units that use the logistic or softmax activation functions can exactly compute Bayesian posterior probabilities when the bias terms and connection weights affecting such units are set to the logarithms of appropriate probabilistic quantities. Bayesian concepts such as the prior, likelihood, (joint and marginal) posterior, probability matching and maximizing, and calculating vs. sampling from the posterior are all reviewed and linked to neural network computations. Probabilistic and neural network models are explicitly linked to the concept of a probabilistic generative model that describes the relationship between the underlying target of perception (e.g., the word intended by a speaker or other source of sensory stimuli) and the sensory input that reaches the perceiver for use in inferring the underlying target. It is shown how a new version of the IA model called the multinomial interactive activation (MIA) model can sample correctly from the joint posterior of a proposed generative model for perception of letters in words, indicating that interactive processing is fully consistent with principled probabilistic computation. Ways in which these computations might be realized in real neural systems are also considered.

  19. Using Bayesian Networks for Candidate Generation in Consistency-based Diagnosis

    NASA Technical Reports Server (NTRS)

    Narasimhan, Sriram; Mengshoel, Ole

    2008-01-01

    Consistency-based diagnosis relies heavily on the assumption that discrepancies between model predictions and sensor observations can be detected accurately. When sources of uncertainty like sensor noise and model abstraction exist, robust schemes have to be designed to make a binary decision on whether predictions are consistent with observations. This risks the occurrence of false alarms and missed alarms when an erroneous decision is made. Moreover, when multiple sensors (with differing sensing properties) are available, the degree of match between predictions and observations can be used to guide the search for fault candidates. In this paper we propose a novel approach to handle this problem using Bayesian networks. In the consistency-based diagnosis formulation, automatically generated Bayesian networks are used to encode a probabilistic measure of fit between predictions and observations. A Bayesian network inference algorithm is used to compute the most probable fault candidates.

  20. Robust Learning of High-dimensional Biological Networks with Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

    Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes infeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding to variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected by the omission of genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.

  1. Bayesian Group Bridge for Bi-level Variable Selection.

    PubMed

    Mallick, Himel; Yi, Nengjun

    2017-06-01

    A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease of interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. The attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.

  2. On a full Bayesian inference for force reconstruction problems

    NASA Astrophysics Data System (ADS)

    Aucejo, M.; De Smet, O.

    2018-05-01

    In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.

  3. Data free inference with processed data products

    DOE PAGES

    Chowdhary, K.; Najm, H. N.

    2014-07-12

    Here, we consider the context of probabilistic inference of model parameters given error bars or confidence intervals on model output values, when the data are unavailable. We introduce a class of algorithms in a Bayesian framework, relying on maximum entropy arguments and approximate Bayesian computation methods, to generate consistent data with the given summary statistics. Once we obtain consistent data sets, we pool the respective posteriors to arrive at a single, averaged density on the parameters. This approach allows us to perform accurate forward uncertainty propagation consistent with the reported statistics.

  4. BCM: toolkit for Bayesian analysis of Computational Models using samplers.

    PubMed

    Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

    2016-10-21

    Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop shop for computational modelers wishing to use sampler-based Bayesian statistics.

  5. Fundamentals and Recent Developments in Approximate Bayesian Computation

    PubMed Central

    Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka

    2017-01-01

    Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
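
    The simplest member of this family is the ABC rejection sampler: draw a parameter from the prior, simulate data, and accept the draw if a summary of the simulation lands close to the observed summary. A toy Gaussian-mean sketch (with made-up data, prior, and tolerance) follows.

        # Minimal ABC rejection sampler: infer the mean of a Gaussian using only
        # the ability to simulate from the model (no likelihood evaluation).
        import numpy as np

        rng = np.random.default_rng(2)
        observed = rng.normal(3.0, 1.0, size=100)   # "data" with unknown mean 3.0
        s_obs = observed.mean()                     # summary statistic

        def simulate(theta, n=100):
            return rng.normal(theta, 1.0, size=n)

        accepted, eps = [], 0.05                    # eps: tolerance on summary distance
        while len(accepted) < 1000:
            theta = rng.uniform(-10, 10)            # draw from the prior
            if abs(simulate(theta).mean() - s_obs) < eps:
                accepted.append(theta)              # accept: summaries are close

        print(f"ABC posterior mean ~ {np.mean(accepted):.2f}, sd ~ {np.std(accepted):.2f}")

    The accepted draws approximate the posterior; shrinking eps trades acceptance rate for approximation quality, which is the tension the more sophisticated algorithms reviewed above are designed to manage.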

  6. Free will in Bayesian and inverse Bayesian inference-driven endo-consciousness.

    PubMed

    Gunji, Yukio-Pegio; Minoura, Mai; Kojima, Kei; Horry, Yoichi

    2017-12-01

    How can we link challenging issues related to consciousness and/or qualia with natural science? The introduction of endo-perspective, instead of exo-perspective, as proposed by Matsuno, Rössler, and Gunji, is considered one of the most promising candidate approaches. Here, we distinguish the endo-from the exo-perspective in terms of whether the external is or is not directly operated. In the endo-perspective, the external can be neither perceived nor recognized directly; rather, one can only indirectly summon something outside of the perspective, which can be illustrated by a causation-reversal pair. On one hand, causation logically proceeds from the cause to the effect. On the other hand, a reversal from the effect to the cause is non-logical and is equipped with a metaphorical structure. We argue that the differences in exo- and endo-perspectives result not from the difference between Western and Eastern cultures, but from differences between modernism and animism. Here, a causation-reversal pair described using a pair of upward (from premise to consequence) and downward (from consequence to premise) causation and a pair of Bayesian and inverse Bayesian inference (BIB inference). Accordingly, the notion of endo-consciousness is proposed as an agent equipped with BIB inference. We also argue that BIB inference can yield both highly efficient computations through Bayesian interference and robust computations through inverse Bayesian inference. By adapting a logical model of the free will theorem to the BIB inference, we show that endo-consciousness can explain free will as a regression of the controllability of voluntary action. Copyright © 2017. Published by Elsevier Ltd.

  7. Bayesian analysis of rare events

    NASA Astrophysics Data System (ADS)

    Straub, Daniel; Papaioannou, Iason; Betz, Wolfgang

    2016-06-01

    In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into the probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.
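
    The rejection-sampling view that BUS reinterprets can be stated in a few lines: a prior draw theta is accepted with probability L(theta)/c, where c is an upper bound on the likelihood, so accepted draws follow the posterior. The toy conjugate-Gaussian sketch below uses hypothetical numbers; BUS itself then estimates the probability of this acceptance event with methods such as FORM, IS, or SuS, which is not shown here.

        # Rejection-sampling view of Bayesian updating: accept a prior draw with
        # probability likelihood/c. Toy Gaussian prior with one noisy observation.
        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(4)
        y, sigma = 1.2, 0.5                       # hypothetical observation, noise sd

        def likelihood(theta):
            return norm.pdf(y, loc=theta, scale=sigma)

        c = norm.pdf(0, scale=sigma)              # bound: likelihood max over theta

        posterior = []
        while len(posterior) < 2000:
            theta = rng.normal(0, 1)              # prior draw
            if rng.random() < likelihood(theta) / c:   # the acceptance condition
                posterior.append(theta)

        # Conjugate check: the exact posterior mean for a N(0,1) prior is y/(1+sigma^2).
        print(np.mean(posterior), y / (1 + sigma**2))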

  8. Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.

    PubMed

    Daunizeau, J; Friston, K J; Kiebel, S J

    2009-11-01

    In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.

  9. A Bayesian approach to earthquake source studies

    NASA Astrophysics Data System (ADS)

    Minson, Sarah

    Bayesian sampling has several advantages over conventional optimization approaches to solving inverse problems. It produces the distribution of all possible models sampled proportionally to how much each model is consistent with the data and the specified prior information, and thus images the entire solution space, revealing the uncertainties and trade-offs in the model. Bayesian sampling is applicable to both linear and non-linear modeling, and the values of the model parameters being sampled can be constrained based on the physics of the process being studied and do not have to be regularized. However, these methods are computationally challenging for high-dimensional problems. Until now the computational expense of Bayesian sampling has been too great for it to be practicable for most geophysical problems. I present a new parallel sampling algorithm called CATMIP for Cascading Adaptive Tempered Metropolis In Parallel. This technique, based on Transitional Markov chain Monte Carlo, makes it possible to sample distributions in many hundreds of dimensions, if the forward model is fast, or to sample computationally expensive forward models in smaller numbers of dimensions. The design of the algorithm is independent of the model being sampled, so CATMIP can be applied to many areas of research. I use CATMIP to produce a finite fault source model for the 2007 Mw 7.7 Tocopilla, Chile earthquake. Surface displacements from the earthquake were recorded by six interferograms and twelve local high-rate GPS stations. Because of the wealth of near-fault data, the source process is well-constrained. I find that the near-field high-rate GPS data have significant resolving power above and beyond the slip distribution determined from static displacements. The location and magnitude of the maximum displacement are resolved. The rupture almost certainly propagated at sub-shear velocities. The full posterior distribution can be used not only to calculate source parameters but also to determine their uncertainties. So while kinematic source modeling and the estimation of source parameters are not new, with CATMIP I am able to use Bayesian sampling to determine which parts of the source process are well-constrained and which are not.

  10. Parameter Estimation for a Pulsating Turbulent Buoyant Jet Using Approximate Bayesian Computation

    NASA Astrophysics Data System (ADS)

    Christopher, Jason; Wimer, Nicholas; Lapointe, Caelan; Hayden, Torrey; Grooms, Ian; Rieker, Greg; Hamlington, Peter

    2017-11-01

    Approximate Bayesian Computation (ABC) is a powerful tool that allows sparse experimental or other "truth" data to be used for the prediction of unknown parameters, such as flow properties and boundary conditions, in numerical simulations of real-world engineering systems. Here we introduce the ABC approach and then use ABC to predict unknown inflow conditions in simulations of a two-dimensional (2D) turbulent, high-temperature buoyant jet. For this test case, truth data are obtained from a direct numerical simulation (DNS) with known boundary conditions and problem parameters, while the ABC procedure utilizes lower fidelity large eddy simulations. Using spatially-sparse statistics from the 2D buoyant jet DNS, we show that the ABC method provides accurate predictions of true jet inflow parameters. The success of the ABC approach in the present test suggests that ABC is a useful and versatile tool for predicting flow information, such as boundary conditions, that can be difficult to determine experimentally.

  11. The importance of proving the null.

    PubMed

    Gallistel, C R

    2009-04-01

    Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive? © 2009 APA, all rights reserved.
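
    A minimal version of the suggested sensitivity analysis, applied to question (a): the odds in favor of a zero effect are computed against alternatives of increasing assumed maximum effect size. The observed effect, its standard error, and the uniform-alternative prior are hypothetical stand-ins, not examples from the article.

        # Sensitivity analysis for a null hypothesis: Bayes factor for "effect = 0"
        # versus delta ~ Uniform(-delta_max, delta_max), as a function of delta_max.
        import numpy as np
        from scipy.stats import norm

        d_obs, se = 0.05, 0.10      # hypothetical observed effect and standard error

        def bf_null(delta_max, grid=2000):
            """Odds for the null against a uniform alternative of given width."""
            deltas = np.linspace(-delta_max, delta_max, grid)
            # Marginal likelihood under the alternative: average over the prior.
            like_alt = norm.pdf(d_obs, loc=deltas, scale=se).mean()
            like_null = norm.pdf(d_obs, loc=0.0, scale=se)
            return like_null / like_alt

        for dmax in [0.1, 0.5, 1.0, 2.0]:
            print(f"max effect {dmax:4.1f}:  odds for the null = {bf_null(dmax):5.2f}")

    As the assumed maximum effect grows, the alternative spreads its prior mass more thinly and the odds shift toward the null, which is exactly the behavior the sensitivity analysis is designed to expose.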

  12. Mean Field Variational Bayesian Data Assimilation

    NASA Astrophysics Data System (ADS)

    Vrettas, M.; Cornford, D.; Opper, M.

    2012-04-01

    Current data assimilation schemes propose a range of approximate solutions to the classical data assimilation problem, particularly state estimation. Broadly, there are three main active research areas: ensemble Kalman filter methods, which rely on statistical linearization of the model evolution equations; particle filters, which provide a discrete point representation of the posterior filtering or smoothing distribution; and 4DVAR methods, which seek the most likely posterior smoothing solution. In this paper we present a recent extension to our variational Bayesian algorithm, which seeks the most probable posterior distribution over the states within the family of non-stationary Gaussian processes. Our original work on variational Bayesian approaches to data assimilation sought the best approximating time-varying Gaussian process to the posterior smoothing distribution for stochastic dynamical systems. This approach was based on minimising the Kullback-Leibler divergence between the true posterior over paths and our Gaussian process approximation. So long as the observation density was sufficiently high to bring the posterior smoothing density close to Gaussian, the algorithm proved very effective on lower-dimensional systems. However, for higher-dimensional systems the algorithm was computationally very demanding. We have been developing a mean field version of the algorithm which treats the state variables at a given time as being independent in the posterior approximation, but still accounts for their relationships to each other in the mean solution arising from the original dynamical system. In this work we present the new mean field variational Bayesian approach, illustrating its performance on a range of classical data assimilation problems. We discuss the potential and limitations of the new approach. We emphasise that the variational Bayesian approach we adopt, in contrast to other variational approaches, provides a bound on the marginal likelihood of the observations given parameters in the model, which also allows inference of parameters such as observation errors and parameters in the model and model error representation, particularly if this is written as a deterministic form with small additive noise. We stress that our approach can address very long time window and weak constraint settings. Like traditional variational approaches, however, our Bayesian variational method has the benefit of being posed as an optimisation problem. We finish with a sketch of the future directions for our approach.

  13. Bayesian seismic inversion based on rock-physics prior modeling for the joint estimation of acoustic impedance, porosity and lithofacies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Passos de Figueiredo, Leandro, E-mail: leandrop.fgr@gmail.com; Grana, Dario; Santos, Marcio

    We propose a Bayesian approach for seismic inversion to estimate acoustic impedance, porosity and lithofacies within the reservoir conditioned to post-stack seismic and well data. The link between elastic and petrophysical properties is given by a joint prior distribution for the logarithm of impedance and porosity, based on a rock-physics model. The well conditioning is performed through a background model obtained by well log interpolation. Two different approaches are presented: in the first approach, the prior is defined by a single Gaussian distribution, whereas in the second approach it is defined by a Gaussian mixture to represent the multimodal distribution of the well data and link the Gaussian components to different geological lithofacies. The forward model is based on a linearized convolutional model. For the single Gaussian case, we obtain an analytical expression for the posterior distribution, resulting in a fast algorithm to compute the solution of the inverse problem, i.e. the posterior distribution of acoustic impedance and porosity as well as the facies probability given the observed data. For the Gaussian mixture prior, it is not possible to obtain the distributions analytically, hence we propose a Gibbs algorithm to perform the posterior sampling and obtain several reservoir model realizations, allowing an uncertainty analysis of the estimated properties and lithofacies. Both methodologies are applied to a real seismic dataset with three wells to obtain 3D models of acoustic impedance, porosity and lithofacies. The methodologies are validated through a blind well test and compared to a standard Bayesian inversion approach. Using the probability of the reservoir lithofacies, we also compute a 3D isosurface probability model of the main oil reservoir in the studied field.

  14. Probabilistic techniques for obtaining accurate patient counts in Clinical Data Warehouses

    PubMed Central

    Myers, Risa B.; Herskovic, Jorge R.

    2011-01-01

    Proposal and execution of clinical trials, computation of quality measures and discovery of correlation between medical phenomena are all applications where an accurate count of patients is needed. However, existing sources of this type of patient information, including Clinical Data Warehouses (CDW), may be incomplete or inaccurate. This research explores applying probabilistic techniques, supported by the MayBMS probabilistic database, to obtain accurate patient counts from a clinical data warehouse containing synthetic patient data. We present a synthetic clinical data warehouse (CDW), and populate it with simulated data using a custom patient data generation engine. We then implement, evaluate and compare different techniques for obtaining patient counts. We model billing as a test for the presence of a condition. We compute billing’s sensitivity and specificity both by conducting a “Simulated Expert Review”, where a representative sample of records is reviewed and labeled by experts, and by obtaining the ground truth for every record. We compute the posterior probability of a patient having a condition through a “Bayesian Chain”, using Bayes’ Theorem to calculate the probability of a patient having a condition after each visit. The second method is a “one-shot” approach that computes the probability of a patient having a condition based on whether the patient is ever billed for the condition. Our results demonstrate the utility of probabilistic approaches, which improve on the accuracy of raw counts. In particular, the simulated review paired with a single application of Bayes’ Theorem produces the best results, with an average error rate of 2.1% compared to 43.7% for the straightforward billing counts. Overall, this research demonstrates that Bayesian probabilistic approaches improve patient counts on simulated patient populations. We believe that total patient counts based on billing data are one of many possible applications of our Bayesian framework. Use of these probabilistic techniques will enable more accurate patient counts and better results for applications requiring this metric. PMID:21986292
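
    The per-visit update behind the “Bayesian Chain” reduces to one application of Bayes’ Theorem per billing observation. In this sketch the prevalence, sensitivity, and specificity are hypothetical placeholders rather than the paper’s estimates.

        # Treat each visit's billing code as a diagnostic test with known
        # sensitivity/specificity and update P(condition) after every visit.
        prevalence = 0.10          # prior P(patient has the condition)
        sens, spec = 0.80, 0.95    # P(billed | condition), P(not billed | no condition)

        def update(p, billed):
            like_pos = sens if billed else (1 - sens)    # P(evidence | condition)
            like_neg = (1 - spec) if billed else spec    # P(evidence | no condition)
            return p * like_pos / (p * like_pos + (1 - p) * like_neg)

        p = prevalence
        for visit, billed in enumerate([True, False, True], start=1):
            p = update(p, billed)
            print(f"after visit {visit} (billed={billed}): P(condition) = {p:.3f}")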

  15. Computational Approaches to Spatial Orientation: From Transfer Functions to Dynamic Bayesian Inference

    PubMed Central

    MacNeilage, Paul R.; Ganesan, Narayan; Angelaki, Dora E.

    2008-01-01

    Spatial orientation is the sense of body orientation and self-motion relative to the stationary environment, fundamental to normal waking behavior and control of everyday motor actions including eye movements, postural control, and locomotion. The brain achieves spatial orientation by integrating visual, vestibular, and somatosensory signals. Over the past years, considerable progress has been made toward understanding how these signals are processed by the brain using multiple computational approaches that include frequency domain analysis, the concept of internal models, observer theory, Bayesian theory, and Kalman filtering. Here we put these approaches in context by examining the specific questions that can be addressed by each technique and some of the scientific insights that have resulted. We conclude with a recent application of particle filtering, a probabilistic simulation technique that aims to generate the most likely state estimates by incorporating internal models of sensor dynamics and physical laws and noise associated with sensory processing as well as prior knowledge or experience. In this framework, priors for low angular velocity and linear acceleration can explain the phenomena of velocity storage and frequency segregation, both of which have been modeled previously using arbitrary low-pass filtering. How Kalman and particle filters may be implemented by the brain is an emerging field. Unlike past neurophysiological research that has aimed to characterize mean responses of single neurons, investigations of dynamic Bayesian inference should attempt to characterize population activities that constitute probabilistic representations of sensory and prior information. PMID:18842952

  16. New seismogenic stress fields for southern Italy from a Bayesian approach

    NASA Astrophysics Data System (ADS)

    Totaro, Cristina; Orecchio, Barbara; Presti, Debora; Scolaro, Silvia; Neri, Giancarlo

    2017-04-01

    A new database of high-quality waveform inversion focal mechanisms has been compiled for southern Italy by integrating the highest-quality solutions available from literature and catalogues with 146 newly-computed ones. All the selected focal mechanisms either (i) come from the Italian CMT, Regional CMT and TDMT catalogues (Pondrelli et al., PEPI 2006, PEPI 2011; http://www.ingv.it), or (ii) were computed by using the Cut And Paste (CAP) method (Zhao & Helmberger, BSSA 1994; Zhu & Helmberger, BSSA 1996). Specific tests have been carried out in order to evaluate the robustness of the obtained solutions (e.g., by varying both the seismic network configuration and the Earth structure parameters) and to estimate uncertainties on the focal mechanism parameters. Only the resulting highest-quality solutions have been included in the database, which has then been used for computation of posterior density distributions of stress tensor components by a Bayesian method (Arnold & Townend, GJI 2007). This algorithm furnishes the posterior density function of the principal components of the stress tensor (maximum σ1, intermediate σ2, and minimum σ3 compressive stress, respectively) and the stress-magnitude ratio (R). Before stress computation, we applied the k-means clustering algorithm to subdivide the focal mechanism catalog on the basis of earthquake locations. This approach allows identifying the sectors to be investigated without any "a priori" constraint from the faulting type distribution. The large amount of data and the application of the Bayesian algorithm allowed us to provide a more accurate local-to-regional scale stress distribution that has shed new light on the kinematics and dynamics of this very complex area, where the lithospheric unit configuration and geodynamic engines are still strongly debated. The new high-quality information furnished here will represent a very useful tool and constraint for future geophysical analyses and geodynamic modeling.

  17. A Bayesian network approach to the database search problem in criminal proceedings

    PubMed Central

    2012-01-01

    Background The ‘database search problem’, that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions. The method’s graphical environment, along with its computational and probabilistic architectures, represents a rich package that offers analysts and discussants additional modes of interaction, concise representation, and coherent communication. PMID:22849390
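
    A toy numerical version of the underlying probabilistic argument may help. The sketch below uses the simple model common in this literature (uniform prior over N potential sources, no false negatives, random match probability gamma), not the richer Bayesian networks developed in the paper, and all numbers are hypothetical.

        N, n, gamma = 1_000_000, 100_000, 1e-6   # population, database size, match probability

        # Likelihood of "exactly one database member (the suspect) matches":
        like_in_db   = (1 - gamma) ** (n - 1)            # suspect is the source
        like_outside = gamma * (1 - gamma) ** (n - 1)    # source is an untested individual

        prior_in_db, prior_outside = 1 / N, (N - n) / N  # uniform prior over sources
        posterior = prior_in_db * like_in_db / (
            prior_in_db * like_in_db + prior_outside * like_outside)
        print(f"P(suspect is the source | search result) = {posterior:.3f}")
        # Closed form: 1 / (1 + (N - n) * gamma); the search excludes the n - 1
        # non-matching database members, which is why the case is strengthened.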

  18. Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model

    NASA Astrophysics Data System (ADS)

    Al Sobhi, Mashail M.

    2015-02-01

    Bayesian estimates of the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Bayesian prediction bounds for future DGOS from the exponentiated Weibull model are also obtained. Both symmetric and asymmetric loss functions are considered for the Bayesian computations. Markov chain Monte Carlo (MCMC) methods are used to compute the Bayes estimates and prediction bounds. The results are then specialized to lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.
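
    A minimal sketch of the kind of MCMC computation described here is given below, assuming the two-parameter exponentiated Weibull density with unit scale, flat priors on the log-parameters, and ordinary (rather than dual generalized order-statistics) data; the posterior mean corresponds to squared-error loss and the LINEX transform to an asymmetric loss.

        import numpy as np

        rng = np.random.default_rng(2)

        def loglike(a, th, x):
            # Exponentiated Weibull, unit scale: f(x) = a*th*x^(a-1)*exp(-x^a)*(1-exp(-x^a))^(th-1)
            z = x ** a
            return len(x) * (np.log(a) + np.log(th)) + (a - 1) * np.log(x).sum() \
                   - z.sum() + (th - 1) * np.log1p(-np.exp(-z)).sum()

        x = rng.weibull(1.5, 200)          # stand-in data (EW reduces to Weibull when th = 1)
        cur, cur_ll = np.zeros(2), loglike(1.0, 1.0, x)   # random walk on (log a, log th)
        draws = []
        for it in range(20000):
            prop = cur + rng.normal(0, 0.08, 2)
            a, th = np.exp(prop)
            ll = loglike(a, th, x)
            if np.log(rng.uniform()) < ll - cur_ll:       # MH accept; flat priors on logs, assumed
                cur, cur_ll = prop, ll
            if it >= 5000:
                draws.append(np.exp(cur))
        draws = np.array(draws)

        print("posterior means (squared-error loss):", draws.mean(axis=0))
        aL = 1.0                                          # LINEX estimate for an asymmetric loss
        print("LINEX estimates:", -np.log(np.exp(-aL * draws).mean(axis=0)) / aL)
        t0 = 1.5                                          # posterior mean reliability S(t0)
        print("reliability at t0:", (1 - (1 - np.exp(-t0 ** draws[:, 0])) ** draws[:, 1]).mean())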

  19. Structural mapping in statistical word problems: A relational reasoning approach to Bayesian inference.

    PubMed

    Johnson, Eric D; Tubau, Elisabet

    2017-06-01

    Presenting natural frequencies facilitates Bayesian inferences relative to using percentages. Nevertheless, many people, including highly educated and skilled reasoners, still fail to provide Bayesian responses to these computationally simple problems. We show that the complexity of relational reasoning (e.g., the structural mapping between the presented and requested relations) can help explain the remaining difficulties. With a non-Bayesian inference that required identical arithmetic but afforded a more direct structural mapping, performance was universally high. Furthermore, reducing the relational demands of the task through questions that directed reasoners to use the presented statistics, as compared with questions that prompted the representation of a second, similar sample, also significantly improved reasoning. Distinct error patterns were also observed between these presented- and similar-sample scenarios, which suggested differences in relational-reasoning strategies. On the other hand, while higher numeracy was associated with better Bayesian reasoning, higher-numerate reasoners were not immune to the relational complexity of the task. Together, these findings validate the relational-reasoning view of Bayesian problem solving and highlight the importance of considering not only the presented task structure, but also the complexity of the structural alignment between the presented and requested relations.

  20. A bayesian approach to classification criteria for spectacled eiders

    USGS Publications Warehouse

    Taylor, B.L.; Wade, P.R.; Stehn, R.A.; Cochrane, J.F.

    1996-01-01

    To facilitate decisions to classify species according to risk of extinction, we used Bayesian methods to analyze trend data for the Spectacled Eider, an arctic sea duck. Trend data from three independent surveys of the Yukon-Kuskokwim Delta were analyzed individually and in combination to yield posterior distributions for population growth rates. We used classification criteria developed by the recovery team for Spectacled Eiders that seek to equalize errors of under- or overprotecting the species. We conducted both a Bayesian decision analysis and a frequentist (classical statistical inference) decision analysis. Bayesian decision analyses are computationally easier, yield basically the same results, and yield results that are easier to explain to nonscientists. With the exception of the aerial survey analysis of the 10 most recent years, both Bayesian and frequentist methods indicated that an endangered classification is warranted. The discrepancy between surveys warrants further research. Although the trend data are abundance indices, we used a preliminary estimate of absolute abundance to demonstrate how to calculate extinction distributions using the joint probability distributions for population growth rate and variance in growth rate generated by the Bayesian analysis. Recent apparent increases in abundance highlight the need for models that apply to declining and then recovering species.

  1. Bayesian Analysis of Biogeography when the Number of Areas is Large

    PubMed Central

    Landis, Michael J.; Matzke, Nicholas J.; Moore, Brian R.; Huelsenbeck, John P.

    2013-01-01

    Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a “data-augmentation” approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program BayArea. [ancestral area analysis; Bayesian biogeographic inference; data augmentation; historical biogeography; Markov chain Monte Carlo.] PMID:23736102

  2. Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models

    PubMed Central

    Burr, Tom

    2013-01-01

    Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the “go-to” option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example. PMID:24288668
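
    The rejection-ABC loop at the heart of this discussion fits in a dozen lines. The sketch below calibrates a toy normal model; the summary function is the user's choice, which is exactly the sensitivity the paper investigates.

        import numpy as np

        rng = np.random.default_rng(3)
        data = rng.normal(2.0, 1.0, 100)               # observed data, true mean = 2

        def summary(x):
            return np.array([x.mean(), x.std()])       # the user-chosen summary statistics

        s_obs, eps, accepted = summary(data), 0.2, []
        while len(accepted) < 1000:
            theta = rng.uniform(-10, 10)               # draw a candidate from the prior
            sim = rng.normal(theta, 1.0, 100)          # run the stochastic simulator
            if np.linalg.norm(summary(sim) - s_obs) < eps:
                accepted.append(theta)                 # keep thetas that reproduce the summaries
        print(np.mean(accepted), np.std(accepted))     # approximate posterior mean and sd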

  4. Bayesian exponential random graph modelling of interhospital patient referral networks.

    PubMed

    Caimo, Alberto; Pallotti, Francesca; Lomi, Alessandro

    2017-08-15

    Using original data that we have collected on referral relations between 110 hospitals serving a large regional community, we show how recently derived Bayesian exponential random graph models may be adopted to illuminate core empirical issues in research on relational coordination among healthcare organisations. We show how a rigorous Bayesian computation approach supports a fully probabilistic analytical framework that alleviates well-known problems in the estimation of model parameters of exponential random graph models. We also show how the main structural features of interhospital patient referral networks that prior studies have described can be reproduced with accuracy by specifying the system of local dependencies that produce - but at the same time are induced by - decentralised collaborative arrangements between hospitals. Copyright © 2017 John Wiley & Sons, Ltd.

  5. Fast model updating coupling Bayesian inference and PGD model reduction

    NASA Astrophysics Data System (ADS)

    Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic

    2018-04-01

    The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory to address inverse problems and to deal with different sources of uncertainty (measurement and model errors, stochastic parameters). To do so at a reasonable CPU cost, the idea is to replace the full forward model called during Monte Carlo sampling with a PGD reduced model and, in some cases, to compute the probability density functions directly from the resulting analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
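
    The general pattern, independent of the PGD machinery itself, can be sketched as follows: once the forward model is available as a cheap analytical surrogate, the posterior density can be evaluated directly on a parameter grid with no sampling at all. The surrogate, noise level, and prior below are all stand-ins.

        import numpy as np

        # Stand-in for the PGD reduced model: an explicit, cheap-to-evaluate function
        # of the parameter (the real reduced model is a separated-variables
        # approximation of the full simulation; this function is purely illustrative).
        def surrogate(theta):
            return np.sin(theta) + 0.5 * theta

        obs, sigma = 1.2, 0.1                      # measurement and noise level, assumed
        grid = np.linspace(-3.0, 3.0, 2001)
        dx = grid[1] - grid[0]
        log_post = -0.5 * ((obs - surrogate(grid)) / sigma) ** 2 \
                   - 0.5 * (grid / 2.0) ** 2       # N(0, 2^2) prior, an assumption
        post = np.exp(log_post - log_post.max())
        post /= post.sum() * dx                    # normalized posterior density on the grid
        print("posterior mean:", (grid * post).sum() * dx)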

  6. How Recent History Affects Perception: The Normative Approach and Its Heuristic Approximation

    PubMed Central

    Raviv, Ofri; Ahissar, Merav; Loewenstein, Yonatan

    2012-01-01

    There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the “contraction bias”, in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of the most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants’ performance with atypical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of Bayesian inference. We propose a biologically plausible model, in which the decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially-decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations, reflected in the failure to fully adapt to novel environments. PMID:23133343
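
    The proposed heuristic is simple enough to simulate directly. The sketch below, with illustrative parameter values rather than fitted ones, reproduces the qualitative signature of the contraction bias: accuracy is higher when the pull toward recent stimulus history happens to favor the correct answer.

        import numpy as np

        rng = np.random.default_rng(4)
        n, lam, noise = 20000, 0.4, 20.0           # mixing weight and noise sd, assumed
        f1 = rng.uniform(800, 1200, n)             # first-tone frequencies (Hz)
        f2 = f1 + rng.choice([-40.0, 40.0], n)     # second tone 40 Hz lower or higher

        running, percept = f1[0], np.empty(n)
        for t in range(n):
            # First tone is blended with an exponentially-decaying average of the past.
            percept[t] = (1 - lam) * (f1[t] + rng.normal(0, noise)) + lam * running
            running = percept[t]
        correct = (f2 > percept) == (f2 > f1)

        # Trials where the pull toward recent history (~ the range mean) helps:
        aligned = ((f1 < 1000) & (f2 < f1)) | ((f1 > 1000) & (f2 > f1))
        print("accuracy, bias-aligned trials:", correct[aligned].mean())
        print("accuracy, bias-opposed trials:", correct[~aligned].mean())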

  7. Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-04-01

    Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), the normalizing constant in the denominator of Bayes' theorem; while fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support the model receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones selected by information criteria that incorporate only the likelihood maximum. Since the evidence is not particularly easy to estimate in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
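
    A simplified relative of GMIS can be sketched in a few lines: fit a Gaussian mixture to posterior draws and use it as an importance-sampling proposal for the marginal likelihood. The sketch below substitutes plain importance sampling for bridge sampling and a conjugate toy model for DREAM output, so the exact answer is available as a check.

        import numpy as np
        from scipy.stats import norm, multivariate_normal
        from scipy.special import logsumexp
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(5)
        data = rng.normal(1.0, 1.0, 50)      # toy data: N(theta, 1), prior theta ~ N(0, 10^2)

        def log_post_unnorm(theta):
            return norm.logpdf(theta, 0, 10) + norm.logpdf(data[:, None], theta, 1).sum(0)

        # Stand-in for sampler output: posterior draws (conjugate, so exact here).
        v = 1.0 / (1.0 / 100.0 + len(data))
        samples = rng.normal(v * data.sum(), np.sqrt(v), 4000)[:, None]

        gm = GaussianMixture(n_components=3, random_state=0).fit(samples)
        x, _ = gm.sample(20000)              # mixture proposal draws
        logw = log_post_unnorm(x.ravel()) - gm.score_samples(x)
        log_evidence = logsumexp(logw) - np.log(len(logw))

        # Analytic marginal likelihood for this toy model, as a check:
        exact = multivariate_normal.logpdf(data, mean=np.zeros(50), cov=np.eye(50) + 100.0)
        print(log_evidence, exact)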

  8. An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

    2013-10-01

    Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with the numerous model executions required to explore the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for a PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating high-dimensional posterior distributions. The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of the PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.

  9. Bayesian Software Health Management for Aircraft Guidance, Navigation, and Control

    NASA Technical Reports Server (NTRS)

    Schumann, Johann; Mbaya, Timmy; Mengshoel, Ole

    2011-01-01

    Modern aircraft, both piloted fly-by-wire commercial aircraft and UAVs, increasingly depend on highly complex, safety-critical software systems with many sensors and computer-controlled actuators. Despite careful design and V&V of the software, severe incidents have occurred due to malfunctioning software. In this paper, we discuss the use of Bayesian networks (BNs) to monitor the health of the on-board software and sensor system, and to perform advanced on-board diagnostic reasoning. We focus on the approach to developing reliable and robust health models for the combined software and sensor systems.

  10. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach

    PubMed Central

    Boitard, Simon; Rodríguez, Willy; Jay, Flora; Mona, Stefano; Austerlitz, Frédéric

    2016-01-01

    Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. PMID:26943927

  11. Bayesian probabilistic approach for inverse source determination from limited and noisy chemical or biological sensor concentration measurements

    NASA Astrophysics Data System (ADS)

    Yee, Eugene

    2007-04-01

    Although a great deal of research effort has been focused on the forward prediction of the dispersion of contaminants (e.g., chemical and biological warfare agents) released into the turbulent atmosphere, much less work has been directed toward the inverse prediction of agent source location and strength from the measured concentration, even though the importance of this problem for a number of practical applications is obvious. In general, the inverse problem of source reconstruction is ill-posed and unsolvable without additional information. It is demonstrated that a Bayesian probabilistic inferential framework provides a natural and logically consistent method for source reconstruction from a limited number of noisy concentration data. In particular, the Bayesian approach permits one to incorporate prior knowledge about the source as well as additional information regarding both model and data errors. The latter enables a rigorous determination of the uncertainty in the inference of the source parameters (e.g., spatial location, emission rate, release time, etc.), hence extending the potential of the methodology as a tool for quantitative source reconstruction. A model (or, source-receptor relationship) that relates the source distribution to the concentration data measured by a number of sensors is formulated, and Bayesian probability theory is used to derive the posterior probability density function of the source parameters. A computationally efficient methodology for determination of the likelihood function for the problem, based on an adjoint representation of the source-receptor relationship, is described. Furthermore, we describe the application of efficient stochastic algorithms based on Markov chain Monte Carlo (MCMC) for sampling from the posterior distribution of the source parameters, the latter of which is required to undertake the Bayesian computation. The Bayesian inferential methodology for source reconstruction is validated against real dispersion data for two cases involving contaminant dispersion in highly disturbed flows over urban and complex environments where the idealizations of horizontal homogeneity and/or temporal stationarity in the flow cannot be applied to simplify the problem. Furthermore, the methodology is applied to the case of reconstruction of multiple sources.

  12. Comparing hierarchical models via the marginalized deviance information criterion.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2018-07-20

    Hierarchical models are extensively used in pharmacokinetics and longitudinal studies. When the estimation is performed from a Bayesian approach, model comparison is often based on the deviance information criterion (DIC). In hierarchical models with latent variables, there are several versions of this statistic: the conditional DIC (cDIC) that incorporates the latent variables in the focus of the analysis and the marginalized DIC (mDIC) that integrates them out. Despite the asymptotic and coherency difficulties of cDIC, this alternative is usually used in Markov chain Monte Carlo (MCMC) methods for hierarchical models because of practical convenience. The mDIC criterion is more appropriate in most cases but requires integration of the likelihood, which is computationally demanding and not implemented in Bayesian software. Therefore, we consider a method to compute mDIC by generating replicate samples of the latent variables that need to be integrated out. This alternative can be easily conducted from the MCMC output of Bayesian packages and is widely applicable to hierarchical models in general. Additionally, we propose some approximations in order to reduce the computational complexity for large-sample situations. The method is illustrated with simulated data sets and two medical studies, evidencing that cDIC may be misleading whilst mDIC appears pertinent. Copyright © 2018 John Wiley & Sons, Ltd.
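
    The replicate-sampling idea is easy to sketch for a toy random-intercept model: for each posterior draw of the hyperparameters, the latent effects are integrated out by Monte Carlo, giving the marginal deviance that mDIC needs. The hyperparameter draws below are stand-ins for real MCMC output.

        import numpy as np
        from scipy.stats import norm
        from scipy.special import logsumexp

        rng = np.random.default_rng(6)
        # Toy hierarchical data: y_ij = b_i + e_ij, with a latent intercept b_i per subject.
        n_subj, n_rep = 30, 5
        b = rng.normal(0.0, 1.0, n_subj)
        y = b[:, None] + rng.normal(0.0, 0.5, (n_subj, n_rep))

        # Stand-in for MCMC draws of the hyperparameters theta = (mu, tau, sigma);
        # a real analysis would take these from the sampler's chains.
        M = 500
        mu_s  = rng.normal(0.0, 0.05, M)
        tau_s = np.abs(rng.normal(1.0, 0.05, M))
        sig_s = np.abs(rng.normal(0.5, 0.02, M))

        def log_marg_like(mu, tau, sig, K=200):
            # Integrate the latent b_i out by Monte Carlo over replicate draws b ~ N(mu, tau).
            bk = rng.normal(mu, tau, K)
            ll = norm.logpdf(y[:, :, None], bk, sig).sum(axis=1)   # (n_subj, K)
            return (logsumexp(ll, axis=1) - np.log(K)).sum()

        D = np.array([-2 * log_marg_like(m, t, s) for m, t, s in zip(mu_s, tau_s, sig_s)])
        D_at_mean = -2 * log_marg_like(mu_s.mean(), tau_s.mean(), sig_s.mean())
        pD = D.mean() - D_at_mean
        print("mDIC =", D.mean() + pD)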

  13. Bayesian analysis of rare events

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Straub, Daniel, E-mail: straub@tum.de; Papaioannou, Iason; Betz, Wolfgang

    2016-06-01

    In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into the probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.
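
    The reinterpretation at the core of BUS starts from plain rejection sampling, shown below for a toy problem: draw from the prior and accept with probability proportional to the likelihood. BUS recasts this acceptance event as a rare-event domain so that FORM, IS, or SuS can be applied; the sketch stops at the plain version, and all numbers are hypothetical.

        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(7)
        data = np.array([2.1, 1.9, 2.4])          # hypothetical measurements

        def likelihood(theta):
            return np.exp(norm.logpdf(data, theta, 0.3).sum())

        c = likelihood(data.mean())               # likelihood bound; maximized at the sample mean

        posterior = []
        while len(posterior) < 2000:
            theta = rng.normal(0, 2)              # draw from the prior N(0, 2^2)
            if rng.uniform() <= likelihood(theta) / c:
                posterior.append(theta)           # accepted draws follow the posterior
        print(np.mean(posterior), np.std(posterior))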

  14. Quantification of downscaled precipitation uncertainties via Bayesian inference

    NASA Astrophysics Data System (ADS)

    Nury, A. H.; Sharma, A.; Marshall, L. A.

    2017-12-01

    Prediction of precipitation from global climate model (GCM) outputs remains critical to decision-making in water-stressed regions. In this regard, downscaling of GCM output has been a useful tool for analysing future hydro-climatological states. Several approaches have been developed for precipitation downscaling, using either dynamical or statistical methods. Frequently, outputs from dynamical downscaling are not readily transferable across regions because of significant methodological and computational difficulties. Statistical downscaling approaches provide a flexible and efficient alternative, providing hydro-climatological outputs across multiple temporal and spatial scales in many locations. However, these approaches are subject to significant uncertainty, arising from uncertainty in the downscaled model parameters and from the use of different reanalysis products for inferring appropriate model parameters. Consequently, these uncertainties affect simulation performance at the catchment scale. This study develops a Bayesian framework for modelling downscaled daily precipitation from GCM outputs, characterizing the downscaling uncertainties by evaluating reanalysis datasets against observational rainfall data over Australia. A consistent technique for quantifying downscaling uncertainties by means of a Bayesian downscaling framework is proposed. The results suggest that there are differences in downscaled precipitation occurrences and extremes.

  15. A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data

    PubMed Central

    Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P.; Engel, Lawrence S.; Kwok, Richard K.; Blair, Aaron; Stewart, Patricia A.

    2016-01-01

    Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method’s performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates and also provides the distributions of all of the parameters, which could be useful for making decisions in some applications. PMID:26209598
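
    The Bayesian treatment of censored observations reduces to one modeling choice: values below the detection limit contribute the lognormal CDF at the limit instead of the density. The grid-based sketch below, with flat priors and simulated data (both assumptions for brevity), shows that choice and the resulting posterior for the arithmetic mean.

        import numpy as np
        from scipy.stats import norm

        rng = np.random.default_rng(8)
        # Simulated lognormal exposures, left-censored at a detection limit.
        observed = np.exp(rng.normal(np.log(0.5), 0.8, 60))
        dl = 0.4
        cens = observed < dl                         # below the LOD: value unknown

        # Log-posterior on a (mu, sd) grid; flat priors are an assumption here,
        # and the text notes that informative priors generally help.
        mu_g = np.linspace(-2.5, 1.0, 200)
        sd_g = np.linspace(0.2, 2.0, 180)
        MU, SD = np.meshgrid(mu_g, sd_g, indexing="ij")
        lp = np.zeros_like(MU)
        for i, mu in enumerate(mu_g):
            for j, sd in enumerate(sd_g):
                ll = norm.logpdf(np.log(observed[~cens]), mu, sd).sum()
                ll += cens.sum() * norm.logcdf(np.log(dl), mu, sd)  # censored part
                lp[i, j] = ll
        post = np.exp(lp - lp.max())
        post /= post.sum()

        # Posterior for the arithmetic mean AM = exp(mu + sd^2 / 2):
        am = np.exp(MU + SD ** 2 / 2)
        print("posterior mean of AM:", (post * am).sum())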

  16. Inference of reaction rate parameters based on summary statistics from experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khalil, Mohammad; Chowdhary, Kamaljit Singh; Safta, Cosmin

    Here, we present the results of an application of Bayesian inference and maximum entropy methods for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain branching reaction H + O2 → OH + O. Available published data is in the form of summary statistics in terms of nominal values and error bars of the rate coefficient of this reaction at a number of temperature values obtained from shock-tube experiments. Our approach relies on generating data, in this case OH concentration profiles, consistent with the given summary statistics, using Approximate Bayesian Computation methods and a Markov Chain Monte Carlo procedure. The approach permits the forward propagation of parametric uncertainty through the computational model in a manner that is consistent with the published statistics. A consensus joint posterior on the parameters is obtained by pooling the posterior parameter densities given each consistent data set. To expedite this process, we construct efficient surrogates for the OH concentration using a combination of Padé and polynomial approximants. These surrogate models adequately represent forward model observables and their dependence on input parameters and are computationally efficient enough to allow their use in the Bayesian inference procedure. We also utilize Gauss-Hermite quadrature with Gaussian proposal probability density functions for moment computation, resulting in orders of magnitude speedup in data likelihood evaluation. Despite the strong non-linearity in the model, the consistent data sets all result in nearly Gaussian conditional parameter probability density functions. The technique also accounts for nuisance parameters in the form of Arrhenius parameters of other rate coefficients with prescribed uncertainty. The resulting pooled parameter probability density function is propagated through stoichiometric hydrogen-air auto-ignition computations to illustrate the need to account for correlation among the Arrhenius rate parameters of one reaction and across rate parameters of different reactions.

  18. Influence of gene flow on divergence dating - implications for the speciation history of Takydromus grass lizards.

    PubMed

    Tseng, Shu-Ping; Li, Shou-Hsien; Hsieh, Chia-Hung; Wang, Hurng-Yi; Lin, Si-Min

    2014-10-01

    Dating the time of divergence and understanding speciation processes are central to the study of the evolutionary history of organisms but are notoriously difficult. The difficulty is largely rooted in variations in the ancestral population size or in the genealogy variation across loci. To depict the speciation processes and divergence histories of three monophyletic Takydromus species endemic to Taiwan, we sequenced 20 nuclear loci and combined them with one mitochondrial locus published in GenBank. They were analysed by a multispecies coalescent approach within a Bayesian framework. Divergence dating based on the gene tree approach showed high variation among loci, and the divergence was estimated at an earlier date than when derived by the species-tree approach. To test whether variations in the ancestral population size accounted for the majority of this variation, we conducted inferences using isolation-with-migration (IM) and approximate Bayesian computation (ABC) frameworks. The results revealed that gene flow during the early stage of speciation was strongly favoured over the isolation model, and the initiation of the speciation process was far earlier than the dates estimated by gene- and species-based divergence dating. Given their limited dispersal ability, geographical isolation may have played a major role in the divergence of these Takydromus species. Nevertheless, this study reveals a more complex situation and demonstrates that gene flow during the speciation process cannot be overlooked and may have a great impact on divergence dating. By using multilocus data and incorporating Bayesian coalescence approaches, we provide a more biologically realistic framework for delineating the divergence history of Takydromus. © 2014 John Wiley & Sons Ltd.

  19. Accounting for model error in Bayesian solutions to hydrogeophysical inverse problems using a local basis approach

    NASA Astrophysics Data System (ADS)

    Köpke, Corinna; Irving, James; Elsheikh, Ahmed H.

    2018-06-01

    Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward model linking subsurface physical properties to measured data, which is typically assumed to be perfectly known in the inversion procedure. However, to make the stochastic solution of the inverse problem computationally tractable using methods such as Markov-chain-Monte-Carlo (MCMC), fast approximations of the forward model are commonly employed. This gives rise to model error, which has the potential to significantly bias posterior statistics if not properly accounted for. Here, we present a new methodology for dealing with the model error arising from the use of approximate forward solvers in Bayesian solutions to hydrogeophysical inverse problems. Our approach is geared towards the common case where this error cannot be (i) effectively characterized through some parametric statistical distribution; or (ii) estimated by interpolating between a small number of computed model-error realizations. To this end, we focus on identification and removal of the model-error component of the residual during MCMC using a projection-based approach, whereby the orthogonal basis employed for the projection is derived in each iteration from the K-nearest-neighboring entries in a model-error dictionary. The latter is constructed during the inversion and grows at a specified rate as the iterations proceed. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar travel-time data considering three different subsurface parameterizations of varying complexity. Synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed for their inversion. In each case, our developed approach enables us to remove posterior bias and obtain a more realistic characterization of uncertainty.
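
    The projection step described above can be sketched roughly as follows, assuming a dictionary that pairs approximate simulations with their model-error realizations; the nearest-neighbor search, SVD basis, and dimensions are all illustrative stand-ins for the paper's construction.

        import numpy as np

        rng = np.random.default_rng(9)
        n_data = 50

        # Hypothetical dictionary accumulated during MCMC: row i pairs an approximate
        # simulation with the model-error realization (detailed minus approximate
        # solver output) computed at a visited parameter set.
        approx_sims = rng.normal(0, 1, (200, n_data))
        model_errors = 0.3 * rng.normal(0, 1, (200, n_data))

        def remove_model_error(residual, current_sim, K=8):
            # K-nearest dictionary entries, keyed on the approximate simulations.
            idx = np.argsort(np.linalg.norm(approx_sims - current_sim, axis=1))[:K]
            # Orthonormal basis for the local model-error subspace (via SVD).
            U = np.linalg.svd(model_errors[idx].T, full_matrices=False)[0]
            return residual - U @ (U.T @ residual)   # strip the projected model-error part

        residual = rng.normal(0, 1, n_data)          # stand-in for data - approx_model(theta)
        clean = remove_model_error(residual, approx_sims[0])
        print(np.linalg.norm(residual), np.linalg.norm(clean))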

  20. A Bayesian network analysis of posttraumatic stress disorder symptoms in adults reporting childhood sexual abuse

    PubMed Central

    McNally, Richard J.; Heeren, Alexandre; Robinaugh, Donald J.

    2017-01-01

    ABSTRACT Background: The network approach to mental disorders offers a novel framework for conceptualizing posttraumatic stress disorder (PTSD) as a causal system of interacting symptoms. Objective: In this study, we extended this work by estimating the structure of relations among PTSD symptoms in adults reporting personal histories of childhood sexual abuse (CSA; N = 179). Method: We employed two complementary methods. First, using the graphical LASSO, we computed a sparse, regularized partial correlation network revealing associations (edges) between pairs of PTSD symptoms (nodes). Next, using a Bayesian approach, we computed a directed acyclic graph (DAG) to estimate a directed, potentially causal model of the relations among symptoms. Results: For the first network, we found that physiological reactivity to reminders of trauma, dreams about the trauma, and loss of interest in previously enjoyed activities were highly central nodes. However, stability analyses suggest that these findings were unstable across subsets of our sample. The DAG suggests that becoming physiologically reactive and upset in response to reminders of the trauma may be key drivers of other symptoms in adult survivors of CSA. Conclusions: Our study illustrates the strengths and limitations of these network analytic approaches to PTSD. PMID:29038690
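
    The first of the two methods, the regularized partial correlation network, can be sketched with scikit-learn's graphical lasso; the symptom data below are simulated stand-ins, and partial correlations are recovered from the estimated precision matrix in the usual way.

        import numpy as np
        from sklearn.covariance import GraphicalLassoCV

        rng = np.random.default_rng(10)
        # Simulated stand-in for 179 respondents x 17 PTSD symptom scores, with a
        # shared factor so that some conditional dependencies survive regularization.
        base = rng.normal(0, 1, (179, 1))
        X = rng.normal(0, 1, (179, 17)) + 0.6 * base

        P = GraphicalLassoCV().fit(X).precision_
        d = np.sqrt(np.diag(P))
        pcorr = -P / np.outer(d, d)                  # regularized partial correlations
        np.fill_diagonal(pcorr, 0.0)
        print("nonzero edges:", (np.abs(pcorr) > 1e-6).sum() // 2)
        print("strongest edge weight:", np.abs(pcorr).max().round(3))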

  1. Coestimation of recombination, substitution and molecular adaptation rates by approximate Bayesian computation.

    PubMed

    Lopes, J S; Arenas, M; Posada, D; Beaumont, M A

    2014-03-01

    The estimation of parameters in molecular evolution may be biased when some processes are not considered. For example, the estimation of selection at the molecular level using codon-substitution models can have an upward bias when recombination is ignored. Here we address the joint estimation of recombination, molecular adaptation and substitution rates from coding sequences using approximate Bayesian computation (ABC). We describe the implementation of a regression-based strategy for choosing subsets of summary statistics for coding data, and show that this approach can accurately infer recombination allowing for intracodon recombination breakpoints, molecular adaptation and codon substitution rates. We demonstrate that our ABC approach can outperform other analytical methods under a variety of evolutionary scenarios. We also show that although the choice of the codon-substitution model is important, our inferences are robust to a moderate degree of model misspecification. In addition, we demonstrate that our approach can accurately choose the evolutionary model that best fits the data, providing an alternative for when the use of full-likelihood methods is impracticable. Finally, we applied our ABC method to co-estimate recombination, substitution and molecular adaptation rates from 24 published human immunodeficiency virus 1 coding data sets.

  2. An Uncertainty Quantification Framework for Prognostics and Condition-Based Monitoring

    NASA Technical Reports Server (NTRS)

    Sankararaman, Shankar; Goebel, Kai

    2014-01-01

    This paper presents a computational framework for uncertainty quantification in prognostics in the context of condition-based monitoring of aerospace systems. The different sources of uncertainty and the various uncertainty quantification activities in condition-based prognostics are outlined in detail, and it is demonstrated that the Bayesian subjective approach is suitable for interpreting uncertainty in online monitoring. A state-space model-based framework for prognostics, that can rigorously account for the various sources of uncertainty, is presented. Prognostics consists of two important steps. First, the state of the system is estimated using Bayesian tracking, and then, the future states of the system are predicted until failure, thereby computing the remaining useful life of the system. The proposed framework is illustrated using the power system of a planetary rover test-bed, which is being developed and studied at NASA Ames Research Center.

  3. al3c: high-performance software for parameter inference using Approximate Bayesian Computation.

    PubMed

    Stram, Alexander H; Marjoram, Paul; Chen, Gary K

    2015-11-01

    The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While the development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) helps address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS-X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. Contact: astram@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Attention in a Bayesian Framework

    PubMed Central

    Whiteley, Louise; Sahani, Maneesh

    2012-01-01

    The behavioral phenomena of sensory attention are thought to reflect the allocation of a limited processing resource, but there is little consensus on the nature of the resource or why it should be limited. Here we argue that a fundamental bottleneck emerges naturally within Bayesian models of perception, and use this observation to frame a new computational account of the need for, and action of, attention – unifying diverse attentional phenomena in a way that goes beyond previous inferential, probabilistic and Bayesian models. Attentional effects are most evident in cluttered environments, and include both selective phenomena, where attention is invoked by cues that point to particular stimuli, and integrative phenomena, where attention is invoked dynamically by endogenous processing. However, most previous Bayesian accounts of attention have focused on describing relatively simple experimental settings, where cues shape expectations about a small number of upcoming stimuli and thus convey “prior” information about clearly defined objects. While operationally consistent with the experiments it seeks to describe, this view of attention as prior seems to miss many essential elements of both its selective and integrative roles, and thus cannot be easily extended to complex environments. We suggest that the resource bottleneck stems from the computational intractability of exact perceptual inference in complex settings, and that attention reflects an evolved mechanism for approximate inference which can be shaped to refine the local accuracy of perception. We show that this approach extends the simple picture of attention as prior, so as to provide a unified and computationally driven account of both selective and integrative attentional phenomena. PMID:22712010

  5. Real-time inversions for finite fault slip models and rupture geometry based on high-rate GPS data

    USGS Publications Warehouse

    Minson, Sarah E.; Murray, Jessica R.; Langbein, John O.; Gomberg, Joan S.

    2015-01-01

    We present an inversion strategy capable of using real-time high-rate GPS data to simultaneously solve for a distributed slip model and fault geometry in real time as a rupture unfolds. We employ Bayesian inference to find the optimal fault geometry and the distribution of possible slip models for that geometry using a simple analytical solution. By adopting an analytical Bayesian approach, we can solve this complex inversion problem (including calculating the uncertainties on our results) in real time. Furthermore, since the joint inversion for distributed slip and fault geometry can be computed in real time, the time required to obtain a source model of the earthquake does not depend on the computational cost. Instead, the time required is controlled by the duration of the rupture and the time required for information to propagate from the source to the receivers. We apply our modeling approach, called Bayesian Evidence-based Fault Orientation and Real-time Earthquake Slip, to the 2011 Tohoku-oki earthquake, 2003 Tokachi-oki earthquake, and a simulated Hayward fault earthquake. In all three cases, the inversion recovers the magnitude, spatial distribution of slip, and fault geometry in real time. Since our inversion relies on static offsets estimated from real-time high-rate GPS data, we also present performance tests of various approaches to estimating quasi-static offsets in real time. We find that the raw high-rate time series are the best data to use for determining the moment magnitude of the event, but slightly smoothing the raw time series helps stabilize the inversion for fault geometry.
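
    For a fixed fault geometry, a linear slip inversion with Gaussian noise and a Gaussian prior has the closed-form conjugate posterior that makes a real-time approach feasible. The sketch below uses made-up Green's functions and hyperparameters, and omits the paper's search over geometry and any positivity constraints.

        import numpy as np

        rng = np.random.default_rng(11)
        # Hypothetical linear forward model: d = G m + noise, with m the slip on
        # n_patch fault patches and G the precomputed elastic Green's functions.
        n_obs, n_patch = 30, 10
        G = rng.normal(0, 1, (n_obs, n_patch))
        m_true = np.maximum(rng.normal(1.0, 0.5, n_patch), 0)
        d = G @ m_true + rng.normal(0, 0.05, n_obs)

        Cd_inv = np.eye(n_obs) / 0.05**2          # data precision (assumed noise level)
        Cm_inv = np.eye(n_patch) / 1.0**2         # Gaussian prior precision on slip
        # Conjugate (analytical) posterior: no sampling needed, hence real-time capable.
        C_post = np.linalg.inv(G.T @ Cd_inv @ G + Cm_inv)
        m_post = C_post @ (G.T @ Cd_inv @ d)
        print("posterior mean slip:", m_post.round(2))
        print("1-sigma uncertainties:", np.sqrt(np.diag(C_post)).round(3))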

  6. Predicting uncertainty in future marine ice sheet volume using Bayesian statistical methods

    NASA Astrophysics Data System (ADS)

    Davis, A. D.

    2015-12-01

    The marine ice instability can trigger rapid retreat of marine ice streams. Recent observations suggest that marine ice systems in West Antarctica have begun retreating. However, unknown ice dynamics, computationally intensive mathematical models, and uncertain parameters in these models make predicting retreat rate and ice volume difficult. In this work, we fuse current observational data with ice stream/shelf models to develop probabilistic predictions of future grounded ice sheet volume. Given observational data (e.g., thickness, surface elevation, and velocity) and a forward model that relates uncertain parameters (e.g., basal friction and basal topography) to these observations, we use a Bayesian framework to define a posterior distribution over the parameters. A stochastic predictive model then propagates uncertainties in these parameters to uncertainty in a particular quantity of interest (QoI): here, the volume of grounded ice at a specified future time. While the Bayesian approach can in principle characterize the posterior predictive distribution of the QoI, the computational cost of both the forward and predictive models makes this effort prohibitively expensive. To tackle this challenge, we introduce a new Markov chain Monte Carlo method that constructs convergent approximations of the QoI target density in an online fashion, yielding accurate characterizations of future ice sheet volume at significantly reduced computational cost. Our second goal is to attribute uncertainty in these Bayesian predictions to uncertainties in particular parameters. Doing so can help target data collection, for the purpose of constraining the parameters that contribute most strongly to uncertainty in the future volume of grounded ice. For instance, smaller uncertainties in parameters to which the QoI is highly sensitive may account for more variability in the prediction than larger uncertainties in parameters to which the QoI is less sensitive. We use global sensitivity analysis to help answer this question, and make the computation of sensitivity indices computationally tractable using a combination of polynomial chaos and Monte Carlo techniques.

  7. Children Can Solve Bayesian Problems: The Role of Representation in Mental Computation

    ERIC Educational Resources Information Center

    Zhu, Liqi; Gigerenzer, Gerd

    2006-01-01

    Can children reason the Bayesian way? We argue that the answer to this question depends on how numbers are represented, because a representation can do part of the computation. We test, for the first time, whether Bayesian reasoning can be elicited in children by means of natural frequencies. We show that when information was presented to fourth,…
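
    The point that the representation can do part of the computation is easy to see numerically: the sketch below solves the same screening-style problem once with probabilities and Bayes' rule and once with natural frequencies, where the answer is a single division. The numbers are illustrative.

        # Probability format: Bayes' rule applied explicitly.
        base_rate, sens, false_pos = 0.01, 0.8, 0.1
        p_pos = base_rate * sens + (1 - base_rate) * false_pos
        p_disease_given_pos = base_rate * sens / p_pos

        # Natural-frequency format: out of 1000 people, 10 have the condition and 8 of
        # them test positive; about 99 of the 990 others also test positive.
        sick_pos, healthy_pos = 8, 99
        nf_answer = sick_pos / (sick_pos + healthy_pos)

        print(round(p_disease_given_pos, 3), round(nf_answer, 3))  # ~0.075 either way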

  8. Basics of Bayesian methods.

    PubMed

    Ghosh, Sujit K

    2010-01-01

    Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
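
    The prior-to-posterior updating described here is most transparent in the conjugate Beta-Binomial case, sketched below with illustrative numbers: the prior pseudo-counts and the observed counts simply add.

        from scipy.stats import beta

        a_prior, b_prior = 2, 2          # prior: Beta(2, 2), mildly informative around 0.5
        successes, failures = 7, 3       # current data

        posterior = beta(a_prior + successes, b_prior + failures)
        print("posterior mean:", posterior.mean())
        print("95% credible interval:", posterior.interval(0.95))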

  9. Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murakami, Haruko; Chen, X.; Hahn, Melanie S.

    2010-10-21

    This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are its ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with the depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using actual data that were not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.

  10. Bayesian Estimation of Combined Accuracy for Tests with Verification Bias

    PubMed Central

    Broemeling, Lyle D.

    2011-01-01

    This presentation will emphasize the estimation of the combined accuracy of two or more tests when verification bias is present. Verification bias occurs when some of the subjects are not verified by the gold standard. The approach is Bayesian, where the estimation of test accuracy is based on the posterior distribution of the relevant parameter. The accuracy of two combined binary tests is estimated employing either the "believe the positive" or the "believe the negative" rule, and the true and false positive fractions for each rule are computed for the two tests. In order to perform the analysis, the missing at random assumption is imposed, and an interesting example is provided by estimating the combined accuracy of CT and MRI to diagnose lung cancer. The Bayesian approach is extended to two ordinal tests when verification bias is present, and the accuracy of the combined tests is based on the ROC area of the risk function. An example involving mammography with two readers with extreme verification bias illustrates the estimation of the combined test accuracy for ordinal tests. PMID:26859487
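    The two combination rules are straightforward to evaluate once individual test accuracies are fixed. The sketch below computes true and false positive fractions under "believe the positive" and "believe the negative", assuming, purely for illustration, conditional independence of the two tests given disease status and invented accuracies; the paper's Bayesian treatment would instead place posterior distributions over these quantities.

        # Combined accuracy of two binary tests under BP and BN rules,
        # assuming conditional independence given disease status (illustration only).
        se1, sp1 = 0.85, 0.90   # invented sensitivity/specificity of test 1
        se2, sp2 = 0.75, 0.95   # invented sensitivity/specificity of test 2

        # Believe the positive: call positive if either test is positive.
        tpf_bp = 1 - (1 - se1) * (1 - se2)   # combined true-positive fraction
        fpf_bp = 1 - sp1 * sp2               # combined false-positive fraction

        # Believe the negative: call positive only if both tests are positive.
        tpf_bn = se1 * se2
        fpf_bn = (1 - sp1) * (1 - sp2)

        print(f"BP: TPF={tpf_bp:.3f}, FPF={fpf_bp:.3f}")
        print(f"BN: TPF={tpf_bn:.3f}, FPF={fpf_bn:.3f}")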

  11. Bayesian dynamic mediation analysis.

    PubMed

    Huang, Jing; Yuan, Ying

    2017-12-01

    Most existing methods for mediation analysis assume that mediation is a stationary, time-invariant process, which overlooks the inherently dynamic nature of many human psychological processes and behavioral activities. In this article, we consider mediation as a dynamic process that continuously changes over time. We propose Bayesian multilevel time-varying coefficient models to describe and estimate such dynamic mediation effects. By taking the nonparametric penalized spline approach, the proposed method is flexible and able to accommodate any shape of the relationship between time and mediation effects. Simulation studies show that the proposed method works well and faithfully reflects the true nature of the mediation process. By modeling mediation effect nonparametrically as a continuous function of time, our method provides a valuable tool to help researchers obtain a more complete understanding of the dynamic nature of the mediation process underlying psychological and behavioral phenomena. We also briefly discuss an alternative approach of using dynamic autoregressive mediation model to estimate the dynamic mediation effect. The computer code is provided to implement the proposed Bayesian dynamic mediation analysis.

  12. A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study

    ERIC Educational Resources Information Center

    Kaplan, David; Chen, Jianshen

    2012-01-01

    A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for…

  13. Bayesian-Driven First-Principles Calculations for Accelerating Exploration of Fast Ion Conductors for Rechargeable Battery Application.

    PubMed

    Jalem, Randy; Kanamori, Kenta; Takeuchi, Ichiro; Nakayama, Masanobu; Yamasaki, Hisatsugu; Saito, Toshiya

    2018-04-11

    Safe and robust batteries are urgently needed today as power sources for electric vehicles, and interest is growing in fabricating them with solid electrolytes. Materials search by density functional theory (DFT) methods offers great promise for finding new solid electrolytes, but the evaluation is known to be computationally expensive, particularly for the ion migration property. In this work, we proposed a Bayesian-optimization-driven, DFT-based approach to efficiently screen for compounds with low ion migration energy barriers. We demonstrated this on 318 tavorite-type Li- and Na-containing compounds. We found that the scheme requires only ~30% of the total DFT migration-energy evaluations on average to recover the optimal compound ~90% of the time. Its recovery performance for desired compounds in the tavorite search space (those with migration barriers < 0.3 eV) is ~2× that of random search. Our approach offers a promising way to address computational bottlenecks in large-scale material screening for fast ionic conductors.
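    The screening loop described here is an instance of Bayesian optimization: fit a surrogate to the compounds evaluated so far, pick the next candidate with an acquisition function, and repeat. The Python sketch below uses a Gaussian process surrogate with expected improvement; the descriptor matrix and the energy function are invented stand-ins for the compound search space and the DFT calculation.

        # Bayesian optimization sketch: GP surrogate + expected improvement.
        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import Matern

        rng = np.random.default_rng(1)
        candidates = rng.uniform(0, 1, (318, 2))   # hypothetical compound descriptors

        def energy(x):                             # invented stand-in for DFT barriers
            return (x[:, 0] - 0.3) ** 2 + (x[:, 1] - 0.7) ** 2

        evaluated = list(rng.choice(318, 5, replace=False))   # small random seed set
        for _ in range(25):
            X = candidates[evaluated]
            y = energy(X)
            gp = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, y)
            mu, sd = gp.predict(candidates, return_std=True)
            best = y.min()
            z = (best - mu) / np.maximum(sd, 1e-12)
            ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
            ei[evaluated] = -np.inf                            # never re-evaluate
            evaluated.append(int(np.argmax(ei)))

        print("best barrier found:", energy(candidates[evaluated]).min())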

  14. A new prior for bayesian anomaly detection: application to biosurveillance.

    PubMed

    Shen, Y; Cooper, G F

    2010-01-01

    Bayesian anomaly detection computes posterior probabilities of anomalous events by combining prior beliefs and evidence from data. However, the specification of prior probabilities can be challenging. This paper describes a Bayesian prior in the context of disease outbreak detection. The goal is to provide a meaningful, easy-to-use prior that yields a posterior probability of an outbreak that performs at least as well as a standard frequentist approach. If this goal is achieved, the resulting posterior could be usefully incorporated into a decision analysis about how to act in light of a possible disease outbreak. This paper describes a Bayesian method for anomaly detection that combines learning from data with a semi-informative prior probability over patterns of anomalous events. A univariate version of the algorithm is presented here for ease of illustration of the essential ideas. The paper describes the algorithm in the context of disease-outbreak detection, but it is general and can be used in other anomaly detection applications. For this application, the semi-informative prior specifies that an increased count over baseline is expected for the variable being monitored, such as the number of respiratory chief complaints per day at a given emergency department. The semi-informative prior is derived from the baseline prior, which is estimated from historical data. The evaluation reported here used semi-synthetic data to evaluate the detection performance of the proposed Bayesian method and a control chart method, which is a standard frequentist algorithm that is closest to the Bayesian method in terms of the type of data it uses. The disease-outbreak detection performance of the Bayesian method was statistically significantly better than that of the control chart method when proper baseline periods were used to estimate the baseline behavior to avoid seasonal effects. When using longer baseline periods, the Bayesian method performed as well as the control chart method. The time complexity of the Bayesian algorithm is linear in the number of observed events being monitored, due to a novel, closed-form derivation that is introduced in the paper. This paper introduces a novel prior probability for Bayesian outbreak detection that is expressive, easy to apply, computationally efficient, and performs as well as or better than a standard frequentist method.
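    At its core, the posterior probability of an outbreak is a two-hypothesis Bayes computation on the monitored count. The sketch below uses a Poisson baseline and a crude discrete prior over rate-increase factors; the numbers and the prior form are invented and differ from the paper's semi-informative prior and closed-form derivation.

        # Two-hypothesis Bayes computation for a single day's count.
        import numpy as np
        from scipy import stats

        baseline_rate = 20.0                  # invented mean daily complaint count
        count_today = 32

        p_outbreak_prior = 0.01               # invented prior probability of outbreak
        factors = np.array([1.2, 1.5, 2.0])   # candidate rate increases, equal weight

        like_null = stats.poisson.pmf(count_today, baseline_rate)
        like_alt = stats.poisson.pmf(count_today, baseline_rate * factors).mean()

        post = (p_outbreak_prior * like_alt) / (
            p_outbreak_prior * like_alt + (1 - p_outbreak_prior) * like_null
        )
        print(f"P(outbreak | count) = {post:.4f}")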

  15. Portfolios in Stochastic Local Search: Efficiently Computing Most Probable Explanations in Bayesian Networks

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Roth, Dan; Wilkins, David C.

    2001-01-01

    Portfolio methods support the combination of different algorithms and heuristics, including stochastic local search (SLS) heuristics, and have been identified as a promising approach to solve computationally hard problems. While successful in experiments, theoretical foundations and analytical results for portfolio-based SLS heuristics are less developed. This article aims to improve the understanding of the role of portfolios of heuristics in SLS. We emphasize the problem of computing most probable explanations (MPEs) in Bayesian networks (BNs). Algorithmically, we discuss a portfolio-based SLS algorithm for MPE computation, Stochastic Greedy Search (SGS). SGS supports the integration of different initialization operators (or initialization heuristics) and different search operators (greedy and noisy heuristics), thereby enabling new analytical and experimental results. Analytically, we introduce a novel Markov chain model tailored to portfolio-based SLS algorithms including SGS, enabling us to derive analytically expected hitting time results that explain empirical run time results. For a specific BN, we show the benefit of using a homogeneous initialization portfolio. To further illustrate the portfolio approach, we consider novel additive search heuristics for handling determinism in the form of zero entries in conditional probability tables in BNs. Our additive approach adds rather than multiplies probabilities when computing the utility of an explanation. We motivate the additive measure by studying the dramatic impact of zero entries in conditional probability tables on the number of zero-probability explanations, which again complicates the search process. We consider the relationship between MAXSAT and MPE, and show that additive utility (or gain) is a generalization, to the probabilistic setting, of MAXSAT utility (or gain) used in the celebrated GSAT and WalkSAT algorithms and their descendants. Utilizing our Markov chain framework, we show that expected hitting time is a rational function - i.e., a ratio of two polynomials - of the probability of applying an additive search operator. Experimentally, we report on synthetically generated BNs as well as BNs from applications, and compare SGS's performance to that of Hugin, which performs BN inference by compilation to and propagation in clique trees. On synthetic networks, SGS speeds up computation by approximately two orders of magnitude compared to Hugin. In application networks, our approach is highly competitive in Bayesian networks with a high degree of determinism. In addition to showing that stochastic local search can be competitive with clique tree clustering, our empirical results provide an improved understanding of the circumstances under which portfolio-based SLS outperforms clique tree clustering and vice versa.

  16. Approximate Bayesian Computation in the estimation of the parameters of the Forbush decrease model

    NASA Astrophysics Data System (ADS)

    Wawrzynczak, A.; Kopka, P.

    2017-12-01

    Realistic modeling of complicated phenomena such as the Forbush decrease of the galactic cosmic ray intensity is quite a challenging task. One aspect is the numerical solution of the Fokker-Planck equation in five-dimensional space (three spatial variables, time, and particle energy). The second difficulty arises from a lack of detailed knowledge about the spatial and temporal profiles of the parameters responsible for the creation of the Forbush decrease; among these parameters, the diffusion coefficient plays the central role. The correctness of the proposed model can be assessed only by comparing the model output with experimental observations of the galactic cosmic ray intensity. We apply the Approximate Bayesian Computation (ABC) methodology to match the Forbush decrease model to experimental data. The ABC method is becoming increasingly exploited for dynamic complex problems in which the likelihood function is costly to compute. The main idea of all ABC methods is to accept a sample as an approximate posterior draw if its associated modeled data are close enough to the observed data. In this paper, we present an application of the Sequential Monte Carlo Approximate Bayesian Computation algorithm scanning the space of the diffusion coefficient parameters. The proposed algorithm is used to create a model of the Forbush decrease observed by neutron monitors at the Earth in March 2002. The model of the Forbush decrease is based on a stochastic approach to the solution of the Fokker-Planck equation.
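    The accept-if-close idea common to all ABC methods can be shown in a few lines. The sketch below implements plain rejection ABC with a trivial Gaussian toy simulator standing in for the Fokker-Planck-based Forbush decrease model; the sequential Monte Carlo refinement used in the paper adds a schedule of shrinking tolerances and importance weights on top of this core loop.

        # Rejection ABC: accept a prior draw if simulated data fall within
        # tolerance epsilon of the observations (in summary-statistic distance).
        import numpy as np

        rng = np.random.default_rng(2)
        observed = rng.normal(1.5, 1.0, 200)        # pretend these are GCR data

        def simulate(theta, n=200):
            return rng.normal(theta, 1.0, n)        # invented stand-in forward model

        def distance(x, y):
            return abs(x.mean() - y.mean())         # summary-statistic distance

        accepted = []
        while len(accepted) < 1000:
            theta = rng.uniform(-5, 5)              # draw from the prior
            if distance(simulate(theta), observed) < 0.1:   # tolerance epsilon
                accepted.append(theta)

        print("ABC posterior mean:", np.mean(accepted))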

  17. With or without you: predictive coding and Bayesian inference in the brain

    PubMed Central

    Aitchison, Laurence; Lengyel, Máté

    2018-01-01

    Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic/representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084

  18. Bayesian latent structure modeling of walking behavior in a physical activity intervention

    PubMed Central

    Lawson, Andrew B; Ellerbe, Caitlyn; Carroll, Rachel; Alia, Kassandra; Coulon, Sandra; Wilson, Dawn K; VanHorn, M Lee; St George, Sara M

    2017-01-01

    The analysis of walking behavior in a physical activity intervention is considered. A Bayesian latent structure modeling approach is proposed whereby the ability and willingness of participants are modeled via latent effects. The dropout process is jointly modeled via a linked survival model. Computational issues are addressed via posterior sampling, and a simulated evaluation of the longitudinal model's ability to recover latent structure and predictor effects is considered. We evaluate the effect of a variety of socio-psychological and spatial neighborhood predictors on the propensity to walk and the estimation of latent ability and willingness in the full study. PMID:24741000

  19. Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation.

    PubMed

    Cornuet, Jean-Marie; Santos, Filipe; Beaumont, Mark A; Robert, Christian P; Marin, Jean-Michel; Balding, David J; Guillemaud, Thomas; Estoup, Arnaud

    2008-12-01

    Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios and compute bias and precision measures for a given scenario and known values of parameters (the current version applies to unlinked microsatellite data). This article describes key methods used in the program and provides its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC. The software DIY ABC is freely available at http://www.montpellier.inra.fr/CBGP/diyabc.

  20. Accounting for model error in Bayesian solutions to hydrogeophysical inverse problems using a local basis approach

    NASA Astrophysics Data System (ADS)

    Irving, J.; Koepke, C.; Elsheikh, A. H.

    2017-12-01

    Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward process model linking subsurface parameters to measured data, which is typically assumed to be known perfectly in the inversion procedure. However, in order to make the stochastic solution of the inverse problem computationally tractable using, for example, Markov chain Monte Carlo (MCMC) methods, fast approximations of the forward model are commonly employed. This introduces model error into the problem, which has the potential to significantly bias posterior statistics and hamper data integration efforts if not properly accounted for. Here, we present a new methodology for addressing the issue of model error in Bayesian solutions to hydrogeophysical inverse problems that is geared towards the common case where these errors cannot be effectively characterized globally through some parametric statistical distribution or locally based on interpolation between a small number of computed realizations. Rather than focusing on the construction of a global or local error model, we instead work towards identification of the model-error component of the residual through a projection-based approach. In this regard, pairs of approximate and detailed model runs are stored in a dictionary that grows at a specified rate during the MCMC inversion procedure. At each iteration, a local model-error basis is constructed for the current test set of model parameters using the K-nearest-neighbour entries in the dictionary, which is then used to separate the model error from the other error sources before computing the likelihood of the proposed set of model parameters. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar traveltime data for three different subsurface parameterizations of varying complexity. The synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed in the inversion procedure. In each case, the developed model-error approach enables us to remove posterior bias and obtain a more realistic characterization of uncertainty.
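    A minimal sketch of the dictionary/local-basis idea follows: pairs of approximate and detailed runs populate a dictionary, and at each likelihood evaluation the residual is projected off a basis built from the K nearest entries. The two forward models, problem sizes, and the flat prior are all invented for illustration and are far simpler than the eikonal/straight-ray setting of the paper.

        # Local model-error basis inside a Metropolis sampler (toy 1-D problem).
        import numpy as np

        rng = np.random.default_rng(9)
        t = np.linspace(0, 1, 50)

        def detailed(m):                 # invented stand-in for the detailed solver
            return np.sin(2 * np.pi * m * t) + 0.1 * np.sin(6 * np.pi * m * t)

        def approx(m):                   # invented stand-in for the fast approximation
            return np.sin(2 * np.pi * m * t)

        m_true, sigma = 1.3, 0.02
        data = detailed(m_true) + rng.normal(0, sigma, t.size)

        dict_params = rng.uniform(0.5, 2.0, 30)      # dictionary of paired runs
        dict_err = np.array([detailed(m) - approx(m) for m in dict_params])

        def log_like(m, K=5):
            resid = data - approx(m)
            idx = np.argsort(np.abs(dict_params - m))[:K]    # K nearest neighbours
            U, _, _ = np.linalg.svd(dict_err[idx].T, full_matrices=False)
            resid = resid - U @ (U.T @ resid)                # project off model error
            return -0.5 * np.sum(resid ** 2) / sigma ** 2

        m, chain = 1.0, []               # Metropolis with an implicit flat prior
        for _ in range(20_000):
            m_prop = m + rng.normal(0, 0.05)
            if np.log(rng.uniform()) < log_like(m_prop) - log_like(m):
                m = m_prop
            chain.append(m)
        print("posterior mean of m:", np.mean(chain[5000:]))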

  1. Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

    PubMed

    Youngs, Noah; Penfold-Brown, Duncan; Drew, Kevin; Shasha, Dennis; Bonneau, Richard

    2013-05-01

    Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. Code and Data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html

  2. Gaussian process surrogates for failure detection: A Bayesian experimental design approach

    NASA Astrophysics Data System (ADS)

    Wang, Hongqiao; Lin, Guang; Li, Jinglai

    2016-05-01

    An important task of uncertainty quantification is to identify the probability of undesired events, in particular, system failures, caused by various sources of uncertainties. In this work we consider the construction of Gaussian process surrogates for failure detection and failure probability estimation. In particular, we consider the situation in which the underlying computer models are extremely expensive, and in this setting, determining the sampling points in the state space is of essential importance. We formulate the problem as an optimal experimental design for Bayesian inferences of the limit state (i.e., the failure boundary) and propose an efficient numerical scheme to solve the resulting optimization problem. In particular, the proposed limit-state inference method is capable of determining multiple sampling points at a time, and thus it is well suited for problems where multiple computer simulations can be performed in parallel. The accuracy and performance of the proposed method are demonstrated by both academic and practical examples.
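    Once a Gaussian process surrogate is trained on a limited number of expensive runs, the failure probability can be estimated by cheap Monte Carlo on the surrogate. The sketch below uses an invented limit-state function and a simple random design instead of the paper's optimal experimental design criterion.

        # GP surrogate for estimating P(g(x) < 0) with a cheap invented limit state.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(3)

        def g(x):                                # failure where g < 0
            return 2.5 - x[:, 0] ** 2 - x[:, 1]

        X_train = rng.uniform(-2, 2, (40, 2))    # pretend each row is one expensive run
        gp = GaussianProcessRegressor(RBF(length_scale=1.0), normalize_y=True)
        gp.fit(X_train, g(X_train))

        X_mc = rng.uniform(-2, 2, (200_000, 2))  # cheap Monte Carlo on the surrogate
        p_fail = np.mean(gp.predict(X_mc) < 0.0)
        print(f"estimated failure probability: {p_fail:.4f}")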

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pichara, Karim; Protopapas, Pavlos

    We present an automatic classification method for astronomical catalogs with missing data. We use Bayesian networks and a probabilistic graphical model that allows us to perform inference to predict missing values given observed data and dependency relationships between variables. To learn a Bayesian network from incomplete data, we use an iterative algorithm that utilizes sampling methods and expectation maximization to estimate the distributions and probabilistic dependencies of variables from data with missing values. To test our model, we use three catalogs with missing data (SAGE, Two Micron All Sky Survey, and UBVI) and one complete catalog (MACHO). We examine how classification accuracy changes when information from missing data catalogs is included, how our method compares to traditional missing data approaches, and at what computational cost. Integrating these catalogs with missing data, we find that classification of variable objects improves by a few percent and by 15% for quasar detection while keeping the computational cost the same.

  4. Bayesian Treed Multivariate Gaussian Process with Adaptive Design: Application to a Carbon Capture Unit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Konomi, Bledar A.; Karagiannis, Georgios; Sarkar, Avik

    2014-05-16

    Computer experiments (numerical simulations) are widely used in scientific research to study and predict the behavior of complex systems, which usually have responses consisting of a set of distinct outputs. The computational cost of the simulations at high resolution is often prohibitive, making parametric studies at different input values impractical. To overcome these difficulties we develop a Bayesian treed multivariate Gaussian process (BTMGP) as an extension of the Bayesian treed Gaussian process (BTGP) in order to model and evaluate a multivariate process. A suitable choice of covariance function and the prior distributions facilitates the different Markov chain Monte Carlo (MCMC) movements. We utilize this model to sequentially sample the input space for the most informative values, taking into account model uncertainty and expertise gained. A simulation study demonstrates the use of the proposed method and compares it with alternative approaches. We apply the sequential sampling technique and BTMGP to model the multiphase flow in a full scale regenerator of a carbon capture unit. The application presented in this paper is an important tool for research into carbon dioxide emissions from thermal power plants.

  5. A New Bayesian Approach for Estimating the Presence of a Suspected Compound in Routine Screening Analysis.

    PubMed

    Woldegebriel, Michael; Vivó-Truyols, Gabriel

    2016-10-04

    A novel method for compound identification in liquid chromatography-high resolution mass spectrometry (LC-HRMS) is proposed. The method, based on Bayesian statistics, accommodates all sources of uncertainty involved, from instrumentation up to data analysis, in a single model yielding the probability of the compound of interest being present/absent in the sample. This approach differs from classical methods in two ways. First, it is probabilistic (instead of deterministic); hence, it computes the probability that the compound is (or is not) present in a sample. Second, it addresses the hypothesis "the compound is present", as opposed to the question "the compound feature is present". This second difference implies a shift in the way data analysis is tackled, since the probability of interfering compounds (i.e., isomers and isobaric compounds) is also taken into account.

  6. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
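    The test-statistic modeling idea can be sketched with a two-group mixture: each statistic is null with probability pi0 and alternative otherwise, posterior null probabilities follow from Bayes' rule, and a cutoff on their running average controls the Bayesian FDR. The mixture parameters below are fixed rather than estimated, purely for brevity.

        # Two-group mixture on z-statistics with Bayesian FDR control.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(4)
        z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1, 100)])  # fake data

        pi0 = 0.9                                            # assumed null proportion
        f0 = stats.norm(0, 1).pdf(z)                         # null density
        f1 = stats.norm(3, 1).pdf(z)                         # alternative density
        p_null = pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)      # posterior null probability

        order = np.argsort(p_null)                           # most significant first
        running_fdr = np.cumsum(p_null[order]) / np.arange(1, z.size + 1)
        n_reject = int(np.sum(running_fdr <= 0.05))          # Bayesian FDR <= 5%
        print("rejected hypotheses:", n_reject)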

  7. Hypothesis testing on the fractal structure of behavioral sequences: the Bayesian assessment of scaling methodology.

    PubMed

    Moscoso del Prado Martín, Fermín

    2013-12-01

    I introduce the Bayesian assessment of scaling (BAS), a simple but powerful Bayesian hypothesis contrast methodology that can be used to test hypotheses on the scaling regime exhibited by a sequence of behavioral data. Rather than comparing parametric models, as typically done in previous approaches, the BAS offers a direct, nonparametric way to test whether a time series exhibits fractal scaling. The BAS provides a simpler and faster test than do previous methods, and the code for making the required computations is provided. The method also enables testing of finely specified hypotheses on the scaling indices, something that was not possible with the previously available methods. I then present 4 simulation studies showing that the BAS methodology outperforms the other methods used in the psychological literature. I conclude with a discussion of methodological issues on fractal analyses in experimental psychology.

  8. Sparse Bayesian learning for DOA estimation with mutual coupling.

    PubMed

    Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi

    2015-10-16

    Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.

  9. A hierarchical, ontology-driven Bayesian concept for ubiquitous medical environments--a case study for pulmonary diseases.

    PubMed

    Maragoudakis, Manolis; Lymberopoulos, Dimitrios; Fakotakis, Nikos; Spiropoulos, Kostas

    2008-01-01

    The present paper extends work on an existing computer-based Decision Support System (DSS) that aims to assist physicians with pulmonary diseases. The extension allows for a hierarchical decomposition of the task, at different levels of domain granularity, using a novel approach: Hierarchical Bayesian Networks. The proposed framework uses data from various networked appliances, such as mobile phones and wireless medical sensors, to establish a ubiquitous environment for the medical treatment of pulmonary diseases. Domain knowledge is encoded at the upper levels of the hierarchy, thus making the process of generalization easier to accomplish. The experiments were carried out at the Pulmonary Department, University Regional Hospital Patras, Patras, Greece. The results support our initial beliefs about the ability of Bayesian networks to provide an effective, yet semantically oriented, means of prognosis and reasoning under conditions of uncertainty.

  10. Uncertainty Reduction using Bayesian Inference and Sensitivity Analysis: A Sequential Approach to the NASA Langley Uncertainty Quantification Challenge

    NASA Technical Reports Server (NTRS)

    Sankararaman, Shankar

    2016-01-01

    This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis in the presence of aleatory and epistemic uncertainty, and develops a rigorous methodology for efficient refinement of epistemic uncertainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refinement methodology, where epistemic variables for refinement are not identified all at once. Instead, only one variable is first identified, and then Bayesian inference and global sensitivity calculations are repeated to identify the next important variable. This procedure is continued until all four variables are identified and the refinement in the system-level performance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.

  11. Covariate Balance in Bayesian Propensity Score Approaches for Observational Studies

    ERIC Educational Resources Information Center

    Chen, Jianshen; Kaplan, David

    2015-01-01

    Bayesian alternatives to frequentist propensity score approaches have recently been proposed. However, few studies have investigated their covariate balancing properties. This article compares a recently developed two-step Bayesian propensity score approach to the frequentist approach with respect to covariate balance. The effects of different…

  12. Computer-Simulation Surrogates for Optimization: Application to Trapezoidal Ducts and Axisymmetric Bodies

    NASA Technical Reports Server (NTRS)

    Otto, John C.; Paraschivoiu, Marius; Yesilyurt, Serhat; Patera, Anthony T.

    1995-01-01

    Engineering design and optimization efforts using computational systems rapidly become resource intensive. The goal of the surrogate-based approach is to perform a complete optimization with limited resources. In this paper we present a Bayesian-validated approach that informs the designer as to how well the surrogate performs; in particular, our surrogate framework provides precise (albeit probabilistic) bounds on the errors incurred in the surrogate-for-simulation substitution. The theory and algorithms of our computer-simulation surrogate framework are first described. The utility of the framework is then demonstrated through two illustrative examples: maximization of the flowrate of fully developed flow in trapezoidal ducts; and design of an axisymmetric body that achieves a target Stokes drag.

  13. Bayesian analysis of biogeography when the number of areas is large.

    PubMed

    Landis, Michael J; Matzke, Nicholas J; Moore, Brian R; Huelsenbeck, John P

    2013-11-01

    Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a "data-augmentation" approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program BayArea.

  14. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.

    PubMed

    Chatzis, Sotirios P; Andreou, Andreas S

    2015-11-01

    Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.

  15. EEG-fMRI Bayesian framework for neural activity estimation: a simulation study

    NASA Astrophysics Data System (ADS)

    Croce, Pierpaolo; Basti, Alessio; Marzetti, Laura; Zappasodi, Filippo; Del Gratta, Cosimo

    2016-12-01

    Objective. Due to the complementary nature of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), and given the possibility of simultaneous acquisition, joint data analysis can afford a better estimation of the underlying neural activity. In this simulation study we aim to show the benefit of joint EEG-fMRI neural activity estimation in a Bayesian framework. Approach. We built a dynamic Bayesian framework in order to perform joint EEG-fMRI estimation of the neural activity time course. The neural activity originates in a given brain area and is detected by both measurement techniques. We chose a resting-state neural activity situation to address the worst case in terms of the signal-to-noise ratio. To infer information from EEG and fMRI concurrently we used a tool belonging to the sequential Monte Carlo (SMC) methods: the particle filter (PF). Main results. First, despite a high computational cost, we showed the feasibility of such an approach. Second, we obtained an improvement in neural activity reconstruction when using both EEG and fMRI measurements. Significance. The proposed simulation shows the improvements in neural activity reconstruction with simultaneous EEG-fMRI data. The application of such an approach to real data would allow a better comprehension of the neural dynamics.
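    The particle filter used in this framework admits a compact generic statement: propagate particles through the state dynamics, weight them by the observation likelihood, and resample. The Python sketch below applies a bootstrap particle filter to a toy scalar state-space model rather than a joint EEG-fMRI forward model.

        # Bootstrap particle filter on a toy AR(1) state-space model.
        import numpy as np

        rng = np.random.default_rng(5)
        T, N = 100, 2000                       # time steps, particles

        # Simulate a hidden AR(1) state and noisy observations of it.
        x = np.zeros(T)
        for t in range(1, T):
            x[t] = 0.9 * x[t - 1] + rng.normal(0, 0.5)
        y = x + rng.normal(0, 1.0, T)

        particles = rng.normal(0, 1, N)
        estimates = np.zeros(T)
        for t in range(T):
            particles = 0.9 * particles + rng.normal(0, 0.5, N)   # propagate
            w = np.exp(-0.5 * (y[t] - particles) ** 2)            # likelihood weights
            w /= w.sum()
            estimates[t] = np.sum(w * particles)                  # filtered mean
            particles = rng.choice(particles, N, p=w)             # resample

        print("RMSE vs. true state:", np.sqrt(np.mean((estimates - x) ** 2)))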

  16. Nonlinear finite element model updating for damage identification of civil structures using batch Bayesian estimation

    NASA Astrophysics Data System (ADS)

    Ebrahimian, Hamed; Astroza, Rodrigo; Conte, Joel P.; de Callafon, Raymond A.

    2017-02-01

    This paper presents a framework for structural health monitoring (SHM) and damage identification of civil structures. This framework integrates advanced mechanics-based nonlinear finite element (FE) modeling and analysis techniques with a batch Bayesian estimation approach to estimate time-invariant model parameters used in the FE model of the structure of interest. The framework uses input excitation and dynamic response of the structure and updates a nonlinear FE model of the structure to minimize the discrepancies between predicted and measured response time histories. The updated FE model can then be interrogated to detect, localize, classify, and quantify the state of damage and predict the remaining useful life of the structure. As opposed to recursive estimation methods, in the batch Bayesian estimation approach, the entire time history of the input excitation and output response of the structure are used as a batch of data to estimate the FE model parameters through a number of iterations. In the case of a non-informative prior, the batch Bayesian method leads to an extended maximum likelihood (ML) estimation method to jointly estimate time-invariant model parameters and the measurement noise amplitude. The extended ML estimation problem is solved efficiently using a gradient-based interior-point optimization algorithm. Gradient-based optimization algorithms require the FE response sensitivities with respect to the model parameters to be identified. The FE response sensitivities are computed accurately and efficiently using the direct differentiation method (DDM). The estimation uncertainties are evaluated based on the Cramér-Rao lower bound (CRLB) theorem by computing the exact Fisher information matrix using the FE response sensitivities with respect to the model parameters. The accuracy of the proposed uncertainty quantification approach is verified using a sampling approach based on the unscented transformation. Two validation studies, based on realistic structural FE models of a bridge pier and a moment resisting steel frame, are performed to validate the performance and accuracy of the presented nonlinear FE model updating approach and demonstrate its application to SHM. These validation studies show the excellent performance of the proposed framework for SHM and damage identification even in the presence of high measurement noise and/or grossly inaccurate initial estimates of the model parameters. Furthermore, the detrimental effects of the input measurement noise on the performance of the proposed framework are illustrated and quantified through one of the validation studies.

  17. Species trees from consensus single nucleotide polymorphism (SNP) data: Testing phylogenetic approaches with simulated and empirical data.

    PubMed

    Schmidt-Lebuhn, Alexander N; Aitken, Nicola C; Chuah, Aaron

    2017-11-01

    Datasets of hundreds or thousands of SNPs (Single Nucleotide Polymorphisms) from multiple individuals per species are increasingly used to study population structure, species delimitation and shallow phylogenetics. The principal software tool to infer species or population trees from SNP data is currently the BEAST template SNAPP, which uses a Bayesian coalescent analysis. However, it is computationally extremely demanding and tolerates only small amounts of missing data. We used simulated and empirical SNPs from plants (Australian Craspedia, Asteraceae, and Pelargonium, Geraniaceae) to compare species trees produced (1) by SNAPP, (2) using SVD quartets, and (3) using Bayesian and parsimony analysis with several different approaches to summarising data from multiple samples into one set of traits per species. Our aims were to explore the impact of tree topology and missing data on the results, and to test which data summarising and analysis approaches would best approximate the results obtained from SNAPP for empirical data. SVD quartets retrieved the correct topology from simulated data, as did SNAPP except in the case of a very unbalanced phylogeny. Both methods failed to retrieve the correct topology when large amounts of data were missing. Bayesian analysis of species level summary data scoring the two alleles of each SNP as independent characters and parsimony analysis of data scoring each SNP as one character produced trees with branch length distributions closest to the true trees on which SNPs were simulated. For empirical data, Bayesian inference and Dollo parsimony analysis of data scored allele-wise produced phylogenies most congruent with the results of SNAPP. In the case of study groups divergent enough for missing data to be phylogenetically informative (because of additional mutations preventing amplification of genomic fragments or bioinformatic establishment of homology), scoring of SNP data as a presence/absence matrix irrespective of allele content might be an additional option. As this depends on sampling across species being reasonably even and a random distribution of non-informative instances of missing data, however, further exploration of this approach is needed. Properly chosen data summary approaches to inferring species trees from SNP data may represent a potential alternative to currently available individual-level coalescent analyses, especially for quick data exploration and when dealing with computationally demanding or patchy datasets.

  18. Student Modeling in Orthopedic Surgery Training: Exploiting Symbiosis between Temporal Bayesian Networks and Fine-Grained Didactic Analysis

    ERIC Educational Resources Information Center

    Chieu, Vu Minh; Luengo, Vanda; Vadcard, Lucile; Tonetti, Jerome

    2010-01-01

    Cognitive approaches have been used for student modeling in intelligent tutoring systems (ITSs). Many of those systems have tackled fundamental subjects such as mathematics, physics, and computer programming. The change of the student's cognitive behavior over time, however, has not been considered and modeled systematically. Furthermore, the…

  19. Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.

    PubMed

    Ekins, Sean; Madrid, Peter B; Sarker, Malabika; Li, Shao-Gang; Mittal, Nisha; Kumar, Pradeep; Wang, Xin; Stratton, Thomas P; Zimmerman, Matthew; Talcott, Carolyn; Bourbon, Pauline; Travers, Mike; Yadav, Maneesh; Freundlich, Joel S

    2015-01-01

    Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database and the Collaborative Drug Discovery TB database, combined with 3D pharmacophores and dual-event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb, with further filtering by the Bayesian models. We ultimately tested 110 compounds in vitro, which yielded two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, shown by transcriptional profiling to induce mRNA-level perturbations most closely resembling those of known protonophores. One of these, SRI58, exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 × 10⁻⁶ cm/s), kinetic solubility (125 μM at pH 7.4 in PBS), and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice.

  20. Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery

    PubMed Central

    Sarker, Malabika; Li, Shao-Gang; Mittal, Nisha; Kumar, Pradeep; Wang, Xin; Stratton, Thomas P.; Zimmerman, Matthew; Talcott, Carolyn; Bourbon, Pauline; Travers, Mike; Yadav, Maneesh

    2015-01-01

    Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database and the Collaborative Drug Discovery TB database, combined with 3D pharmacophores and dual-event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb, with further filtering by the Bayesian models. We ultimately tested 110 compounds in vitro, which yielded two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, shown by transcriptional profiling to induce mRNA-level perturbations most closely resembling those of known protonophores. One of these, SRI58, exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 × 10⁻⁶ cm/s), kinetic solubility (125 μM at pH 7.4 in PBS), and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice. PMID:26517557

  1. Bayesian seismic tomography by parallel interacting Markov chains

    NASA Astrophysics Data System (ADS)

    Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas

    2014-05-01

    The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret the results quantitatively, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem amounts to estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is tightly linked to the computational cost of the forward model. Although fast algorithms have recently been developed for computing first arrival traveltimes of seismic waves, a complete exploration of the posterior distribution of the velocity model is rarely feasible, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of classical single-chain MCMC, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters without increasing the global computational cost with respect to classical MCMC, and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the high probability zones of the model space while preventing the chains from getting stuck in a probability maximum. This approach thus supplies a robust way to analyze the tomography imaging uncertainties. The interacting MCMC approach is illustrated on two synthetic examples of tomography of calibration shots such as encountered in induced microseismic studies. In the second application, a wavelet-based model parameterization is presented that allows us to significantly reduce the dimension of the problem, thus making the algorithm efficient even for a complex velocity model.
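    The interacting-chains scheme is classical parallel tempering: each chain targets the posterior raised to the power 1/T, and neighbouring chains occasionally propose to swap states with a Metropolis acceptance rule. The sketch below illustrates this on an analytic bimodal density standing in for the traveltime posterior.

        # Parallel tempering on a bimodal 1-D target: hot chains explore,
        # swaps let the cold chain visit both modes.
        import numpy as np

        rng = np.random.default_rng(6)

        def log_post(m):                     # bimodal stand-in for the posterior
            return np.logaddexp(-0.5 * (m - 3) ** 2, -0.5 * (m + 3) ** 2)

        temps = np.array([1.0, 2.0, 4.0, 8.0])        # chain temperatures
        state = rng.normal(0, 1, temps.size)
        cold_samples = []
        for it in range(50_000):
            for k, T in enumerate(temps):             # within-chain MH updates
                prop = state[k] + rng.normal(0, 1.0)
                if np.log(rng.uniform()) < (log_post(prop) - log_post(state[k])) / T:
                    state[k] = prop
            k = rng.integers(temps.size - 1)          # propose swapping neighbours
            a = (1 / temps[k] - 1 / temps[k + 1]) * (log_post(state[k + 1]) - log_post(state[k]))
            if np.log(rng.uniform()) < a:
                state[k], state[k + 1] = state[k + 1], state[k]
            cold_samples.append(state[0])

        frac = np.mean(np.array(cold_samples) > 0)
        print("fraction of cold-chain samples in the positive mode:", round(frac, 3))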

  2. Bayesian Computation Emerges in Generic Cortical Microcircuits through Spike-Timing-Dependent Plasticity

    PubMed Central

    Nessler, Bernhard; Pfeiffer, Michael; Buesing, Lars; Maass, Wolfgang

    2013-01-01

    The principles by which networks of neurons compute, and how spike-timing dependent plasticity (STDP) of synaptic weights generates and maintains their computational function, are unknown. Preceding work has shown that soft winner-take-all (WTA) circuits, where pyramidal neurons inhibit each other via interneurons, are a common motif of cortical microcircuits. We show through theoretical analysis and computer simulations that Bayesian computation is induced in these network motifs through STDP in combination with activity-dependent changes in the excitability of neurons. The fundamental components of this emergent Bayesian computation are priors that result from adaptation of neuronal excitability and implicit generative models for hidden causes that are created in the synaptic weights through STDP. In fact, a surprising result is that STDP is able to approximate a powerful principle for fitting such implicit generative models to high-dimensional spike inputs: Expectation Maximization. Our results suggest that the experimentally observed spontaneous activity and trial-to-trial variability of cortical neurons are essential features of their information processing capability, since their functional role is to represent probability distributions rather than static neural codes. Furthermore it suggests networks of Bayesian computation modules as a new model for distributed information processing in the cortex. PMID:23633941

  3. A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data.

    PubMed

    Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P; Engel, Lawrence S; Kwok, Richard K; Blair, Aaron; Stewart, Patricia A

    2016-01-01

    Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method's performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates along with the full distributions of all the parameters, which could be useful for making decisions in some applications.
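    In the Bayesian framework for censored data, each non-detect simply contributes P(X < LOD) to the likelihood instead of a density value. The sketch below draws from the posterior of a left-censored lognormal model with a random-walk Metropolis sampler; the data, detection limit, and priors are invented.

        # Metropolis sampling for a left-censored lognormal exposure model.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(7)
        true = rng.lognormal(mean=0.5, sigma=0.8, size=60)   # synthetic exposures
        lod = 1.0                                            # invented detection limit
        detects = true[true >= lod]
        n_censored = int(np.sum(true < lod))

        def log_post(mu, sigma):
            if sigma <= 0:
                return -np.inf
            ll = stats.norm.logpdf(np.log(detects), mu, sigma).sum()
            ll += n_censored * stats.norm.logcdf(np.log(lod), mu, sigma)  # non-detects
            return ll + stats.norm.logpdf(mu, 0, 10) + stats.halfnorm.logpdf(sigma, scale=5)

        mu, sigma = 0.0, 1.0
        samples = []
        for _ in range(20_000):
            mu_p, sigma_p = mu + rng.normal(0, 0.1), sigma + rng.normal(0, 0.1)
            if np.log(rng.uniform()) < log_post(mu_p, sigma_p) - log_post(mu, sigma):
                mu, sigma = mu_p, sigma_p
            samples.append((mu, sigma))

        mu_s, sigma_s = np.array(samples[5000:]).T
        am = np.exp(mu_s + 0.5 * sigma_s ** 2)     # arithmetic mean of exposure
        print("posterior median AM:", np.median(am))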

  4. Applying a Bayesian Approach to Identification of Orthotropic Elastic Constants from Full Field Displacement Measurements

    NASA Astrophysics Data System (ADS)

    Gogu, C.; Yin, W.; Haftka, R.; Ifju, P.; Molimard, J.; Le Riche, R.; Vautrin, A.

    2010-06-01

    A major challenge in the identification of material properties is handling different sources of uncertainty in the experiment and the modelling of the experiment for estimating the resulting uncertainty in the identified properties. Numerous improvements in identification methods have provided increasingly accurate estimates of various material properties. However, characterizing the uncertainty in the identified properties is still relatively crude. Different material properties obtained from a single test are not obtained with the same confidence. Typically, the highest uncertainty is associated with the properties to which the experiment is least sensitive. In addition, the uncertainty in different properties can be strongly correlated, so that obtaining only variance estimates may be misleading. A possible approach for handling the different sources of uncertainty and estimating the uncertainty in the identified properties is the Bayesian method. This method was introduced in the late 1970s in the context of identification [1] and has since been applied to different problems, notably the identification of elastic constants from plate vibration experiments [2]-[4]. The applications of the method to these classical pointwise tests involved only a small number of measurements (typically ten natural frequencies in the previously cited vibration test), which facilitated the application of the Bayesian approach. For identifying elastic constants, full field strain or displacement measurements provide a high number of measured quantities (one measurement per image pixel) and hence a promise of smaller uncertainties in the properties. However, the high number of measurements also represents a major computational challenge in applying the Bayesian approach to full field measurements. To address this challenge we propose an approach based on the proper orthogonal decomposition (POD) of the full fields in order to drastically reduce their dimensionality. POD is based on projecting the full field images onto a modal basis, constructed from sample simulations, which can account for the variations of the full field as the elastic constants and other parameters of interest are varied. The fidelity of the decomposition depends on the number of basis vectors used. Typically, even complex fields can be accurately represented with no more than a few dozen modes, and for our problem we showed that only four or five modes are sufficient [5]. To further reduce the computational cost of the Bayesian approach we use response surface approximations of the POD coefficients of the fields. We show that 3rd degree polynomial response surface approximations provide satisfactory accuracy. The combination of POD decomposition and response surface methodology brings the computational time of the Bayesian identification down to a few days. The proposed approach is applied to Moiré interferometry full field displacement measurements from a traction experiment on a plate with a hole. The laminate, with a layup of [45,-45,0]s, is made of Toray® T800/3631 graphite/epoxy prepreg. The measured displacement maps are provided in Figure 1. The mean values of the identified properties' joint probability density function are in agreement with previous identifications carried out on the same material. Furthermore, the probability density function also provides the coefficient of variation with which each property is identified, as well as the correlations between the various properties.
We find that while the longitudinal Young’s modulus is identified with good accuracy (low standard deviation), Poisson’s ratio is identified with much higher uncertainty. Several of the properties are also found to be correlated. The identified uncertainty structure of the elastic constants (i.e. the variance-covariance matrix) has potential benefits for reliability analyses, by allowing a more accurate description of the input uncertainty. An additional advantage of the Bayesian approach is that it provides a natural way (in the form of the prior probability density function) of accounting for prior information that may be available on the material properties. This is of great interest for reducing the uncertainty on properties that can only be determined with low confidence from the plate-with-a-hole experiment, such as Poisson’s ratio or the transverse Young’s modulus in our case.
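
    A minimal sketch of the POD-plus-response-surface idea, with a stand-in analytic "field" in place of a finite element model: sample simulations are compressed by an SVD to a few POD modes, and cubic polynomials map the parameters to the POD coefficients, yielding a cheap forward model that a Bayesian sampler can call millions of times. The field function, parameter ranges, and mode count are all illustrative.

      import numpy as np

      rng = np.random.default_rng(2)

      # Stand-in "full-field" forward model: a field on a 50x50 grid as a
      # nonlinear function of two material-like parameters (illustrative only).
      xx, yy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
      def forward(e1, e2):
          return (np.sin(np.pi * xx) / e1 + e2 * xx * yy**2).ravel()

      # Sample simulations spanning the parameter ranges of interest.
      thetas = rng.uniform([1.0, 0.1], [3.0, 1.0], size=(200, 2))
      fields = np.array([forward(*t) for t in thetas])        # (200, 2500)

      # POD: a few singular vectors capture the variations of the field.
      mean = fields.mean(0)
      U, S, Vt = np.linalg.svd(fields - mean, full_matrices=False)
      basis = Vt[:4]                                          # 4 POD modes
      coeffs = (fields - mean) @ basis.T                      # (200, 4)

      # Cubic polynomial response surface for each POD coefficient.
      def design(t):
          e1, e2 = t[:, 0], t[:, 1]
          return np.stack([np.ones_like(e1), e1, e2, e1 * e2, e1**2, e2**2,
                           e1**3, e2**3, e1**2 * e2, e1 * e2**2], axis=1)
      beta, *_ = np.linalg.lstsq(design(thetas), coeffs, rcond=None)

      # Cheap surrogate: parameters -> reconstructed field.
      def surrogate(t):
          return mean + design(t[None]) @ beta @ basis

      err = np.abs(surrogate(np.array([2.0, 0.5])) - forward(2.0, 0.5)).max()
      print(f"max surrogate error at a test point: {err:.2e}")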

  5. Efficient inference for genetic association studies with multiple outcomes.

    PubMed

    Ruffieux, Helene; Davison, Anthony C; Hager, Jorg; Irincheeva, Irina

    2017-10-01

    Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single clinical outcome on many genetic variants one by one, but there is an increasing demand for joint analysis of many molecular outcomes and genetic variants in order to unravel functional interactions. Unfortunately, most existing approaches to joint modeling are either too simplistic to be powerful or are impracticable for computational reasons. Inspired by Richardson and others (2010, Bayesian Statistics 9), we consider a sparse multivariate regression model that allows simultaneous selection of predictors and associated responses. As Markov chain Monte Carlo (MCMC) inference on such models can be prohibitively slow when the number of genetic variants exceeds a few thousand, we propose a variational inference approach which produces posterior information very close to that of MCMC inference, at a much reduced computational cost. Extensive numerical experiments show that our approach outperforms popular variable selection methods and tailored Bayesian procedures, dealing within hours with problems involving hundreds of thousands of genetic variants and tens to hundreds of clinical or molecular outcomes. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  6. Fitting models of continuous trait evolution to incompletely sampled comparative data using approximate Bayesian computation.

    PubMed

    Slater, Graham J; Harmon, Luke J; Wegmann, Daniel; Joyce, Paul; Revell, Liam J; Alfaro, Michael E

    2012-03-01

    In recent years, a suite of methods has been developed to fit multiple rate models to phylogenetic comparative data. However, most methods have limited utility at broad phylogenetic scales because they typically require complete sampling of both the tree and the associated phenotypic data. Here, we develop and implement a new, tree-based method called MECCA (Modeling Evolution of Continuous Characters using ABC) that uses a hybrid likelihood/approximate Bayesian computation (ABC)-Markov-Chain Monte Carlo approach to simultaneously infer rates of diversification and trait evolution from incompletely sampled phylogenies and trait data. We demonstrate via simulation that MECCA has considerable power to choose among single versus multiple evolutionary rate models, and thus can be used to test hypotheses about changes in the rate of trait evolution across an incomplete tree of life. We finally apply MECCA to an empirical example of body size evolution in carnivores, and show that there is no evidence for an elevated rate of body size evolution in the pinnipeds relative to terrestrial carnivores. ABC approaches can provide a useful alternative set of tools for future macroevolutionary studies where likelihood-dependent approaches are lacking. © 2011 The Author(s). Evolution© 2011 The Society for the Study of Evolution.

  7. Linking big models to big data: efficient ecosystem model calibration through Bayesian model emulation

    NASA Astrophysics Data System (ADS)

    Fer, I.; Kelly, R.; Andrews, T.; Dietze, M.; Richardson, A. D.

    2016-12-01

    Our ability to forecast ecosystems is limited by how well we parameterize ecosystem models. Direct measurements for all model parameters are not always possible and inverse estimation of these parameters through Bayesian methods is computationally costly. A solution to the computational challenges of Bayesian calibration is to approximate the posterior probability surface using a Gaussian process that emulates the complex process-based model. Here we report the integration of this method within an ecoinformatics toolbox, the Predictive Ecosystem Analyzer (PEcAn), and its application with two ecosystem models: SIPNET and ED2.1. SIPNET is a simple model, allowing application of MCMC methods both to the model itself and to its emulator. We used both approaches to assimilate flux (CO2 and latent heat), soil respiration, and soil carbon data from Bartlett Experimental Forest. This comparison showed that the emulator is reliable in terms of convergence to the posterior distribution. A 10000-iteration MCMC analysis with SIPNET itself required more than two orders of magnitude greater computation time than an MCMC run of the same length with its emulator. This difference would be greater for a more computationally demanding model. Validation of the emulator-calibrated SIPNET against both the assimilated data and out-of-sample data showed improved fit and reduced uncertainty around model predictions. We next applied the validated emulator method to ED2, whose complexity precludes standard Bayesian data assimilation. We used the ED2 emulator to assimilate demographic data from a network of inventory plots. For validation of the calibrated ED2, we compared the model to results from Empirical Succession Mapping (ESM), a novel synthesis of successional patterns in Forest Inventory and Analysis data. Our results revealed that while the pre-assimilation ED2 formulation cannot capture the emergent demographic patterns from the ESM analysis, constraining the model parameters that control demographic processes increased the agreement considerably.
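
    The emulator workflow, in deliberately reduced form (a cheap stand-in log-posterior replaces the ecosystem model, and the Gaussian process is fit once rather than refined as in the PEcAn implementation):

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, ConstantKernel

      rng = np.random.default_rng(3)

      # Stand-in for an expensive model: a correlated 2D Gaussian log-posterior.
      def log_post(t):
          return -0.5 * (t[0] ** 2 + (t[1] - t[0]) ** 2 / 0.5)

      # Offline phase: a modest design of "expensive" evaluations.
      X = rng.uniform(-3, 3, size=(80, 2))
      z = np.array([log_post(t) for t in X])
      gp = GaussianProcessRegressor(ConstantKernel() * RBF([1.0, 1.0]),
                                    normalize_y=True).fit(X, z)

      # Online phase: Metropolis sampling on the cheap emulator, not the model.
      theta, cur = np.zeros(2), gp.predict(np.zeros((1, 2)))[0]
      chain = []
      for _ in range(10000):
          prop = theta + rng.normal(0, 0.5, 2)
          val = gp.predict(prop[None])[0]
          if np.log(rng.random()) < val - cur:
              theta, cur = prop, val
          chain.append(theta)
      chain = np.array(chain[2000:])
      print("posterior mean", chain.mean(0), "sd", chain.std(0))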

  8. Prediction and assimilation of surf-zone processes using a Bayesian network: Part I: Forward models

    USGS Publications Warehouse

    Plant, Nathaniel G.; Holland, K. Todd

    2011-01-01

    Prediction of coastal processes, including waves, currents, and sediment transport, can be obtained from a variety of detailed geophysical-process models with many simulations showing significant skill. This capability supports a wide range of research and applied efforts that can benefit from accurate numerical predictions. However, the predictions are only as accurate as the data used to drive the models and, given the large temporal and spatial variability of the surf zone, inaccuracies in data are unavoidable such that useful predictions require corresponding estimates of uncertainty. We demonstrate how a Bayesian-network model can be used to provide accurate predictions of wave-height evolution in the surf zone given very sparse and/or inaccurate boundary-condition data. The approach is based on a formal treatment of a data-assimilation problem that takes advantage of significant reduction of the dimensionality of the model system. We demonstrate that predictions of a detailed geophysical model of the wave evolution are reproduced accurately using a Bayesian approach. In this surf-zone application, forward prediction skill was 83%, and uncertainties in the model inputs were accurately transferred to uncertainty in output variables. We also demonstrate that if modeling uncertainties were not conveyed to the Bayesian network (i.e., perfect data or model were assumed), then overly optimistic prediction uncertainties were computed. More consistent predictions and uncertainties were obtained by including model-parameter errors as a source of input uncertainty. Improved predictions (skill of 90%) were achieved because the Bayesian network simultaneously estimated optimal parameters while predicting wave heights.
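
    A toy two-node illustration of the mechanism (hypothetical bins and probabilities, far smaller than the paper's network): an uncertain boundary condition enters as a distribution over input states, and marginalizing it through the conditional probability table produces a prediction with honest uncertainty, whereas assuming a perfect input yields an overconfident one.

      import numpy as np

      # Toy two-node network: offshore wave height -> surf-zone wave height,
      # each discretized into three bins (low, medium, high). Rows of the CPT:
      # P(H_surf | H_off = bin). All numbers are invented for illustration.
      cpt = np.array([[0.80, 0.15, 0.05],
                      [0.20, 0.60, 0.20],
                      [0.05, 0.25, 0.70]])

      # A sparse/inaccurate boundary condition enters as a distribution over
      # input bins rather than as a single known value.
      p_off = np.array([0.1, 0.7, 0.2])

      # Predictive distribution: marginalize input uncertainty through the CPT.
      print("P(H_surf) with uncertain input:", p_off @ cpt)

      # Assuming a perfect input ignores that uncertainty (overconfident).
      print("P(H_surf | H_off = medium):   ", cpt[1])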

  9. Continuous-time discrete-space models for animal movement

    USGS Publications Warehouse

    Hanks, Ephraim M.; Hooten, Mevin B.; Alldredge, Mat W.

    2015-01-01

    The processes influencing animal movement and resource selection are complex and varied. Past efforts to model behavioral changes over time used Bayesian statistical models with variable parameter space, such as reversible-jump Markov chain Monte Carlo approaches, which are computationally demanding and inaccessible to many practitioners. We present a continuous-time discrete-space (CTDS) model of animal movement that can be fit using standard generalized linear modeling (GLM) methods. This CTDS approach allows for the joint modeling of location-based as well as directional drivers of movement. Changing behavior over time is modeled using a varying-coefficient framework which maintains the computational simplicity of a GLM approach, and variable selection is accomplished using a group lasso penalty. We apply our approach to a study of two mountain lions (Puma concolor) in Colorado, USA.

  10. A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study.

    PubMed

    Kaplan, David; Chen, Jianshen

    2012-07-01

    A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for three methods of implementation: propensity score stratification, weighting, and optimal full matching. Three simulation studies and one case study are presented to elaborate the proposed two-step Bayesian propensity score approach. Results of the simulation studies reveal that greater precision in the propensity score equation yields better recovery of the frequentist-based treatment effect. A slight advantage is shown for the Bayesian approach in small samples. Results also reveal that greater precision around the wrong treatment effect can lead to seriously distorted results. However, greater precision around the correct treatment effect parameter yields quite good results, with slight improvement seen with greater precision in the propensity score equation. A comparison of coverage rates for the conventional frequentist approach and proposed Bayesian approach is also provided. The case study reveals that credible intervals are wider than frequentist confidence intervals when priors are non-informative.

  11. Construction of monitoring model and algorithm design on passenger security during shipping based on improved Bayesian network.

    PubMed

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    A large amount of data is needed to compute an objective Bayesian network, but such data are hard to obtain in practice. In this paper, the calculation method for Bayesian networks was improved to obtain a fuzzy-precise Bayesian network, which can be used for Bayesian network inference when data are limited. The security of passengers during shipping is affected by various factors and is hard to predict and control. An index system for the factors affecting passenger security during shipping was established on the basis of multifield coupling theory, and the fuzzy-precise Bayesian network was applied to monitor passenger security throughout the shipping process. The model was applied to monitoring passenger security during the shipping operations of a shipping company in Hainan, and its effectiveness was examined. This research work provides guidance for guaranteeing the security of passengers during shipping.

  12. Construction of Monitoring Model and Algorithm Design on Passenger Security during Shipping Based on Improved Bayesian Network

    PubMed Central

    Wang, Jiali; Zhang, Qingnian; Ji, Wenfeng

    2014-01-01

    A large amount of data is needed to compute an objective Bayesian network, but such data are hard to obtain in practice. In this paper, the calculation method for Bayesian networks was improved to obtain a fuzzy-precise Bayesian network, which can be used for Bayesian network inference when data are limited. The security of passengers during shipping is affected by various factors and is hard to predict and control. An index system for the factors affecting passenger security during shipping was established on the basis of multifield coupling theory, and the fuzzy-precise Bayesian network was applied to monitor passenger security throughout the shipping process. The model was applied to monitoring passenger security during the shipping operations of a shipping company in Hainan, and its effectiveness was examined. This research work provides guidance for guaranteeing the security of passengers during shipping. PMID:25254227

  13. Advances in Bayesian Modeling in Educational Research

    ERIC Educational Resources Information Center

    Levy, Roy

    2016-01-01

    In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…

  14. Time-reversal and Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Debski, Wojciech

    2017-04-01

    The probabilistic inversion technique is superior to the classical optimization-based approach in all but one aspect: it requires exhaustive computations that prohibit its use in very large inverse problems such as global seismic tomography or waveform inversion, to name a few. The advantages of the approach are, however, so appealing that there is an ongoing effort to make such large inverse tasks manageable with the probabilistic inverse approach. One promising possibility for achieving this goal relies on exploiting the internal symmetries of the seismological modeling problems at hand, namely time-reversal and reciprocity invariance. These two basic properties of the elastic wave equation, when incorporated into the probabilistic inversion schema, open new horizons for Bayesian inversion. In this presentation we discuss the time-reversal symmetry property and its mathematical aspects, and propose how to combine it with probabilistic inverse theory into a compact, fast inversion algorithm. We illustrate the proposed idea with the newly developed location algorithm TRMLOC and discuss its efficiency when applied to mining-induced seismic data.

  15. Inverse problems and computational cell metabolic models: a statistical approach

    NASA Astrophysics Data System (ADS)

    Calvetti, D.; Somersalo, E.

    2008-07-01

    In this article, we give an overview of the Bayesian modelling of metabolic systems at the cellular and subcellular level. The models are based on detailed description of key biochemical reactions occurring in tissue, which may in turn be compartmentalized into cytosol and mitochondria, and of transports between the compartments. The classical deterministic approach which models metabolic systems as dynamical systems with Michaelis-Menten kinetics, is replaced by a stochastic extension where the model parameters are interpreted as random variables with an appropriate probability density. The inverse problem of cell metabolism in this setting consists of estimating the density of the model parameters. After discussing some possible approaches to solving the problem, we address the issue of how to assess the reliability of the predictions of a stochastic model by proposing an output analysis in terms of model uncertainties. Visualization modalities for organizing the large amount of information provided by the Bayesian dynamic sensitivity analysis are also illustrated.

  16. Open-loop-feedback control of serum drug concentrations: pharmacokinetic approaches to drug therapy.

    PubMed

    Jelliffe, R W

    1983-01-01

    Recent developments to optimize open-loop-feedback control of drug dosage regimens, generally applicable to pharmacokinetically oriented therapy with many drugs, involve computation of patient-individualized strategies for obtaining desired serum drug concentrations. Analyses of past therapy are performed by least squares, extended least squares, and maximum a posteriori probability Bayesian methods of fitting pharmacokinetic models to serum level data. Future possibilities for truly optimal open-loop-feedback therapy with full Bayesian methods, and conceivably for optimal closed-loop therapy in such data-poor clinical situations, are also discussed. Implementation of these various therapeutic strategies, using automated, locally controlled infusion devices, has also been achieved in prototype form.
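
    A minimal sketch of the maximum a posteriori (MAP) Bayesian fitting step described above: a one-compartment model with log-normal population priors, fit to sparse serum levels by penalized least squares. The doses, priors, and measurements are invented for illustration and are not from the article.

      import numpy as np
      from scipy.optimize import minimize

      # One-compartment IV bolus model: C(t) = (dose / V) * exp(-k * t).
      dose = 500.0                              # mg (illustrative)
      t_obs = np.array([1.0, 3.0, 8.0])         # sparse serum samples (h)
      c_obs = np.array([9.0, 6.5, 2.8])         # measured levels (mg/L)

      # Population priors (log-normal; values invented for this sketch).
      mu_V, sd_V = np.log(40.0), 0.3            # volume of distribution (L)
      mu_k, sd_k = np.log(0.15), 0.4            # elimination rate (1/h)
      sd_assay = 0.5                            # assay noise SD (mg/L)

      def neg_log_post(theta):
          logV, logk = theta
          pred = dose / np.exp(logV) * np.exp(-np.exp(logk) * t_obs)
          # Gaussian measurement error plus log-normal priors -> MAP fit.
          nll = 0.5 * np.sum((c_obs - pred) ** 2) / sd_assay**2
          nlp = 0.5 * ((logV - mu_V) / sd_V) ** 2 \
              + 0.5 * ((logk - mu_k) / sd_k) ** 2
          return nll + nlp

      fit = minimize(neg_log_post, x0=[mu_V, mu_k])
      V_hat, k_hat = np.exp(fit.x)

      # Individualized regimen: dose needed to reach a 10 mg/L peak next time.
      print(f"V={V_hat:.1f} L, k={k_hat:.3f} /h, "
            f"dose for a 10 mg/L peak: {10 * V_hat:.0f} mg")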

  17. A Mixtures-of-Trees Framework for Multi-Label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods. PMID:25927011

  18. Reconstructing constructivism: causal models, Bayesian learning mechanisms, and the theory theory.

    PubMed

    Gopnik, Alison; Wellman, Henry M

    2012-11-01

    We propose a new version of the "theory theory" grounded in the computational framework of probabilistic causal models and Bayesian learning. Probabilistic models allow a constructivist but rigorous and detailed approach to cognitive development. They also explain the learning of both more specific causal hypotheses and more abstract framework theories. We outline the new theoretical ideas, explain the computational framework in an intuitive and nontechnical way, and review an extensive but relatively recent body of empirical results that supports these ideas. These include new studies of the mechanisms of learning. Children infer causal structure from statistical information, through their own actions on the world and through observations of the actions of others. Studies demonstrate these learning mechanisms in children from 16 months to 4 years old and include research on causal statistical learning, informal experimentation through play, and imitation and informal pedagogy. They also include studies of the variability and progressive character of intuitive theory change, particularly theory of mind. These studies investigate both the physical and the psychological and social domains. We conclude with suggestions for further collaborative projects between developmental and computational cognitive scientists.

  19. On selecting a prior for the precision parameter of Dirichlet process mixture models

    USGS Publications Warehouse

    Dorazio, R.M.

    2009-01-01

    In hierarchical mixture models the Dirichlet process is used to specify latent patterns of heterogeneity, particularly when the distribution of latent parameters is thought to be clustered (multimodal). The parameters of a Dirichlet process include a precision parameter α and a base probability measure G0. In problems where α is unknown and must be estimated, inferences about the level of clustering can be sensitive to the choice of prior assumed for α. In this paper an approach is developed for computing a prior for the precision parameter α that can be used in the presence or absence of prior information about the level of clustering. This approach is illustrated in an analysis of counts of stream fishes. The results of this fully Bayesian analysis are compared with an empirical Bayes analysis of the same data and with a Bayesian analysis based on an alternative commonly used prior.
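
    A generic way to reason about this choice (a sketch, not the paper's method) uses the fact that under a Dirichlet process the expected number of clusters in a sample of size n is E[K] = sum over i = 1..n of α/(α + i - 1); one can solve for the α that encodes a prior guess about clustering and then check the induced distribution of K by simulating the Chinese restaurant process:

      import numpy as np
      from scipy.optimize import brentq

      rng = np.random.default_rng(4)
      n = 50  # sample size

      # Expected number of clusters under a DP with precision alpha.
      def expected_k(alpha):
          i = np.arange(1, n + 1)
          return np.sum(alpha / (alpha + i - 1))

      # Solve for the alpha that encodes a prior guess of ~5 clusters.
      alpha = brentq(lambda a: expected_k(a) - 5.0, 1e-4, 100.0)
      print(f"alpha = {alpha:.3f}")

      # Induced prior on K: simulate the Chinese restaurant process.
      def crp_k(alpha):
          k = 0
          for i in range(n):
              if rng.random() < alpha / (alpha + i):  # start a new cluster
                  k += 1
          return k

      ks = [crp_k(alpha) for _ in range(10000)]
      print("P(K = 0..11):", np.bincount(ks, minlength=12)[:12] / 10000)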

  20. Almost but not quite 2D, Non-linear Bayesian Inversion of CSEM Data

    NASA Astrophysics Data System (ADS)

    Ray, A.; Key, K.; Bodin, T.

    2013-12-01

    The geophysical inverse problem can be elegantly stated in a Bayesian framework where a probability distribution can be viewed as a statement of information regarding a random variable. After all, the goal of geophysical inversion is to provide information on the random variables of interest - physical properties of the earth's subsurface. However, though it may be simple to postulate, a practical difficulty of fully non-linear Bayesian inversion is the computer time required to adequately sample the model space and extract the information we seek. As a consequence, in geophysical problems where evaluation of a full 2D/3D forward model is computationally expensive, such as marine controlled source electromagnetic (CSEM) mapping of the resistivity of seafloor oil and gas reservoirs, Bayesian studies have largely been conducted with 1D forward models. While the 1D approximation is indeed appropriate for exploration targets with planar geometry and geological stratification, it only provides a limited, site-specific idea of uncertainty in resistivity with depth. In this work, we extend our fully non-linear 1D Bayesian inversion to a 2D model framework, without requiring the usual regularization of model resistivities in the horizontal or vertical directions used to stabilize quasi-2D inversions. In our approach, we use the reversible jump Markov-chain Monte-Carlo (RJ-MCMC) or trans-dimensional method and parameterize the subsurface in a 2D plane with Voronoi cells. The method is trans-dimensional in that the number of cells required to parameterize the subsurface is variable, and the cells dynamically move around and multiply or combine as demanded by the data being inverted. This approach allows us to expand our uncertainty analysis of resistivity at depth to more than a single site location, allowing for interactions between model resistivities at different horizontal locations along a traverse over an exploration target. While the model is parameterized in 2D, we efficiently evaluate the forward response using 1D profiles extracted from the model at the common-midpoints of the EM source-receiver pairs. Since the 1D approximation is locally valid at different midpoint locations, the computation time is far lower than is required by a full 2D or 3D simulation. We have applied this method to both synthetic and real CSEM survey data from the Scarborough gas field on the Northwest shelf of Australia, resulting in a spatially variable quantification of resistivity and its uncertainty in 2D. This Bayesian approach results in a large database of 2D models that comprise a posterior probability distribution, which we can subset to test various hypotheses about the range of model structures compatible with the data. For example, we can subset the model distributions to examine the hypothesis that a resistive reservoir extends over a certain spatial extent. Depending on how this conditions other parts of the model space, light can be shed on the geological viability of the hypothesis. Since tackling spatially variable uncertainty and trade-offs in 2D and 3D is a challenging research problem, the insights gained from this work may prove valuable for subsequent full 2D and 3D Bayesian inversions.

  1. Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation

    PubMed Central

    Cornuet, Jean-Marie; Santos, Filipe; Beaumont, Mark A.; Robert, Christian P.; Marin, Jean-Michel; Balding, David J.; Guillemaud, Thomas; Estoup, Arnaud

    2008-01-01

    Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios and compute bias and precision measures for a given scenario and known values of parameters (the current version applies to unlinked microsatellite data). This article describes key methods used in the program and provides its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC. Availability: The software DIY ABC is freely available at http://www.montpellier.inra.fr/CBGP/diyabc. Contact: j.cornuet@imperial.ac.uk Supplementary information: Supplementary data are also available at http://www.montpellier.inra.fr/CBGP/diyabc PMID:18842597
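
    The rejection-ABC loop at the core of such programs fits in a few lines. The toy below (a normal model with two summary statistics, nothing like DIY ABC's population-genetic simulations) shows the pattern: draw parameters from the prior, simulate data, and keep the draws whose summaries land within a tolerance of the observed ones.

      import numpy as np

      rng = np.random.default_rng(5)

      # "Observed" data, summarized by two statistics (stand-ins for the
      # summary statistics a program like DIY ABC computes from genetic data).
      obs = rng.normal(2.0, 1.0, 100)
      s_obs = np.array([obs.mean(), obs.var()])

      # Rejection ABC: draw from the prior, simulate, keep what lands close.
      n_sims, eps = 50000, 0.1
      theta = rng.uniform(-5, 5, n_sims)                    # prior draws
      sims = rng.normal(theta[:, None], 1.0, (n_sims, 100))
      s_sim = np.stack([sims.mean(1), sims.var(1)], axis=1)
      dist = np.linalg.norm(s_sim - s_obs, axis=1)
      posterior = theta[dist < eps]

      print(f"accepted {posterior.size} of {n_sims}; "
            f"posterior mean {posterior.mean():.2f} +/- {posterior.std():.2f}")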

  2. Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.

    PubMed

    Verde, Pablo E

    2010-12-30

    In the last decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and they can be regarded as so far removed from the classical domain of meta-analysis, that they can provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach allows, also, to perform substantial model checking, model diagnostic and model selection. Statistical computations are implemented in the public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.

  3. ABrox-A user-friendly Python module for approximate Bayesian computation with a focus on model comparison.

    PubMed

    Mertens, Ulf Kai; Voss, Andreas; Radev, Stefan

    2018-01-01

    We give an overview of the basic principles of approximate Bayesian computation (ABC), a class of stochastic methods that enable flexible and likelihood-free model comparison and parameter estimation. Our new open-source software called ABrox is used to illustrate ABC for model comparison on two prominent statistical tests, the two-sample t-test and the Levene test. We further highlight the flexibility of ABC compared to classical Bayesian hypothesis testing by computing an approximate Bayes factor for two multinomial processing tree models. Last but not least, throughout the paper, we introduce ABrox using the accompanying graphical user interface.

  4. User-customized brain computer interfaces using Bayesian optimization

    NASA Astrophysics Data System (ADS)

    Bashashati, Hossein; Ward, Rabab K.; Bashashati, Ali

    2016-04-01

    Objective. The brain characteristics of different people are not the same. Brain computer interfaces (BCIs) should thus be customized for each individual person. In motor-imagery based synchronous BCIs, a number of parameters (referred to as hyper-parameters) including the EEG frequency bands, the channels and the time intervals from which the features are extracted should be pre-determined based on each subject’s brain characteristics. Approach. To determine the hyper-parameter values, previous work has relied on manual or semi-automatic methods that are not applicable to high-dimensional search spaces. In this paper, we propose a fully automatic, scalable and computationally inexpensive algorithm that uses Bayesian optimization to tune these hyper-parameters. We then build different classifiers trained on the sets of hyper-parameter values proposed by the Bayesian optimization. A final classifier aggregates the results of the different classifiers. Main Results. We have applied our method to 21 subjects from three BCI competition datasets. We have conducted rigorous statistical tests, and have shown the positive impact of hyper-parameter optimization in improving the accuracy of BCIs. Furthermore, we have compared our results to those reported in the literature. Significance. Unlike the best reported results in the literature, which are based on more sophisticated feature extraction and classification methods, and rely on prestudies to determine the hyper-parameter values, our method has the advantage of being fully automated, uses less sophisticated feature extraction and classification methods, and yields similar or superior results compared to the best performing designs in the literature.
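
    A minimal sketch of the inner loop of such hyper-parameter tuning: a Gaussian process models a noisy objective (here an invented stand-in for BCI classification accuracy), and an expected-improvement criterion picks the next hyper-parameter setting to evaluate.

      import numpy as np
      from scipy.stats import norm
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import Matern

      rng = np.random.default_rng(6)

      # Stand-in for classification accuracy as a function of two
      # hyper-parameters (e.g., a band edge and a window length), maximized.
      def accuracy(h):
          return -((h[0] - 0.3) ** 2 + (h[1] - 0.7) ** 2) + 0.01 * rng.normal()

      bounds = np.array([[0.0, 1.0], [0.0, 1.0]])
      H = rng.uniform(bounds[:, 0], bounds[:, 1], (5, 2))   # initial design
      y = np.array([accuracy(h) for h in H])

      for _ in range(25):
          gp = GaussianProcessRegressor(Matern(nu=2.5),
                                        normalize_y=True).fit(H, y)
          # Expected improvement over the incumbent on random candidates.
          cand = rng.uniform(bounds[:, 0], bounds[:, 1], (2000, 2))
          mu, sd = gp.predict(cand, return_std=True)
          imp = mu - y.max()
          z = imp / np.maximum(sd, 1e-9)
          ei = imp * norm.cdf(z) + sd * norm.pdf(z)
          h_next = cand[ei.argmax()]
          H = np.vstack([H, h_next])
          y = np.append(y, accuracy(h_next))

      print("best hyper-parameters:", H[y.argmax()], "score:", y.max())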

  5. Calibrated birth-death phylogenetic time-tree priors for bayesian inference.

    PubMed

    Heled, Joseph; Drummond, Alexei J

    2015-05-01

    Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  6. Approximate Bayesian computation for forward modeling in cosmology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Akeret, Joël; Refregier, Alexandre; Amara, Adam

    Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurement processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.

  7. Simulation-based estimation of mean and standard deviation for meta-analysis via Approximate Bayesian Computation (ABC).

    PubMed

    Kwon, Deukwoo; Reis, Isildinha M

    2015-08-12

    When conducting a meta-analysis of a continuous outcome, estimated means and standard deviations from the selected studies are required in order to obtain an overall estimate of the mean effect and its confidence interval. If these quantities are not directly reported in the publications, they must be estimated from other reported summary statistics, such as the median, the minimum, the maximum, and quartiles. We propose a simulation-based estimation approach using the Approximate Bayesian Computation (ABC) technique for estimating mean and standard deviation based on various sets of summary statistics found in published studies. We conduct a simulation study to compare the proposed ABC method with the existing methods of Hozo et al. (2005), Bland (2015), and Wan et al. (2014). In the estimation of the standard deviation, our ABC method performs better than the other methods when data are generated from skewed or heavy-tailed distributions. The corresponding average relative error (ARE) approaches zero as sample size increases. In data generated from the normal distribution, our ABC performs well. However, the Wan et al. method is best for estimating standard deviation under normal distribution. In the estimation of the mean, our ABC method is best regardless of assumed distribution. ABC is a flexible method for estimating the study-specific mean and standard deviation for meta-analysis, especially with underlying skewed or heavy-tailed distributions. The ABC method can be applied using other reported summary statistics such as the posterior mean and 95 % credible interval when Bayesian analysis has been employed.
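
    A sketch of the idea under simplifying assumptions (lognormal data, uniform priors, a fixed acceptance quantile; the authors' implementation differs in its details): simulate studies from proposed parameters, compare the simulated median, minimum, and maximum to the reported ones, and read the mean and standard deviation off the accepted draws.

      import numpy as np

      rng = np.random.default_rng(7)

      # Reported summaries from a hypothetical study of size n (no raw data).
      n = 100
      s_obs = np.array([4.8, 0.9, 12.1])          # median, minimum, maximum

      # ABC: propose lognormal parameters, simulate a study of size n, and
      # keep the proposals whose simulated summaries are closest to s_obs.
      n_sims = 50000
      mu = rng.uniform(0.0, 3.0, n_sims)
      sigma = rng.uniform(0.05, 1.5, n_sims)
      draws = rng.lognormal(mu[:, None], sigma[:, None], (n_sims, n))
      s_sim = np.stack([np.median(draws, 1), draws.min(1), draws.max(1)],
                       axis=1)
      dist = np.abs((s_sim - s_obs) / s_obs).sum(1)   # scaled distance
      keep = dist < np.quantile(dist, 0.002)          # closest 0.2%

      # Study-level mean and SD implied by the accepted parameter draws.
      m = np.exp(mu[keep] + sigma[keep] ** 2 / 2)
      sd = m * np.sqrt(np.exp(sigma[keep] ** 2) - 1)
      print(f"mean ~ {m.mean():.2f}, SD ~ {sd.mean():.2f} "
            f"({keep.sum()} draws kept)")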

  8. Combining Computational Methods for Hit to Lead Optimization in Mycobacterium tuberculosis Drug Discovery

    PubMed Central

    Ekins, Sean; Freundlich, Joel S.; Hobrath, Judith V.; White, E. Lucile; Reynolds, Robert C

    2013-01-01

    Purpose: Tuberculosis treatments need to be shorter and overcome drug resistance. Our previous large scale phenotypic high-throughput screening against Mycobacterium tuberculosis (Mtb) has identified 737 active compounds and thousands that are inactive. We have used this data for building computational models as an approach to minimize the number of compounds tested. Methods: A cheminformatics clustering approach followed by Bayesian machine learning models (based on publicly available Mtb screening data) was used to illustrate that application of these models for screening set selections can enrich the hit rate. Results: In order to explore chemical diversity around active cluster scaffolds of the dose-response hits obtained from our previous Mtb screens, a set of 1924 commercially available molecules was selected and evaluated for antitubercular activity and cytotoxicity using Vero, THP-1 and HepG2 cell lines, with 4.3%, 4.2% and 2.7% hit rates, respectively. We demonstrate that models incorporating antitubercular and cytotoxicity data in Vero cells can significantly enrich the selection of non-toxic actives compared to random selection. Across all cell lines, the Molecular Libraries Small Molecule Repository (MLSMR) and cytotoxicity model identified ~10% of the hits in the top 1% screened (>10 fold enrichment). We also showed that seven out of nine Mtb active compounds from different academic published studies and eight out of eleven Mtb active compounds from a pharmaceutical screen (GSK) would have been identified by these Bayesian models. Conclusion: Combining clustering and Bayesian models represents a useful strategy for compound prioritization and hit-to-lead optimization of antitubercular agents. PMID:24132686
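
    A generic stand-in for this workflow (random bit vectors replace real fingerprints, and scikit-learn's Bernoulli naive Bayes replaces the Bayesian models used in the study): train on labeled screening data, rank untested compounds by predicted activity, and measure enrichment at the top of the ranked list.

      import numpy as np
      from sklearn.naive_bayes import BernoulliNB

      rng = np.random.default_rng(8)

      # Stand-in data: 2000 compounds with 512-bit fingerprints and a rare
      # "active" label (the real models use Mtb screening data and FCFP-style
      # descriptors; everything here is synthetic).
      n, bits = 2000, 512
      X = (rng.random((n, bits)) < 0.05).astype(int)
      w = rng.normal(0, 1, bits)
      score = X @ w + rng.normal(0, 1, n)
      y = (score > np.quantile(score, 0.95)).astype(int)

      train = np.arange(n) < 1500
      test = ~train
      clf = BernoulliNB().fit(X[train], y[train])

      # Rank untested compounds and compute enrichment in the top 1%.
      proba = clf.predict_proba(X[test])[:, 1]
      order = np.argsort(-proba)
      top = order[: max(1, test.sum() // 100)]
      enrich = y[test][top].mean() / max(y[test].mean(), 1e-9)
      print(f"hit-rate enrichment in top 1%: {enrich:.1f}x")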

  9. Kernel-density estimation and approximate Bayesian computation for flexible epidemiological model fitting in Python.

    PubMed

    Irvine, Michael A; Hollingsworth, T Déirdre

    2018-05-26

    Fitting complex models to epidemiological data is a challenging problem: methodologies can be inaccessible to all but specialists, there may be challenges in adequately describing uncertainty in model fitting, the complex models may take a long time to run, and it can be difficult to fully capture the heterogeneity in the data. We develop an adaptive approximate Bayesian computation scheme to fit a variety of epidemiologically relevant data with minimal hyper-parameter tuning by using an adaptive tolerance scheme. We implement a novel kernel density estimation scheme to capture both dispersed and multi-dimensional data, and directly compare this technique to standard Bayesian approaches. We then apply the procedure to a complex individual-based simulation of lymphatic filariasis, a human parasitic disease. The procedure and examples are released alongside this article as an open access library, with examples to aid researchers to rapidly fit models to data. This demonstrates that an adaptive ABC scheme with a general summary and distance metric is capable of performing model fitting for a variety of epidemiological data. It also does not require significant theoretical background to use and can be made accessible to the diverse epidemiological research community. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  10. Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics

    PubMed Central

    Dybowski, Richard; McKinley, Trevelyan J.; Mastroeni, Pietro; Restif, Olivier

    2013-01-01

    Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered. PMID:24376528
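
    The essence of nested sampling fits in a short program. In the toy below (uniform prior, one-dimensional Gaussian likelihood, and plain rejection sampling for the constrained draws, which production samplers replace with smarter constrained moves), the evidence is accumulated as Z = sum of L_i (X_{i-1} - X_i) over geometrically shrinking prior volumes X_i:

      import numpy as np

      rng = np.random.default_rng(9)

      # Toy evidence computation: uniform prior on [-10, 10], Gaussian
      # likelihood centered at 2.
      def loglike(t):
          return -0.5 * (t - 2.0) ** 2 - 0.5 * np.log(2 * np.pi)

      n_live = 200
      live = rng.uniform(-10, 10, n_live)
      live_ll = loglike(live)

      log_z, x_prev = -np.inf, 1.0
      for i in range(1, 1601):
          worst = int(live_ll.argmin())
          ll_star = live_ll[worst]
          x_i = np.exp(-i / n_live)        # prior volume shrinks geometrically
          log_z = np.logaddexp(log_z, ll_star + np.log(x_prev - x_i))
          x_prev = x_i
          # Replace the worst live point with a prior draw above the
          # likelihood threshold (inefficient rejection; fine for a toy).
          while True:
              t = rng.uniform(-10, 10)
              if loglike(t) > ll_star:
                  live[worst], live_ll[worst] = t, loglike(t)
                  break

      # Analytic answer: Z ~ (1/20) * integral of N(t; 2, 1) dt = 1/20.
      print(f"log-evidence estimate {log_z:.2f} vs analytic {np.log(1/20):.2f}")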

  11. Bayesian networks in overlay recipe optimization

    NASA Astrophysics Data System (ADS)

    Binns, Lewis A.; Reynolds, Greg; Rigden, Timothy C.; Watkins, Stephen; Soroka, Andrew

    2005-05-01

    Currently, overlay measurements are characterized by "recipe", which defines both physical parameters such as focus and illumination, and also software parameters such as the algorithm to be used and regions of interest. Setting up these recipes requires both engineering time and wafer availability on an overlay tool, so reducing these requirements will result in higher tool productivity. One of the significant challenges to automating this process is that the parameters are highly and complexly correlated. At the same time, a high level of traceability and transparency is required in the recipe creation process, so a technique that maintains its decisions in terms of well-defined physical parameters is desirable. Running time should be short, given the system (automatic recipe creation) is being implemented to reduce overheads. Finally, a failure of the system to determine acceptable parameters should be obvious, so a certainty metric is also desirable. The complex, nonlinear interactions make solution by an expert system difficult at best, especially in the verification of the resulting decision network. The transparency requirements tend to preclude classical neural networks and similar techniques. Genetic algorithms and other "global minimization" techniques require too much computational power (given system footprint and cost requirements). A Bayesian network, however, meets these requirements. Such a network, with appropriate priors, can be used during recipe creation / optimization not just to select a good set of parameters, but also to guide the direction of search, by evaluating the network state while only incomplete information is available. As a Bayesian network maintains an estimate of the probability distribution of nodal values, a maximum-entropy approach can be utilized to obtain a working recipe in a minimum or near-minimum number of steps. In this paper we discuss the potential use of a Bayesian network in such a capacity, thereby reducing the amount of engineering intervention required. We discuss the benefits of this approach, especially improved repeatability and traceability of the learning process, and quantification of uncertainty in decisions made. We also consider the problems associated with this approach, especially in detailed construction of network topology, validation of the Bayesian network and the recipes it generates, and issues arising from the integration of a Bayesian network with a complex multithreaded application; these primarily relate to maintaining Bayesian network and system architecture integrity.

  12. Introduction to Bayesian statistical approaches to compositional analyses of transgenic crops 1. Model validation and setting the stage.

    PubMed

    Harrison, Jay M; Breeze, Matthew L; Harrigan, George G

    2011-08-01

    Statistical comparisons of compositional data generated on genetically modified (GM) crops and their near-isogenic conventional (non-GM) counterparts typically rely on classical significance testing. This manuscript presents an introduction to Bayesian methods for compositional analysis along with recommendations for model validation. The approach is illustrated using protein and fat data from two herbicide tolerant GM soybeans (MON87708 and MON87708×MON89788) and a conventional comparator grown in the US in 2008 and 2009. Guidelines recommended by the US Food and Drug Administration (FDA) in conducting Bayesian analyses of clinical studies on medical devices were followed. This study is the first Bayesian approach to GM and non-GM compositional comparisons. The evaluation presented here supports a conclusion that a Bayesian approach to analyzing compositional data can provide meaningful and interpretable results. We further describe the importance of method validation and approaches to model checking if Bayesian approaches to compositional data analysis are to be considered viable by scientists involved in GM research and regulation. Copyright © 2011 Elsevier Inc. All rights reserved.

  13. Bayesian approach to MSD-based analysis of particle motion in live cells.

    PubMed

    Monnier, Nilah; Guo, Syuan-Ming; Mori, Masashi; He, Jun; Lénárt, Péter; Bathe, Mark

    2012-08-08

    Quantitative tracking of particle motion using live-cell imaging is a powerful approach to understanding the mechanism of transport of biological molecules, organelles, and cells. However, inferring complex stochastic motion models from single-particle trajectories in an objective manner is nontrivial due to noise from sampling limitations and biological heterogeneity. Here, we present a systematic Bayesian approach to multiple-hypothesis testing of a general set of competing motion models based on particle mean-square displacements that automatically classifies particle motion, properly accounting for sampling limitations and correlated noise while appropriately penalizing model complexity according to Occam's Razor to avoid over-fitting. We test the procedure rigorously using simulated trajectories for which the underlying physical process is known, demonstrating that it chooses the simplest physical model that explains the observed data. Further, we show that computed model probabilities provide a reliability test for the downstream biological interpretation of associated parameter values. We subsequently illustrate the broad utility of the approach by applying it to disparate biological systems including experimental particle trajectories from chromosomes, kinetochores, and membrane receptors undergoing a variety of complex motions. This automated and objective Bayesian framework easily scales to large numbers of particle trajectories, making it ideal for classifying the complex motion of large numbers of single molecules and cells from high-throughput screens, as well as single-cell-, tissue-, and organism-level studies. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
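
    A deliberately simplified sketch of MSD-based motion-model comparison, using BIC as a crude stand-in for the paper's Bayesian model probabilities and ignoring the correlation between MSD points at different lags that the paper's method accounts for:

      import numpy as np
      from scipy.optimize import curve_fit

      rng = np.random.default_rng(10)

      # Simulated 2D trajectory: diffusion plus a constant drift.
      dt, n_steps, D_true, v_true = 0.1, 500, 0.05, 0.3
      steps = rng.normal(0, np.sqrt(2 * D_true * dt), (n_steps, 2))
      steps[:, 0] += v_true * dt
      pos = np.cumsum(steps, axis=0)

      # Time-averaged mean-square displacement over a range of lags.
      lags = np.arange(1, 50)
      msd = np.array([np.mean(np.sum((pos[l:] - pos[:-l]) ** 2, 1))
                      for l in lags])
      tau = lags * dt

      # Competing motion models for the MSD curve, with starting guesses.
      models = {
          "pure diffusion":  (lambda t, D: 4 * D * t, [0.1]),
          "directed motion": (lambda t, D, v: 4 * D * t + (v * t) ** 2,
                              [0.1, 0.1]),
      }

      for name, (f, p0) in models.items():
          p, _ = curve_fit(f, tau, msd, p0=p0)
          rss = np.sum((msd - f(tau, *p)) ** 2)
          bic = len(tau) * np.log(rss / len(tau)) + len(p) * np.log(len(tau))
          print(f"{name:16s} params={np.round(p, 3)}  BIC={bic:7.1f}")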

  14. Bayesian data fusion for spatial prediction of categorical variables in environmental sciences

    NASA Astrophysics Data System (ADS)

    Gengler, Sarah; Bogaert, Patrick

    2014-12-01

    First developed to predict continuous variables, Bayesian Maximum Entropy (BME) has become a complete framework in the context of space-time prediction since it has been extended to predict categorical variables and mixed random fields. This method proposes solutions to combine several sources of data whatever the nature of the information. However, the various attempts that were made for adapting the BME methodology to categorical variables and mixed random fields faced some limitations, such as a high computational burden. The main objective of this paper is to overcome this limitation by generalizing the Bayesian Data Fusion (BDF) theoretical framework to categorical variables, which is in essence a simplification of the BME method through the convenient conditional independence hypothesis. The BDF methodology for categorical variables is first described and then applied to a practical case study: the estimation of soil drainage classes using a soil map and point observations in the sandy area of Flanders around the city of Mechelen (Belgium). The BDF approach is compared to BME along with more classical approaches, such as Indicator CoKriging (ICK) and logistic regression. Estimators are compared using various indicators, namely the Percentage of Correctly Classified locations (PCC) and the Average Highest Probability (AHP). Although the BDF methodology for categorical variables is in essence a simplification of the BME approach, both methods lead to similar results and have strong advantages compared to ICK and logistic regression.

  15. Bayesian modeling of flexible cognitive control

    PubMed Central

    Jiang, Jiefeng; Heller, Katherine; Egner, Tobias

    2014-01-01

    “Cognitive control” describes endogenous guidance of behavior in situations where routine stimulus-response associations are suboptimal for achieving a desired goal. The computational and neural mechanisms underlying this capacity remain poorly understood. We examine recent advances stemming from the application of a Bayesian learner perspective that provides optimal prediction for control processes. In reviewing the application of Bayesian models to cognitive control, we note that an important limitation in current models is a lack of a plausible mechanism for the flexible adjustment of control over conflict levels changing at varying temporal scales. We then show that flexible cognitive control can be achieved by a Bayesian model with a volatility-driven learning mechanism that modulates dynamically the relative dependence on recent and remote experiences in its prediction of future control demand. We conclude that the emergent Bayesian perspective on computational mechanisms of cognitive control holds considerable promise, especially if future studies can identify neural substrates of the variables encoded by these models, and determine the nature (Bayesian or otherwise) of their neural implementation. PMID:24929218

  16. Convergence analysis of surrogate-based methods for Bayesian inverse problems

    NASA Astrophysics Data System (ADS)

    Yan, Liang; Zhang, Yuan-Xiang

    2017-12-01

    The major challenges in Bayesian inverse problems arise from the need for repeated evaluations of the forward model, as required by Markov chain Monte Carlo (MCMC) methods for posterior sampling. Many attempts at accelerating Bayesian inference have relied on surrogates for the forward model, typically constructed through repeated forward simulations that are performed in an offline phase. Although such approaches can be quite effective at reducing computation cost, there has been little analysis of the effect of the approximation on posterior inference. In this work, we prove error bounds on the Kullback-Leibler (KL) distance between the true posterior distribution and the approximation based on surrogate models. Our rigorous error analysis shows that if the forward model approximation converges at a certain rate in the prior-weighted L2 norm, then the posterior distribution generated by the approximation converges to the true posterior at least two times faster in the KL sense. The error bound on the Hellinger distance is also provided. To provide concrete examples focusing on the use of surrogate-model-based methods, we present an efficient technique for constructing stochastic surrogate models to accelerate the Bayesian inference approach. The Christoffel least squares algorithms, based on generalized polynomial chaos, are used to construct a polynomial approximation of the forward solution over the support of the prior distribution. The numerical strategy and the predicted convergence rates are then demonstrated on nonlinear inverse problems involving the inference of parameters appearing in partial differential equations.

  17. Exploiting neurovascular coupling: a Bayesian sequential Monte Carlo approach applied to simulated EEG fNIRS data

    NASA Astrophysics Data System (ADS)

    Croce, Pierpaolo; Zappasodi, Filippo; Merla, Arcangelo; Chiarelli, Antonio Maria

    2017-08-01

    Objective. Electrical and hemodynamic brain activity are linked through the neurovascular coupling process and they can be simultaneously measured through integration of electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS). Thanks to the lack of electro-optical interference, the two procedures can be easily combined and, whereas EEG provides electrophysiological information, fNIRS can provide measurements of two hemodynamic variables, namely oxygenated and deoxygenated hemoglobin. A Bayesian sequential Monte Carlo approach (particle filter, PF) was applied to simulated recordings of electrical and neurovascular mediated hemodynamic activity, and the advantages of a unified framework were shown. Approach. Multiple neural activities and hemodynamic responses were simulated in the primary motor cortex of a subject brain. EEG and fNIRS recordings were obtained by means of forward models of volume conduction and light propagation through the head. A state space model of combined EEG and fNIRS data was built and its dynamic evolution was estimated through a Bayesian sequential Monte Carlo approach (PF). Main results. We showed the feasibility of the procedure and the improvements in both electrical and hemodynamic brain activity reconstruction when using the PF on combined EEG and fNIRS measurements. Significance. The investigated procedure allows one to combine the information provided by the two methodologies, and, by taking advantage of a physical model of the coupling between electrical and hemodynamic response, to obtain a better estimate of brain activity evolution. Despite the high computational demand, application of such an approach to in vivo recordings could fully exploit the advantages of this combined brain imaging technology.
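
    A minimal bootstrap particle filter on a toy joint state-space model (an AR(1) "neural" state observed directly with noise as the EEG stand-in, and through a leaky hemodynamic integrator as the fNIRS stand-in; all constants invented):

      import numpy as np

      rng = np.random.default_rng(11)

      # Simulate the toy world: latent neural state x and its hemodynamic blur.
      T, n_part = 200, 1000
      x = np.zeros(T)
      for t in range(1, T):
          x[t] = 0.95 * x[t - 1] + rng.normal(0, 0.3)
      kernel = np.exp(-np.arange(20) / 5.0)
      h = np.convolve(x, kernel, mode="full")[:T]
      y_eeg = x + rng.normal(0, 0.8, T)
      y_nirs = h + rng.normal(0, 0.5, T)

      # Bootstrap particle filter over the joint state [x, h]. The leaky
      # integrator h_t = exp(-1/5) h_{t-1} + x_t reproduces the blur kernel.
      parts = rng.normal(0, 1, (n_part, 2))
      est = np.zeros(T)
      for t in range(T):
          parts[:, 0] = 0.95 * parts[:, 0] + rng.normal(0, 0.3, n_part)
          parts[:, 1] = np.exp(-1 / 5.0) * parts[:, 1] + parts[:, 0]
          # Weight by the likelihood of both modalities, then resample.
          logw = (-0.5 * ((y_eeg[t] - parts[:, 0]) / 0.8) ** 2
                  - 0.5 * ((y_nirs[t] - parts[:, 1]) / 0.5) ** 2)
          w = np.exp(logw - logw.max())
          w /= w.sum()
          parts = parts[rng.choice(n_part, n_part, p=w)]
          est[t] = parts[:, 0].mean()

      print("tracking RMSE:", np.sqrt(np.mean((est - x) ** 2)))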

  18. Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures

    PubMed Central

    Moore, Brian R.; Höhna, Sebastian; May, Michael R.; Rannala, Bruce; Huelsenbeck, John P.

    2016-01-01

    Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM. PMID:27512038

  19. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
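
    The Savage-Dickey ratio used above evaluates the Bayes factor for a point null nested within a larger model as the ratio of posterior to prior density at the restricted parameter value. A minimal numeric sketch for a normal mean with known unit variance (a deliberately simpler setting than the paper's semiparametric one) follows.

        import numpy as np
        from scipy import stats

        # Savage-Dickey: for H0: delta = 0 nested in H1 with prior delta ~ N(0, 1),
        # BF01 = p(delta = 0 | data, H1) / p(delta = 0 | H1).
        rng = np.random.default_rng(2)
        data = rng.normal(0.2, 1.0, size=50)    # synthetic data, true delta = 0.2

        n, xbar = len(data), data.mean()
        # Conjugate normal update (known unit variance): posterior of delta
        post_var = 1.0 / (1.0 + n)
        post_mean = post_var * n * xbar

        bf01 = stats.norm.pdf(0.0, post_mean, np.sqrt(post_var)) / stats.norm.pdf(0.0, 0.0, 1.0)
        print("BF01 (evidence for the null):", bf01)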

  20. Why formal learning theory matters for cognitive science.

    PubMed

    Fulop, Sean; Chater, Nick

    2013-01-01

    This article reviews a number of different areas in the foundations of formal learning theory. After outlining the general framework for formal models of learning, the Bayesian approach to learning is summarized. This leads to a discussion of Solomonoff's Universal Prior Distribution for Bayesian learning. Gold's model of identification in the limit is also outlined. We next discuss a number of aspects of learning theory raised in contributed papers, related to both computational and representational complexity. The article concludes with a description of how semi-supervised learning can be applied to the study of cognitive learning models. Throughout this overview, the specific points raised by our contributing authors are connected to the models and methods under review. Copyright © 2013 Cognitive Science Society, Inc.

  1. A New Mathematical Framework for Design Under Uncertainty

    DTIC Science & Technology

    2016-05-05

    blending multiple information sources via auto-regressive stochastic modeling. A computationally efficient machine learning framework is developed based on...sion and machine learning approaches; see Fig. 1. This will lead to a comprehensive description of system performance with less uncertainty than in the...Bayesian optimization of super-cavitating hydrofoils The goal of this study is to demonstrate the capabilities of statistical learning and

  2. Non-Metric Similarity Measures

    DTIC Science & Technology

    2015-03-26

    Sunil Aryal and Kai Ming Ting. (2015) A generic ensemble approach to estimate multi-dimensional likelihood in Bayesian classifier learning...Computational Intelligence. http://onlinelibrary.wiley.com/doi/10.1111/coin.12063/abstract 5.2 List of peer-reviewed conference publications [3] Sunil Aryal...International Conference on Data Mining. 707-711. [4] Sunil Aryal, Kai Ming Ting, Jonathan R. Wells and Takashi Washio. (2014) Improving iForest with

  3. Greenhouse Gas Source Attribution: Measurements Modeling and Uncertainty Quantification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Zhen; Safta, Cosmin; Sargsyan, Khachik

    2014-09-01

    In this project we have developed atmospheric measurement capabilities and a suite of atmospheric modeling and analysis tools that are well suited for verifying emissions of greenhouse gases (GHGs) on an urban-through-regional scale. We have for the first time applied the Community Multiscale Air Quality (CMAQ) model to simulate atmospheric CO2. This will allow for the examination of regional-scale transport and distribution of CO2 along with air pollutants traditionally studied using CMAQ at relatively high spatial and temporal resolution, with the goal of leveraging emissions verification efforts for both air quality and climate. We have developed a bias-enhanced Bayesian inference approach that can remedy the well-known problem of transport model errors in atmospheric CO2 inversions. We have tested the approach using data and model outputs from the TransCom3 global CO2 inversion comparison project. We have also performed two prototyping studies on inversion approaches in the generalized convection-diffusion context. One of these studies employed Polynomial Chaos Expansion to accelerate the evaluation of a regional transport model and enable efficient Markov Chain Monte Carlo sampling of the posterior for Bayesian inference. The other approach uses deterministic inversion of a convection-diffusion-reaction system in the presence of uncertainty. These approaches should, in principle, be applicable to realistic atmospheric problems with moderate adaptation. We outline a regional greenhouse gas source inference system that integrates (1) two approaches to atmospheric dispersion simulation and (2) a class of Bayesian inference and uncertainty quantification algorithms. We use two different and complementary approaches to simulate atmospheric dispersion. Specifically, we use an Eulerian chemical transport model, CMAQ, and a Lagrangian Particle Dispersion Model, FLEXPART-WRF. These two models share the same WRF assimilated meteorology fields, making it possible to perform a hybrid simulation, in which the Eulerian model (CMAQ) can be used to compute the initial condition needed by the Lagrangian model, while the source-receptor relationships for a large state vector can be efficiently computed using the Lagrangian model in its backward mode. In addition, CMAQ has a complete treatment of the atmospheric chemistry of a suite of traditional air pollutants, many of which could help attribute GHGs from different sources. The inference of emissions sources using atmospheric observations is cast as a Bayesian model calibration problem, which is solved using a variety of Bayesian techniques, such as the bias-enhanced Bayesian inference algorithm, which accounts for the intrinsic model deficiency; Polynomial Chaos Expansion, to accelerate model evaluation and Markov Chain Monte Carlo sampling; and Karhunen-Loève (KL) Expansion, to reduce the dimensionality of the state space. We have established an atmospheric measurement site in Livermore, CA and are collecting continuous measurements of CO2, CH4 and other species that are typically co-emitted with these GHGs. Measurements of co-emitted species can assist in attributing the GHGs to different emissions sectors. Automatic calibrations using traceable standards are performed routinely for the gas-phase measurements. We are also collecting standard meteorological data at the Livermore site as well as planetary boundary height measurements using a ceilometer. The location of the measurement site is well suited to sample air transported between the San Francisco Bay area and the California Central Valley.
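
    As a generic sketch of the dimensionality-reduction step named above, the code below builds a truncated Karhunen-Loève expansion of a one-dimensional Gaussian random field from the eigendecomposition of an assumed exponential covariance; the grid, correlation length and truncation level are illustrative choices, not those of the project.

        import numpy as np

        # Karhunen-Loeve expansion of a 1-D Gaussian random field on a grid:
        # keep the leading eigenmodes of the covariance to shrink the state space.
        x = np.linspace(0.0, 1.0, 200)
        ell = 0.2                                            # assumed correlation length
        C = np.exp(-np.abs(x[:, None] - x[None, :]) / ell)   # exponential covariance

        eigvals, eigvecs = np.linalg.eigh(C)                 # ascending eigenvalues
        order = np.argsort(eigvals)[::-1]
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]

        k = 10                                               # truncate to 10 KL modes
        rng = np.random.default_rng(3)
        xi = rng.normal(size=k)                              # independent N(0, 1) coefficients
        field = eigvecs[:, :k] @ (np.sqrt(eigvals[:k]) * xi)

        print(f"variance captured by {k} modes: {eigvals[:k].sum() / eigvals.sum():.1%}")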

  4. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  5. A Tutorial in Bayesian Potential Outcomes Mediation Analysis.

    PubMed

    Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P

    2018-01-01

    Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.
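
    A minimal sketch of the computation at the heart of such an analysis: sample the posteriors of the two regression paths and form the posterior of the indirect effect as their product. The version below assumes flat priors, plug-in residual variances and synthetic data, so it is only a rough approximation to a fully Bayesian potential outcomes mediation model.

        import numpy as np

        rng = np.random.default_rng(4)
        n = 200

        # Synthetic mediation data: X -> M -> Y with a = 0.5, b = 0.7
        X = rng.normal(size=n)
        M = 0.5 * X + rng.normal(size=n)
        Y = 0.7 * M + 0.2 * X + rng.normal(size=n)

        def posterior_draws(design, y, ndraws=5000, seed=5):
            # Approximate posterior of regression coefficients under a flat
            # prior, with the residual variance plugged in.
            r = np.random.default_rng(seed)
            beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
            resid = y - design @ beta_hat
            s2 = resid @ resid / (len(y) - design.shape[1])
            cov = s2 * np.linalg.inv(design.T @ design)
            return r.multivariate_normal(beta_hat, cov, size=ndraws)

        Xd = np.column_stack([np.ones(n), X])        # mediator model: M ~ 1 + X
        XMd = np.column_stack([np.ones(n), X, M])    # outcome model: Y ~ 1 + X + M

        a = posterior_draws(Xd, M)[:, 1]             # path X -> M
        b = posterior_draws(XMd, Y, seed=6)[:, 2]    # path M -> Y given X

        indirect = a * b                             # posterior of the indirect effect
        lo, hi = np.percentile(indirect, [2.5, 97.5])
        print(f"indirect effect: mean={indirect.mean():.3f}, 95% CI=({lo:.3f}, {hi:.3f})")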

  6. A guide to Bayesian model selection for ecologists

    USGS Publications Warehouse

    Hooten, Mevin B.; Hobbs, N.T.

    2015-01-01

    The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.

  7. A new approach for measuring power spectra and reconstructing time series in active galactic nuclei

    NASA Astrophysics Data System (ADS)

    Li, Yan-Rong; Wang, Jian-Min

    2018-05-01

    We provide a new approach to measure power spectra and reconstruct time series in active galactic nuclei (AGNs) based on the fact that the Fourier transform of AGN stochastic variations is a series of complex Gaussian random variables. The approach parametrizes a stochastic series in frequency domain and transforms it back to time domain to fit the observed data. The parameters and their uncertainties are derived in a Bayesian framework, which also allows us to compare the relative merits of different power spectral density models. The well-developed fast Fourier transform algorithm together with parallel computation enables an acceptable time complexity for the approach.
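
    The generative idea can be sketched in a few lines: draw each Fourier coefficient as a complex Gaussian random variable scaled by an assumed power spectral density, then invert the FFT to obtain a time series (essentially the classic Timmer and König recipe). The power-law PSD, series length and scaling conventions below are illustrative assumptions, not the authors' fitting code.

        import numpy as np

        rng = np.random.default_rng(7)

        # Synthesize a series with a power-law PSD, P(f) ~ f^-2, by drawing
        # each Fourier coefficient as a complex Gaussian and inverting the FFT.
        n, dt = 1024, 1.0
        freqs = np.fft.rfftfreq(n, dt)

        psd = np.zeros_like(freqs)
        psd[1:] = freqs[1:] ** -2.0              # skip f = 0 (the mean term)

        # Real and imaginary parts are independent Gaussians scaled by sqrt(P/2)
        re = rng.normal(size=freqs.size) * np.sqrt(psd / 2.0)
        im = rng.normal(size=freqs.size) * np.sqrt(psd / 2.0)
        coeffs = re + 1j * im
        coeffs[0] = 0.0                          # zero-mean series
        if n % 2 == 0:
            coeffs[-1] = re[-1]                  # the Nyquist bin must be real

        series = np.fft.irfft(coeffs, n=n)
        print("synthesized light curve std:", series.std())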

  8. A Bayesian approach to the characterization of electroencephalographic recordings in premature infants

    NASA Astrophysics Data System (ADS)

    Mitchell, Timothy J.

    Preterm infants are particularly susceptible to cerebral injury, and electroencephalographic (EEG) recordings provide an important diagnostic tool for determining cerebral health. However, interpreting these EEG recordings is challenging and requires the skills of a trained electroencephalographer. Because these EEG specialists are rare, an automated interpretation of newborn EEG recordings would increase access to an important diagnostic tool for physicians. To automate this procedure, we employ a novel Bayesian approach to compute the probability of EEG features (waveforms) including suppression, delta brushes, and delta waves. The power of this approach lies not only in its ability to closely mimic the techniques used by EEG specialists, but also its ability to be generalized to identify other waveforms that may be of interest for future work. The results of these calculations are used in a program designed to output simple statistics related to the presence or absence of such features. Direct comparison of the software with expert human readers has indicated satisfactory performance, and the algorithm has shown promise in its ability to distinguish between infants with normal neurodevelopmental outcome and those with poor neurodevelopmental outcome.

  9. Fuzzy Naive Bayesian model for medical diagnostic decision support.

    PubMed

    Wagholikar, Kavishwar B; Vijayraghavan, Sundararajan; Deshpande, Ashok W

    2009-01-01

    This work relates to the development of computational algorithms to provide decision support to physicians. The authors propose a Fuzzy Naive Bayesian (FNB) model for medical diagnosis, which extends the Fuzzy Bayesian approach proposed by Okuda. A physician-interview-based method is described for defining the orthogonal fuzzy symptom information system required to apply the model. For the purpose of elaboration and elicitation of characteristics, the algorithm is applied to a simple simulated dataset and compared with the conventional Naive Bayes (NB) approach. As a preliminary evaluation of FNB in a real-world scenario, the comparison is repeated on a real fuzzy dataset of 81 patients diagnosed with infectious diseases. The case study on the simulated dataset elucidates that FNB can be optimal over NB for diagnosing patients with imprecise-fuzzy information, on account of the following characteristics: 1) it can model the information that values of some attributes are semantically closer than values of other attributes, and 2) it offers a mechanism to temper exaggerations in patient information. Although the algorithm requires precise training data, its utility for fuzzy training data is argued for. This is supported by the case study on the infectious disease dataset, which indicates optimality of FNB over NB for the infectious disease domain. Further case studies on large datasets are required to establish the utility of FNB.

  10. Probabilistic Space Weather Forecasting: a Bayesian Perspective

    NASA Astrophysics Data System (ADS)

    Camporeale, E.; Chandorkar, M.; Borovsky, J.; Care', A.

    2017-12-01

    Most Space Weather forecasts, at both the operational and research level, are not probabilistic in nature. Unfortunately, a prediction that does not provide a confidence level is not very useful in a decision-making scenario. Nowadays, forecast models range from purely data-driven, machine learning algorithms to physics-based approximations of first-principle equations (and everything that sits in between). Uncertainties pervade all such models, at every level: from the raw data to finite-precision implementations of numerical methods. The most rigorous way of quantifying the propagation of uncertainties is by embracing a Bayesian probabilistic approach. One of the simplest and most robust machine learning techniques in the Bayesian framework is Gaussian Process regression and classification. Here, we present the application of Gaussian Processes to the problems of the DST geomagnetic index forecast, solar wind type classification, and the estimation of diffusion parameters in radiation belt modeling. In each of these very diverse problems, the GP approach rigorously provides forecasts in the form of predictive distributions. In turn, these distributions can be used as input for ensemble simulations in order to quantify the amplification of uncertainties. We show that we have achieved excellent results in all of the standard metrics used to evaluate our models, with very modest computational cost.
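
    A minimal Gaussian Process regression sketch with a squared-exponential kernel shows how such predictive distributions arise; the training data, kernel and hyperparameters are toy assumptions, not the space-weather models discussed above.

        import numpy as np

        rng = np.random.default_rng(8)

        def rbf(a, b, length=1.0, amp=1.0):
            # Squared-exponential (RBF) kernel
            d = a[:, None] - b[None, :]
            return amp * np.exp(-0.5 * (d / length) ** 2)

        # Toy training data (stand-in for an index time history)
        x_tr = np.linspace(0, 10, 25)
        y_tr = np.sin(x_tr) + 0.1 * rng.normal(size=x_tr.size)
        x_te = np.linspace(0, 12, 100)

        noise = 0.1 ** 2
        K = rbf(x_tr, x_tr) + noise * np.eye(x_tr.size)
        Ks, Kss = rbf(x_tr, x_te), rbf(x_te, x_te)

        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr))
        mean = Ks.T @ alpha                            # predictive mean
        v = np.linalg.solve(L, Ks)
        var = np.diag(Kss) - np.sum(v * v, axis=0)     # predictive variance
        print("95% band half-width at the last test point:", 1.96 * np.sqrt(var[-1]))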

  11. Bayesian analysis of energy and count rate data for detection of low count rate radioactive sources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klumpp, John

    We propose a radiation detection system which generates its own discrete sampling distribution based on past measurements of background. The advantage to this approach is that it can take into account variations in background with respect to time, location, energy spectra, detector-specific characteristics (i.e. different efficiencies at different count rates and energies), etc. This would therefore be a 'machine learning' approach, in which the algorithm updates and improves its characterization of background over time. The system would have a 'learning mode,' in which it measures and analyzes background count rates, and a 'detection mode,' in which it compares measurements from an unknown source against its unique background distribution. By characterizing and accounting for variations in the background, general purpose radiation detectors can be improved with little or no increase in cost. The statistical and computational techniques to perform this kind of analysis have already been developed. The necessary signal analysis can be accomplished using existing Bayesian algorithms which account for multiple channels, multiple detectors, and multiple time intervals. Furthermore, Bayesian machine-learning techniques have already been developed which, with trivial modifications, can generate appropriate decision thresholds based on the comparison of new measurements against a nonparametric sampling distribution. (authors)
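
    One simple way to realize the 'learning mode'/'detection mode' split described above is a conjugate gamma-Poisson update of the background count rate, with alarms raised in the upper tail of the posterior predictive (a negative binomial). The numbers below are hypothetical, and the sketch ignores the multi-channel, multi-detector structure the authors account for.

        import numpy as np
        from scipy import stats

        # Learning mode: conjugate gamma-Poisson update of the background rate.
        rng = np.random.default_rng(9)
        background = rng.poisson(5.0, size=500)     # counts per counting interval

        alpha0, beta0 = 1.0, 0.1                    # weak gamma prior on the rate
        alpha = alpha0 + background.sum()
        beta = beta0 + background.size              # posterior: Gamma(alpha, rate=beta)

        # Detection mode: the posterior predictive for a new interval is
        # negative binomial; flag counts in its extreme upper tail.
        p = beta / (beta + 1.0)
        threshold = stats.nbinom.ppf(0.999, alpha, p)

        new_measurement = 14
        print("alarm:", new_measurement > threshold)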

  12. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan

    In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic amplitude versus angle (AVA) and controlled source electromagnetic (CSEM) data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo (MCMC) sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis (DREAM) and Adaptive Metropolis (AM) samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and CSEM data. The multi-chain MCMC is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic AVA and CSEM joint inversion provides better estimation of reservoir saturations than the seismic AVA-only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated – reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.

  13. Bayesian Monte Carlo and Maximum Likelihood Approach for ...

    EPA Pesticide Factsheets

    Model uncertainty estimation and risk assessment are essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood estimation (BMCML), to calibrate a lake oxygen recovery model. We first derive an analytical solution of the differential equation governing lake-averaged oxygen dynamics as a function of time-variable wind speed. Statistical inferences on model parameters and predictive uncertainty are then drawn by Bayesian conditioning of the analytical solution on observed daily wind speed and oxygen concentration data obtained from an earlier study during two recovery periods on a eutrophic lake in upstate New York. The model is calibrated using oxygen recovery data for one year and statistical inferences are validated using recovery data for another year. Compared with an essentially two-step regression and optimization approach, the BMCML results are more comprehensive and perform relatively better in predicting the observed temporal dissolved oxygen (DO) levels in the lake. BMCML also produces calibration and validation results comparable with those obtained using the popular Markov Chain Monte Carlo (MCMC) technique, and is computationally simpler and easier to implement than MCMC. Next, using the calibrated model, we derive an optimal relationship between liquid film-transfer coefficien

  14. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    NASA Astrophysics Data System (ADS)

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Bao, Jie; Swiler, Laura

    2017-12-01

    In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated - reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
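
    For orientation, the sketch below implements the plain Adaptive Metropolis ingredient of such hybrid samplers: the proposal covariance is periodically re-estimated from the chain's own history (with the usual 2.38^2/d scaling). The 2-D Gaussian target is hypothetical, not the reservoir inversion posterior.

        import numpy as np

        rng = np.random.default_rng(10)

        def log_post(theta):
            # Hypothetical correlated 2-D Gaussian posterior (stand-in target)
            cov = np.array([[1.0, 0.9], [0.9, 1.0]])
            return -0.5 * theta @ np.linalg.solve(cov, theta)

        d, nsteps, adapt_start = 2, 20000, 1000
        theta = np.zeros(d)
        chain = np.zeros((nsteps, d))
        prop_cov = 0.1 * np.eye(d)

        for i in range(nsteps):
            # Adaptive Metropolis: refresh the proposal covariance from history
            if i >= adapt_start and i % 100 == 0:
                prop_cov = (2.38 ** 2 / d) * np.cov(chain[:i].T) + 1e-6 * np.eye(d)
            prop = rng.multivariate_normal(theta, prop_cov)
            if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
                theta = prop
            chain[i] = theta

        print("posterior correlation estimate:", np.corrcoef(chain[5000:].T)[0, 1])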

  15. EEG-fMRI Bayesian framework for neural activity estimation: a simulation study.

    PubMed

    Croce, Pierpaolo; Basti, Alessio; Marzetti, Laura; Zappasodi, Filippo; Gratta, Cosimo Del

    2016-12-01

    Due to the complementary nature of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), and given the possibility of simultaneous acquisition, joint data analysis can afford a better estimation of the underlying neural activity. In this simulation study we show the benefit of joint EEG-fMRI neural activity estimation in a Bayesian framework. We built a dynamic Bayesian framework in order to perform joint EEG-fMRI neural activity time course estimation. The neural activity originates in a given brain area and is detected by means of both measurement techniques. We chose a resting-state neural activity situation to address the worst case in terms of the signal-to-noise ratio. To infer information from EEG and fMRI concurrently we used a tool belonging to the sequential Monte Carlo (SMC) methods: the particle filter (PF). First, despite a high computational cost, we showed the feasibility of such an approach. Second, we obtained an improvement in neural activity reconstruction when using both EEG and fMRI measurements. The proposed simulation shows the improvements in neural activity reconstruction with simultaneous EEG-fMRI data. The application of such an approach to real data allows a better comprehension of the neural dynamics.

  16. Reconstructing constructivism: Causal models, Bayesian learning mechanisms and the theory theory

    PubMed Central

    Gopnik, Alison; Wellman, Henry M.

    2012-01-01

    We propose a new version of the “theory theory” grounded in the computational framework of probabilistic causal models and Bayesian learning. Probabilistic models allow a constructivist but rigorous and detailed approach to cognitive development. They also explain the learning of both more specific causal hypotheses and more abstract framework theories. We outline the new theoretical ideas, explain the computational framework in an intuitive and non-technical way, and review an extensive but relatively recent body of empirical results that supports these ideas. These include new studies of the mechanisms of learning. Children infer causal structure from statistical information, through their own actions on the world and through observations of the actions of others. Studies demonstrate these learning mechanisms in children from 16 months to 4 years old and include research on causal statistical learning, informal experimentation through play, and imitation and informal pedagogy. They also include studies of the variability and progressive character of intuitive theory change, particularly theory of mind. These studies investigate both the physical and psychological and social domains. We conclude with suggestions for further collaborative projects between developmental and computational cognitive scientists. PMID:22582739

  17. Bayes' theorem application in the measure information diagnostic value assessment

    NASA Astrophysics Data System (ADS)

    Orzechowski, Piotr D.; Makal, Jaroslaw; Nazarkiewicz, Andrzej

    2006-03-01

    The paper presents the application of a Bayesian method to assessing the diagnostic value of measurement information in a computer-aided diagnosis system. The computer system described here is built on a Bayesian network and is used in the diagnosis of Benign Prostatic Hyperplasia (BPH). The graphical diagnostic model makes it possible to juxtapose experts' knowledge with data.
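
    The update underlying any such diagnostic network is Bayes' theorem; the single-finding example below uses hypothetical numbers, not the probabilities elicited for the BPH system.

        # Bayes' theorem for one diagnostic finding (hypothetical numbers):
        # P(disease | positive) = P(positive | disease) P(disease) / P(positive)
        prior = 0.10    # assumed prevalence of the condition
        sens = 0.85     # P(positive finding | disease)
        fpr = 0.20      # P(positive finding | no disease)

        evidence = sens * prior + fpr * (1.0 - prior)
        posterior = sens * prior / evidence
        print(f"P(disease | positive finding) = {posterior:.3f}")   # ~0.321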

  18. Bayesian Asymmetric Regression as a Means to Estimate and Evaluate Oral Reading Fluency Slopes

    ERIC Educational Resources Information Center

    Solomon, Benjamin G.; Forsberg, Ole J.

    2017-01-01

    Bayesian techniques have become increasingly present in the social sciences, fueled by advances in computer speed and the development of user-friendly software. In this paper, we forward the use of Bayesian Asymmetric Regression (BAR) to monitor intervention responsiveness when using Curriculum-Based Measurement (CBM) to assess oral reading…

  19. Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837

    ERIC Educational Resources Information Center

    Levy, Roy

    2014-01-01

    Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…

  20. A Bayesian approach to multivariate measurement system assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamada, Michael Scott

    This article considers system assessment for multivariate measurements and presents a Bayesian approach to analyzing gauge R&R study data. The evaluation of variances for univariate measurement becomes the evaluation of covariance matrices for multivariate measurements. The Bayesian approach ensures positive definite estimates of the covariance matrices and easily provides their uncertainty. Furthermore, various measurement system assessment criteria are easily evaluated. The approach is illustrated with data from a real gauge R&R study as well as simulated data.

  1. A Bayesian approach to multivariate measurement system assessment

    DOE PAGES

    Hamada, Michael Scott

    2016-07-01

    This article considers system assessment for multivariate measurements and presents a Bayesian approach to analyzing gauge R&R study data. The evaluation of variances for univariate measurement becomes the evaluation of covariance matrices for multivariate measurements. The Bayesian approach ensures positive definite estimates of the covariance matrices and easily provides their uncertainty. Furthermore, various measurement system assessment criteria are easily evaluated. The approach is illustrated with data from a real gauge R&R study as well as simulated data.

  2. Computer-Based Model Calibration and Uncertainty Analysis: Terms and Concepts

    DTIC Science & Technology

    2015-07-01

    uncertainty analyses throughout the lifecycle of planning, designing, and operating of Civil Works flood risk management projects as described in...value 95% of the time. In the frequentist approach to PE, model parameters are regarded as having true values, and their estimate is based on the...in catchment models. 1. Evaluating parameter uncertainty. Water Resources Research 19(5):1151–1172. Lee, P. M. 2012. Bayesian statistics: An

  3. Formalizing Neurath's ship: Approximate algorithms for online causal learning.

    PubMed

    Bramley, Neil R; Dayan, Peter; Griffiths, Thomas L; Lagnado, David A

    2017-04-01

    Higher-level cognition depends on the ability to learn models of the world. We can characterize this at the computational level as a structure-learning problem with the goal of best identifying the prevailing causal relationships among a set of relata. However, the computational cost of performing exact Bayesian inference over causal models grows rapidly as the number of relata increases. This implies that the cognitive processes underlying causal learning must be substantially approximate. A powerful class of approximations that focuses on the sequential absorption of successive inputs is captured by the Neurath's ship metaphor in philosophy of science, where theory change is cast as a stochastic and gradual process shaped as much by people's limited willingness to abandon their current theory when considering alternatives as by the ground truth they hope to approach. Inspired by this metaphor and by algorithms for approximating Bayesian inference in machine learning, we propose an algorithmic-level model of causal structure learning under which learners represent only a single global hypothesis that they update locally as they gather evidence. We propose a related scheme for understanding how, under these limitations, learners choose informative interventions that manipulate the causal system to help elucidate its workings. We find support for our approach in the analysis of 3 experiments. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  4. Visual shape perception as Bayesian inference of 3D object-centered shape representations.

    PubMed

    Erdogan, Goker; Jacobs, Robert A

    2017-11-01

    Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  5. Semisupervised learning using Bayesian interpretation: application to LS-SVM.

    PubMed

    Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain

    2011-04-01

    Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.

  6. Spatio Temporal EEG Source Imaging with the Hierarchical Bayesian Elastic Net and Elitist Lasso Models

    PubMed Central

    Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A.; Valdés-Hernández, Pedro A.; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A.

    2017-01-01

    The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution, and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of (L1 norm-based) and (L2 norm-based) constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, usually leading to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but it relies on computationally intensive Monte Carlo/Expectation Maximization methods, which makes its application to the EEG IP impractical, while the ELASSO has not previously been considered in a Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for the ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime, we illustrate that our methods are able to recover complicated source setups more accurately, and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios, than the classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods. The Matlab codes used in this work, including Simulations, Methods, Quality Measures and Visualization Routines, are freely available on a public website. PMID:29200994

  7. Spatio Temporal EEG Source Imaging with the Hierarchical Bayesian Elastic Net and Elitist Lasso Models.

    PubMed

    Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A; Valdés-Hernández, Pedro A; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A

    2017-01-01

    The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution, and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of (L1 norm-based) and (L2 norm-based) constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, usually leading to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but it relies on computationally intensive Monte Carlo/Expectation Maximization methods, which makes its application to the EEG IP impractical, while the ELASSO has not previously been considered in a Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for the ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime, we illustrate that our methods are able to recover complicated source setups more accurately, and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios, than the classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods. The Matlab codes used in this work, including Simulations, Methods, Quality Measures and Visualization Routines, are freely available on a public website.

  8. Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model.

    PubMed

    Dong, Linsong; Wang, Zhiyong

    2018-06-11

    Genomic prediction is feasible for estimating genomic breeding values because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods propose more flexible assumptions for the distributions of SNP effects. However, most Bayesian methods are performed based on Markov chain Monte Carlo (MCMC) algorithms, leading to computational efficiency challenges. Hence, some fast Bayesian approaches, such as fast BayesB (fBayesB), were proposed to speed up the calculation. This study proposes another fast Bayesian method, termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that a SNP has a non-zero effect with probability γ, drawn from a normal density with a common variance. The simulated data from the QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. The results showed that when γ was set to a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC yielded predictive abilities very similar to those of BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB also yielded results similar to those of fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, whereas no such problem was observed for fBayesC. Moreover, the computational speed of fBayesC was significantly faster than that of BayesC, making fBayesC a promising method for genomic prediction.
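
    The prior named above is a spike-and-slab mixture: each SNP effect is zero with probability 1 - γ and otherwise drawn from a normal with a variance common to all SNPs. The sketch below merely simulates that prior and a resulting genomic value; all quantities are illustrative, and it is not the fBayesC estimation algorithm itself.

        import numpy as np

        rng = np.random.default_rng(11)

        # BayesC-style prior: a SNP effect is zero with probability 1 - gamma,
        # otherwise drawn from N(0, sigma2) with a variance shared by all SNPs.
        n_snps, gamma, sigma2 = 10000, 0.01, 0.05

        nonzero = rng.uniform(size=n_snps) < gamma
        effects = np.where(nonzero, rng.normal(0.0, np.sqrt(sigma2), n_snps), 0.0)

        # Genomic value of one individual with a {0, 1, 2} genotype vector
        genotypes = rng.integers(0, 3, size=n_snps)
        gebv = genotypes @ effects
        print(f"{nonzero.sum()} SNPs with nonzero effects; genomic value = {gebv:.3f}")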

  9. Development and comparison in uncertainty assessment based Bayesian modularization method in hydrological modeling

    NASA Astrophysics Data System (ADS)

    Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn

    2013-04-01

    With respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can adequately treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with the standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates, as measured by the Nash-Sutcliffe efficiency, and the best uncertainty estimates for low, medium and entire flows compared to the standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian methods.

  10. Bayesian analysis of zero inflated spatiotemporal HIV/TB child mortality data through the INLA and SPDE approaches: Applied to data observed between 1992 and 2010 in rural North East South Africa

    NASA Astrophysics Data System (ADS)

    Musenge, Eustasius; Chirwa, Tobias Freeman; Kahn, Kathleen; Vounatsou, Penelope

    2013-06-01

    Longitudinal mortality data with few deaths usually have problems of zero-inflation. This paper presents and applies two Bayesian models which cater for zero-inflation, spatial and temporal random effects. To reduce the computational burden experienced when a large number of geo-locations are treated as a Gaussian field (GF), we transformed the field to a Gaussian Markov Random Field (GMRF) by triangulation. We then modelled the spatial random effects using the Stochastic Partial Differential Equations (SPDEs). Inference was done using a computationally efficient alternative to Markov chain Monte Carlo (MCMC) called Integrated Nested Laplace Approximation (INLA) suited for GMRF. The models were applied to data from 71,057 children aged 0 to under 10 years from rural north-east South Africa living in 15,703 households over the years 1992-2010. We found protective effects on HIV/TB mortality due to greater birth weight, older age and more antenatal clinic visits during pregnancy (adjusted RR (95% CI)): 0.73(0.53;0.99), 0.18(0.14;0.22) and 0.96(0.94;0.97) respectively. Therefore childhood HIV/TB mortality could be reduced if mothers are better catered for during pregnancy as this can reduce mother-to-child transmissions and contribute to improved birth weights. The INLA and SPDE approaches are computationally good alternatives in modelling large multilevel spatiotemporal GMRF data structures.

  11. Bayesian structural equation modeling in sport and exercise psychology.

    PubMed

    Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus

    2015-08-01

    Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.

  12. Model-based Bayesian inference for ROC data analysis

    NASA Astrophysics Data System (ADS)

    Lei, Tianhu; Bae, K. Ty

    2013-03-01

    This paper presents a study of model-based Bayesian inference applied to Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a covariate variable taking the values zero and one to express binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by the Markov Chain Monte Carlo (MCMC) method, carried out with Bayesian inference Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach considers model parameters as random variables characterized by prior distributions. With a substantial number of simulated samples generated by the sampling algorithm, posterior distributions of the parameters, as well as the parameters themselves, can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log-concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log-concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed using the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.

  13. A Bayesian Approach to Real-Time Earthquake Phase Association

    NASA Astrophysics Data System (ADS)

    Benz, H.; Johnson, C. E.; Earle, P. S.; Patton, J. M.

    2014-12-01

    Real-time location of seismic events requires a robust and extremely efficient means of associating and identifying seismic phases with hypothetical sources. An association algorithm converts a series of phase arrival times into a catalog of earthquake hypocenters. The classical approach, based on time-space stacking of the locus of possible hypocenters for each phase arrival using the principle of acoustic reciprocity, has been in use now for many years. One of the most significant problems that has emerged over time with this approach is related to the extreme variations in seismic station density throughout the global seismic network. To address this problem we have developed a novel Bayesian association algorithm, which looks at the association problem as a dynamically evolving complex system of "many to many relationships". While the end result must be an array of one-to-many relations (one earthquake, many phases), during the association process the situation is quite different. Both the evolving possible hypocenters and the relationships between phases and all nascent hypocenters are many to many (many earthquakes, many phases). The computational framework we are using to address this is a responsive, NoSQL graph database where the earthquake-phase associations are represented as intersecting Bayesian Learning Networks. The approach directly addresses the network inhomogeneity issue while at the same time allowing the inclusion of other kinds of data (e.g., seismic beams, station noise characteristics, priors on estimated location of the seismic source) by representing the locus of intersecting hypothetical loci for a given datum as joint probability density functions.

  14. Approximate Bayesian computation in large-scale structure: constraining the galaxy-halo connection

    NASA Astrophysics Data System (ADS)

    Hahn, ChangHoon; Vakili, Mohammadjavad; Walsh, Kilian; Hearin, Andrew P.; Hogg, David W.; Campbell, Duncan

    2017-08-01

    Standard approaches to Bayesian parameter inference in large-scale structure assume a Gaussian functional form (chi-squared form) for the likelihood. This assumption, in detail, cannot be correct. Likelihood-free inference such as approximate Bayesian computation (ABC) relaxes these restrictions and makes inference possible without making any assumptions on the likelihood. Instead, ABC relies on a forward generative model of the data and a metric for measuring the distance between the model and data. In this work, we demonstrate that ABC is feasible for LSS parameter inference by using it to constrain parameters of the halo occupation distribution (HOD) model for populating dark matter haloes with galaxies. Using a specific implementation of ABC supplemented with population Monte Carlo importance sampling, a generative forward model using the HOD, and a distance metric based on galaxy number density, the two-point correlation function and the galaxy group multiplicity function, we constrain the HOD parameters of a mock observation generated from selected 'true' HOD parameters. The parameter constraints we obtain from ABC are consistent with the 'true' HOD parameters, demonstrating that ABC can be reliably used for parameter inference in LSS. Furthermore, we compare our ABC constraints to constraints we obtain using a pseudo-likelihood function of Gaussian form with MCMC and find consistent HOD parameter constraints. Ultimately, our results suggest that ABC can and should be applied in parameter inference for LSS analyses.
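
    A minimal ABC rejection sampler shows the likelihood-free logic: simulate from the generative model under prior draws and keep the parameters whose summary statistic falls within a tolerance of the observed one. The toy Poisson model, summary statistic and tolerance below are assumptions, far simpler than the HOD forward model and population Monte Carlo scheme used in the paper.

        import numpy as np

        rng = np.random.default_rng(12)

        # Toy ABC: infer a Poisson rate without evaluating a likelihood,
        # using the sample mean as the summary statistic.
        observed = rng.poisson(4.0, size=100)
        s_obs = observed.mean()

        accepted = []
        for _ in range(20000):
            rate = rng.uniform(0.0, 10.0)                   # draw from the prior
            sim = rng.poisson(rate, size=observed.size)     # forward generative model
            if abs(sim.mean() - s_obs) < 0.1:               # distance within tolerance
                accepted.append(rate)

        post = np.array(accepted)
        print(f"ABC posterior mean: {post.mean():.2f} ({post.size} accepted draws)")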

  15. Uncertainty quantification for nuclear density functional theory and information content of new measurements.

    PubMed

    McDonnell, J D; Schunck, N; Higdon, D; Sarich, J; Wild, S M; Nazarewicz, W

    2015-03-27

    Statistical tools of uncertainty quantification can be used to assess the information content of measured observables with respect to present-day theoretical models, to estimate model errors and thereby improve predictive capability, to extrapolate beyond the regions reached by experiment, and to provide meaningful input to applications and planned measurements. To showcase new opportunities offered by such tools, we make a rigorous analysis of theoretical statistical uncertainties in nuclear density functional theory using Bayesian inference methods. By considering the recent mass measurements from the Canadian Penning Trap at Argonne National Laboratory, we demonstrate how the Bayesian analysis and a direct least-squares optimization, combined with high-performance computing, can be used to assess the information content of the new data with respect to a model based on the Skyrme energy density functional approach. Employing the posterior probability distribution computed with a Gaussian process emulator, we apply the Bayesian framework to propagate theoretical statistical uncertainties in predictions of nuclear masses, two-neutron dripline, and fission barriers. Overall, we find that the new mass measurements do not impose a constraint that is strong enough to lead to significant changes in the model parameters. The example discussed in this study sets the stage for quantifying and maximizing the impact of new measurements with respect to current modeling and guiding future experimental efforts, thus enhancing the experiment-theory cycle in the scientific method.

  16. Gravity dual for a model of perception

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakayama, Yu, E-mail: nakayama@berkeley.edu

    2011-01-15

    One of the salient features of human perception is its invariance under dilatation in addition to the Euclidean group, but its non-invariance under special conformal transformation. We investigate a holographic approach to the information processing in image discrimination with this feature. We claim that a strongly coupled analogue of the statistical model proposed by Bialek and Zee can be holographically realized in scale invariant but non-conformal Euclidean geometries. We identify the Bayesian probability distribution of our generalized Bialek-Zee model with the GKPW partition function of the dual gravitational system. We provide a concrete example of the geometric configuration based on a vector condensation model coupled with the Euclidean Einstein-Hilbert action. From the proposed geometry, we study sample correlation functions to compute the Bayesian probability distribution.

  17. Bayesian Treatment of Uncertainty in Environmental Modeling: Optimization, Sampling and Data Assimilation Using the DREAM Software Package

    NASA Astrophysics Data System (ADS)

    Vrugt, J. A.

    2012-12-01

    In the past decade much progress has been made in the treatment of uncertainty in earth systems modeling. Whereas initial approaches focused mostly on quantification of parameter and predictive uncertainty, recent methods attempt to disentangle the effects of parameter, forcing (input) data, model structural and calibration data errors. In this talk I will highlight some of our recent work involving theory, concepts and applications of Bayesian parameter and/or state estimation. In particular, new methods for sequential Monte Carlo (SMC) and Markov Chain Monte Carlo (MCMC) simulation will be presented, with emphasis on massively parallel distributed computing and quantification of model structural errors. The theoretical and numerical developments will be illustrated using model-data synthesis problems in hydrology, hydrogeology and geophysics.

  18. Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences

    PubMed Central

    Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric

    2016-01-01

    Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
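
    To make the contrast between the two priors concrete, here is a schematic of each on a single cross-loading λ (our notation and hyperparameters, not the authors'):

    ```latex
    \text{BSEM-RP (ridge):}\quad
      \lambda \sim \mathcal{N}(0,\ \sigma_0^2), \qquad \sigma_0^2\ \text{small};
    \\[4pt]
    \text{BSEM-SSP (spike and slab):}\quad
      \lambda \mid \gamma \sim (1-\gamma)\,\mathcal{N}(0,\ \sigma_{\text{spike}}^2)
        + \gamma\,\mathcal{N}(0,\ \sigma_{\text{slab}}^2), \qquad
      \gamma \sim \operatorname{Bernoulli}(\pi).
    ```

    Posterior mass on γ = 1 then acts as a variable-selection decision for that cross-loading, which is what casts the loading-pattern question as Bayesian variable selection.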

  19. Determining Protein Complex Structures Based on a Bayesian Model of in Vivo Förster Resonance Energy Transfer (FRET) Data*

    PubMed Central

    Bonomi, Massimiliano; Pellarin, Riccardo; Kim, Seung Joong; Russel, Daniel; Sundin, Bryan A.; Riffle, Michael; Jaschob, Daniel; Ramsden, Richard; Davis, Trisha N.; Muller, Eric G. D.; Sali, Andrej

    2014-01-01

    The use of in vivo Förster resonance energy transfer (FRET) data to determine the molecular architecture of a protein complex in living cells is challenging due to data sparseness, sample heterogeneity, signal contributions from multiple donors and acceptors, unequal fluorophore brightness, photobleaching, flexibility of the linker connecting the fluorophore to the tagged protein, and spectral cross-talk. We addressed these challenges by using a Bayesian approach that produces the posterior probability of a model, given the input data. The posterior probability is defined as a function of the dependence of our FRET metric FRETR on a structure (forward model), a model of noise in the data, as well as prior information about the structure, relative populations of distinct states in the sample, forward model parameters, and data noise. The forward model was validated against kinetic Monte Carlo simulations and in vivo experimental data collected on nine systems of known structure. In addition, our Bayesian approach was validated by a benchmark of 16 protein complexes of known structure. Given the structures of each subunit of the complexes, models were computed from synthetic FRETR data with a distance root-mean-squared deviation error of 14 to 17 Å. The approach is implemented in the open-source Integrative Modeling Platform, allowing us to determine macromolecular structures through a combination of in vivo FRETR data and data from other sources, such as electron microscopy and chemical cross-linking. PMID:25139910

  20. Inference of reactive transport model parameters using a Bayesian multivariate approach

    NASA Astrophysics Data System (ADS)

    Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick

    2014-08-01

    Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)), where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels, whereas its influence on predictive uncertainty is negligible; (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method; and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
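
    The analytic step referred to above is, in its simplest scalar form, the standard normal-inverse-gamma integral (our notation; the paper works with the full residual covariance structure):

    ```latex
    p(r) \;=\; \int_0^\infty \mathcal{N}\!\left(r \mid 0,\ \sigma^2\right)
      \operatorname{Inv\text{-}Gamma}\!\left(\sigma^2 \mid a,\ b\right)\,\mathrm{d}\sigma^2
      \;=\; t_{2a}\!\left(r \,\middle|\, 0,\ \sqrt{b/a}\right),
    ```

    so residuals follow a Student-t marginal and the weights never have to be sampled explicitly.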

  1. Statistical estimation via convex optimization for trending and performance monitoring

    NASA Astrophysics Data System (ADS)

    Samar, Sikandar

    This thesis presents an optimization-based statistical estimation approach to find unknown trends in noisy data. A Bayesian framework is used to explicitly take into account prior information about the trends via trend models and constraints. The main focus is on convex formulation of the Bayesian estimation problem, which allows efficient computation of (globally) optimal estimates. There are two main parts of this thesis. The first part formulates trend estimation in systems described by known detailed models as a convex optimization problem. Statistically optimal estimates are then obtained by maximizing a concave log-likelihood function subject to convex constraints. We consider the problem of increasing problem dimension as more measurements become available, and introduce a moving horizon framework to enable recursive estimation of the unknown trend by solving a fixed-size convex optimization problem at each horizon. We also present a distributed estimation framework, based on the dual decomposition method, for a system formed by a network of complex sensors with local (convex) estimation. Two specific applications of the convex optimization-based Bayesian estimation approach are described in the second part of the thesis. Batch estimation for parametric diagnostics in a flight control simulation of a space launch vehicle is shown to detect incipient fault trends despite the natural masking properties of feedback in the guidance and control loops. The moving horizon approach is used to estimate time-varying fault parameters in a detailed nonlinear simulation model of an unmanned aerial vehicle. Excellent performance is demonstrated in the presence of winds and turbulence.
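
    A miniature instance of the "concave log-likelihood plus convex constraints" formulation, written with the cvxpy modeling library: a Gaussian likelihood with an l1 prior on second differences (favoring piecewise-linear trends) and a nonnegativity constraint. Data, weights, and the choice of prior are all illustrative.

    ```python
    # Convex MAP trend estimation: minimize the negative Gaussian log-likelihood
    # plus an l1 penalty on second differences, subject to a convex constraint.
    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    t = np.linspace(0, 1, n)
    y = np.maximum(0.0, 2 * t - 0.8) + 0.05 * rng.standard_normal(n)

    x = cp.Variable(n)
    objective = cp.Minimize(cp.sum_squares(y - x)              # -log-likelihood
                            + 5.0 * cp.norm1(cp.diff(x, 2)))   # trend-model prior
    constraints = [x >= 0]                                     # prior knowledge
    cp.Problem(objective, constraints).solve()
    trend = x.value                                            # globally optimal
    ```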

  2. Particle identification in ALICE: a Bayesian approach

    NASA Astrophysics Data System (ADS)

    Adam, J.; Adamová, D.; Aggarwal, M. M.; Aglieri Rinella, G.; Agnello, M.; Agrawal, N.; Ahammed, Z.; Ahmad, S.; Ahn, S. U.; Aiola, S.; Akindinov, A.; Alam, S. N.; Albuquerque, D. S. D.; Aleksandrov, D.; Alessandro, B.; Alexandre, D.; Alfaro Molina, R.; Alici, A.; Alkin, A.; Almaraz, J. R. M.; Alme, J.; Alt, T.; Altinpinar, S.; Altsybeev, I.; Alves Garcia Prado, C.; Andrei, C.; Andronic, A.; Anguelov, V.; Antičić, T.; Antinori, F.; Antonioli, P.; Aphecetche, L.; Appelshäuser, H.; Arcelli, S.; Arnaldi, R.; Arnold, O. W.; Arsene, I. C.; Arslandok, M.; Audurier, B.; Augustinus, A.; Averbeck, R.; Azmi, M. D.; Badalà, A.; Baek, Y. W.; Bagnasco, S.; Bailhache, R.; Bala, R.; Balasubramanian, S.; Baldisseri, A.; Baral, R. C.; Barbano, A. M.; Barbera, R.; Barile, F.; Barnaföldi, G. G.; Barnby, L. S.; Barret, V.; Bartalini, P.; Barth, K.; Bartke, J.; Bartsch, E.; Basile, M.; Bastid, N.; Basu, S.; Bathen, B.; Batigne, G.; Batista Camejo, A.; Batyunya, B.; Batzing, P. C.; Bearden, I. G.; Beck, H.; Bedda, C.; Behera, N. K.; Belikov, I.; Bellini, F.; Bello Martinez, H.; Bellwied, R.; Belmont, R.; Belmont-Moreno, E.; Belyaev, V.; Benacek, P.; Bencedi, G.; Beole, S.; Berceanu, I.; Bercuci, A.; Berdnikov, Y.; Berenyi, D.; Bertens, R. A.; Berzano, D.; Betev, L.; Bhasin, A.; Bhat, I. R.; Bhati, A. K.; Bhattacharjee, B.; Bhom, J.; Bianchi, L.; Bianchi, N.; Bianchin, C.; Bielčík, J.; Bielčíková, J.; Bilandzic, A.; Biro, G.; Biswas, R.; Biswas, S.; Bjelogrlic, S.; Blair, J. T.; Blau, D.; Blume, C.; Bock, F.; Bogdanov, A.; Bøggild, H.; Boldizsár, L.; Bombara, M.; Book, J.; Borel, H.; Borissov, A.; Borri, M.; Bossú, F.; Botta, E.; Bourjau, C.; Braun-Munzinger, P.; Bregant, M.; Breitner, T.; Broker, T. A.; Browning, T. A.; Broz, M.; Brucken, E. J.; Bruna, E.; Bruno, G. E.; Budnikov, D.; Buesching, H.; Bufalino, S.; Buncic, P.; Busch, O.; Buthelezi, Z.; Butt, J. B.; Buxton, J. T.; Cabala, J.; Caffarri, D.; Cai, X.; Caines, H.; Calero Diaz, L.; Caliva, A.; Calvo Villar, E.; Camerini, P.; Carena, F.; Carena, W.; Carnesecchi, F.; Castillo Castellanos, J.; Castro, A. J.; Casula, E. A. R.; Ceballos Sanchez, C.; Cepila, J.; Cerello, P.; Cerkala, J.; Chang, B.; Chapeland, S.; Chartier, M.; Charvet, J. L.; Chattopadhyay, S.; Chattopadhyay, S.; Chauvin, A.; Chelnokov, V.; Cherney, M.; Cheshkov, C.; Cheynis, B.; Chibante Barroso, V.; Chinellato, D. D.; Cho, S.; Chochula, P.; Choi, K.; Chojnacki, M.; Choudhury, S.; Christakoglou, P.; Christensen, C. H.; Christiansen, P.; Chujo, T.; Chung, S. U.; Cicalo, C.; Cifarelli, L.; Cindolo, F.; Cleymans, J.; Colamaria, F.; Colella, D.; Collu, A.; Colocci, M.; Conesa Balbastre, G.; Conesa del Valle, Z.; Connors, M. E.; Contreras, J. G.; Cormier, T. M.; Corrales Morales, Y.; Cortés Maldonado, I.; Cortese, P.; Cosentino, M. R.; Costa, F.; Crochet, P.; Cruz Albino, R.; Cuautle, E.; Cunqueiro, L.; Dahms, T.; Dainese, A.; Danisch, M. C.; Danu, A.; Das, D.; Das, I.; Das, S.; Dash, A.; Dash, S.; De, S.; De Caro, A.; de Cataldo, G.; de Conti, C.; de Cuveland, J.; De Falco, A.; De Gruttola, D.; De Marco, N.; De Pasquale, S.; Deisting, A.; Deloff, A.; Dénes, E.; Deplano, C.; Dhankher, P.; Di Bari, D.; Di Mauro, A.; Di Nezza, P.; Diaz Corchero, M. A.; Dietel, T.; Dillenseger, P.; Divià, R.; Djuvsland, Ø.; Dobrin, A.; Domenicis Gimenez, D.; Dönigus, B.; Dordic, O.; Drozhzhova, T.; Dubey, A. K.; Dubla, A.; Ducroux, L.; Dupieux, P.; Ehlers, R. 
J.; Elia, D.; Endress, E.; Engel, H.; Epple, E.; Erazmus, B.; Erdemir, I.; Erhardt, F.; Espagnon, B.; Estienne, M.; Esumi, S.; Eum, J.; Evans, D.; Evdokimov, S.; Eyyubova, G.; Fabbietti, L.; Fabris, D.; Faivre, J.; Fantoni, A.; Fasel, M.; Feldkamp, L.; Feliciello, A.; Feofilov, G.; Ferencei, J.; Fernández Téllez, A.; Ferreiro, E. G.; Ferretti, A.; Festanti, A.; Feuillard, V. J. G.; Figiel, J.; Figueredo, M. A. S.; Filchagin, S.; Finogeev, D.; Fionda, F. M.; Fiore, E. M.; Fleck, M. G.; Floris, M.; Foertsch, S.; Foka, P.; Fokin, S.; Fragiacomo, E.; Francescon, A.; Frankenfeld, U.; Fronze, G. G.; Fuchs, U.; Furget, C.; Furs, A.; Fusco Girard, M.; Gaardhøje, J. J.; Gagliardi, M.; Gago, A. M.; Gallio, M.; Gangadharan, D. R.; Ganoti, P.; Gao, C.; Garabatos, C.; Garcia-Solis, E.; Gargiulo, C.; Gasik, P.; Gauger, E. F.; Germain, M.; Gheata, A.; Gheata, M.; Ghosh, P.; Ghosh, S. K.; Gianotti, P.; Giubellino, P.; Giubilato, P.; Gladysz-Dziadus, E.; Glässel, P.; Goméz Coral, D. M.; Gomez Ramirez, A.; Gonzalez, A. S.; Gonzalez, V.; González-Zamora, P.; Gorbunov, S.; Görlich, L.; Gotovac, S.; Grabski, V.; Grachov, O. A.; Graczykowski, L. K.; Graham, K. L.; Grelli, A.; Grigoras, A.; Grigoras, C.; Grigoriev, V.; Grigoryan, A.; Grigoryan, S.; Grinyov, B.; Grion, N.; Gronefeld, J. M.; Grosse-Oetringhaus, J. F.; Grosso, R.; Guber, F.; Guernane, R.; Guerzoni, B.; Gulbrandsen, K.; Gunji, T.; Gupta, A.; Gupta, R.; Haake, R.; Haaland, Ø.; Hadjidakis, C.; Haiduc, M.; Hamagaki, H.; Hamar, G.; Hamon, J. C.; Harris, J. W.; Harton, A.; Hatzifotiadou, D.; Hayashi, S.; Heckel, S. T.; Hellbär, E.; Helstrup, H.; Herghelegiu, A.; Herrera Corral, G.; Hess, B. A.; Hetland, K. F.; Hillemanns, H.; Hippolyte, B.; Horak, D.; Hosokawa, R.; Hristov, P.; Humanic, T. J.; Hussain, N.; Hussain, T.; Hutter, D.; Hwang, D. S.; Ilkaev, R.; Inaba, M.; Incani, E.; Ippolitov, M.; Irfan, M.; Ivanov, M.; Ivanov, V.; Izucheev, V.; Jacazio, N.; Jacobs, P. M.; Jadhav, M. B.; Jadlovska, S.; Jadlovsky, J.; Jahnke, C.; Jakubowska, M. J.; Jang, H. J.; Janik, M. A.; Jayarathna, P. H. S. Y.; Jena, C.; Jena, S.; Jimenez Bustamante, R. T.; Jones, P. G.; Jusko, A.; Kalinak, P.; Kalweit, A.; Kamin, J.; Kang, J. H.; Kaplin, V.; Kar, S.; Karasu Uysal, A.; Karavichev, O.; Karavicheva, T.; Karayan, L.; Karpechev, E.; Kebschull, U.; Keidel, R.; Keijdener, D. L. D.; Keil, M.; Mohisin Khan, M.; Khan, P.; Khan, S. A.; Khanzadeev, A.; Kharlov, Y.; Kileng, B.; Kim, D. W.; Kim, D. J.; Kim, D.; Kim, H.; Kim, J. S.; Kim, M.; Kim, S.; Kim, T.; Kirsch, S.; Kisel, I.; Kiselev, S.; Kisiel, A.; Kiss, G.; Klay, J. L.; Klein, C.; Klein, J.; Klein-Bösing, C.; Klewin, S.; Kluge, A.; Knichel, M. L.; Knospe, A. G.; Kobdaj, C.; Kofarago, M.; Kollegger, T.; Kolojvari, A.; Kondratiev, V.; Kondratyeva, N.; Kondratyuk, E.; Konevskikh, A.; Kopcik, M.; Kostarakis, P.; Kour, M.; Kouzinopoulos, C.; Kovalenko, O.; Kovalenko, V.; Kowalski, M.; Koyithatta Meethaleveedu, G.; Králik, I.; Kravčáková, A.; Krivda, M.; Krizek, F.; Kryshen, E.; Krzewicki, M.; Kubera, A. M.; Kučera, V.; Kuhn, C.; Kuijer, P. G.; Kumar, A.; Kumar, J.; Kumar, L.; Kumar, S.; Kurashvili, P.; Kurepin, A.; Kurepin, A. B.; Kuryakin, A.; Kweon, M. J.; Kwon, Y.; La Pointe, S. L.; La Rocca, P.; Ladron de Guevara, P.; Lagana Fernandes, C.; Lakomov, I.; Langoy, R.; Lara, C.; Lardeux, A.; Lattuca, A.; Laudi, E.; Lea, R.; Leardini, L.; Lee, G. R.; Lee, S.; Lehas, F.; Lemmon, R. 
C.; Lenti, V.; Leogrande, E.; León Monzón, I.; León Vargas, H.; Leoncino, M.; Lévai, P.; Li, S.; Li, X.; Lien, J.; Lietava, R.; Lindal, S.; Lindenstruth, V.; Lippmann, C.; Lisa, M. A.; Ljunggren, H. M.; Lodato, D. F.; Loenne, P. I.; Loginov, V.; Loizides, C.; Lopez, X.; López Torres, E.; Lowe, A.; Luettig, P.; Lunardon, M.; Luparello, G.; Lutz, T. H.; Maevskaya, A.; Mager, M.; Mahajan, S.; Mahmood, S. M.; Maire, A.; Majka, R. D.; Malaev, M.; Maldonado Cervantes, I.; Malinina, L.; Mal'Kevich, D.; Malzacher, P.; Mamonov, A.; Manko, V.; Manso, F.; Manzari, V.; Marchisone, M.; Mareš, J.; Margagliotti, G. V.; Margotti, A.; Margutti, J.; Marín, A.; Markert, C.; Marquard, M.; Martin, N. A.; Martin Blanco, J.; Martinengo, P.; Martínez, M. I.; Martínez García, G.; Martinez Pedreira, M.; Mas, A.; Masciocchi, S.; Masera, M.; Masoni, A.; Mastroserio, A.; Matyja, A.; Mayer, C.; Mazer, J.; Mazzoni, M. A.; Mcdonald, D.; Meddi, F.; Melikyan, Y.; Menchaca-Rocha, A.; Meninno, E.; Mercado Pérez, J.; Meres, M.; Miake, Y.; Mieskolainen, M. M.; Mikhaylov, K.; Milano, L.; Milosevic, J.; Mischke, A.; Mishra, A. N.; Miśkowiec, D.; Mitra, J.; Mitu, C. M.; Mohammadi, N.; Mohanty, B.; Molnar, L.; Montaño Zetina, L.; Montes, E.; Moreira De Godoy, D. A.; Moreno, L. A. P.; Moretto, S.; Morreale, A.; Morsch, A.; Muccifora, V.; Mudnic, E.; Mühlheim, D.; Muhuri, S.; Mukherjee, M.; Mulligan, J. D.; Munhoz, M. G.; Munzer, R. H.; Murakami, H.; Murray, S.; Musa, L.; Musinsky, J.; Naik, B.; Nair, R.; Nandi, B. K.; Nania, R.; Nappi, E.; Naru, M. U.; Natal da Luz, H.; Nattrass, C.; Navarro, S. R.; Nayak, K.; Nayak, R.; Nayak, T. K.; Nazarenko, S.; Nedosekin, A.; Nellen, L.; Ng, F.; Nicassio, M.; Niculescu, M.; Niedziela, J.; Nielsen, B. S.; Nikolaev, S.; Nikulin, S.; Nikulin, V.; Noferini, F.; Nomokonov, P.; Nooren, G.; Noris, J. C. C.; Norman, J.; Nyanin, A.; Nystrand, J.; Oeschler, H.; Oh, S.; Oh, S. K.; Ohlson, A.; Okatan, A.; Okubo, T.; Olah, L.; Oleniacz, J.; Oliveira Da Silva, A. C.; Oliver, M. H.; Onderwaater, J.; Oppedisano, C.; Orava, R.; Oravec, M.; Ortiz Velasquez, A.; Oskarsson, A.; Otwinowski, J.; Oyama, K.; Ozdemir, M.; Pachmayer, Y.; Pagano, D.; Pagano, P.; Paić, G.; Pal, S. K.; Pan, J.; Pandey, A. K.; Papikyan, V.; Pappalardo, G. S.; Pareek, P.; Park, W. J.; Parmar, S.; Passfeld, A.; Paticchio, V.; Patra, R. N.; Paul, B.; Pei, H.; Peitzmann, T.; Pereira Da Costa, H.; Peresunko, D.; Pérez Lara, C. E.; Perez Lezama, E.; Peskov, V.; Pestov, Y.; Petráček, V.; Petrov, V.; Petrovici, M.; Petta, C.; Piano, S.; Pikna, M.; Pillot, P.; Pimentel, L. O. D. L.; Pinazza, O.; Pinsky, L.; Piyarathna, D. B.; Płoskoń, M.; Planinic, M.; Pluta, J.; Pochybova, S.; Podesta-Lerma, P. L. M.; Poghosyan, M. G.; Polichtchouk, B.; Poljak, N.; Poonsawat, W.; Pop, A.; Porteboeuf-Houssais, S.; Porter, J.; Pospisil, J.; Prasad, S. K.; Preghenella, R.; Prino, F.; Pruneau, C. A.; Pshenichnov, I.; Puccio, M.; Puddu, G.; Pujahari, P.; Punin, V.; Putschke, J.; Qvigstad, H.; Rachevski, A.; Raha, S.; Rajput, S.; Rak, J.; Rakotozafindrabe, A.; Ramello, L.; Rami, F.; Raniwala, R.; Raniwala, S.; Räsänen, S. S.; Rascanu, B. T.; Rathee, D.; Read, K. F.; Redlich, K.; Reed, R. J.; Rehman, A.; Reichelt, P.; Reidt, F.; Ren, X.; Renfordt, R.; Reolon, A. R.; Reshetin, A.; Reygers, K.; Riabov, V.; Ricci, R. 
A.; Richert, T.; Richter, M.; Riedler, P.; Riegler, W.; Riggi, F.; Ristea, C.; Rocco, E.; Rodríguez Cahuantzi, M.; Rodriguez Manso, A.; Røed, K.; Rogochaya, E.; Rohr, D.; Röhrich, D.; Ronchetti, F.; Ronflette, L.; Rosnet, P.; Rossi, A.; Roukoutakis, F.; Roy, A.; Roy, C.; Roy, P.; Rubio Montero, A. J.; Rui, R.; Russo, R.; Ryabinkin, E.; Ryabov, Y.; Rybicki, A.; Saarinen, S.; Sadhu, S.; Sadovsky, S.; Šafařík, K.; Sahlmuller, B.; Sahoo, P.; Sahoo, R.; Sahoo, S.; Sahu, P. K.; Saini, J.; Sakai, S.; Saleh, M. A.; Salzwedel, J.; Sambyal, S.; Samsonov, V.; Šándor, L.; Sandoval, A.; Sano, M.; Sarkar, D.; Sarkar, N.; Sarma, P.; Scapparone, E.; Scarlassara, F.; Schiaua, C.; Schicker, R.; Schmidt, C.; Schmidt, H. R.; Schuchmann, S.; Schukraft, J.; Schulc, M.; Schutz, Y.; Schwarz, K.; Schweda, K.; Scioli, G.; Scomparin, E.; Scott, R.; Šefčík, M.; Seger, J. E.; Sekiguchi, Y.; Sekihata, D.; Selyuzhenkov, I.; Senosi, K.; Senyukov, S.; Serradilla, E.; Sevcenco, A.; Shabanov, A.; Shabetai, A.; Shadura, O.; Shahoyan, R.; Shahzad, M. I.; Shangaraev, A.; Sharma, A.; Sharma, M.; Sharma, M.; Sharma, N.; Sheikh, A. I.; Shigaki, K.; Shou, Q.; Shtejer, K.; Sibiriak, Y.; Siddhanta, S.; Sielewicz, K. M.; Siemiarczuk, T.; Silvermyr, D.; Silvestre, C.; Simatovic, G.; Simonetti, G.; Singaraju, R.; Singh, R.; Singha, S.; Singhal, V.; Sinha, B. C.; Sinha, T.; Sitar, B.; Sitta, M.; Skaali, T. B.; Slupecki, M.; Smirnov, N.; Snellings, R. J. M.; Snellman, T. W.; Song, J.; Song, M.; Song, Z.; Soramel, F.; Sorensen, S.; Souza, R. D. de; Sozzi, F.; Spacek, M.; Spiriti, E.; Sputowska, I.; Spyropoulou-Stassinaki, M.; Stachel, J.; Stan, I.; Stankus, P.; Stenlund, E.; Steyn, G.; Stiller, J. H.; Stocco, D.; Strmen, P.; Suaide, A. A. P.; Sugitate, T.; Suire, C.; Suleymanov, M.; Suljic, M.; Sultanov, R.; Šumbera, M.; Sumowidagdo, S.; Szabo, A.; Szanto de Toledo, A.; Szarka, I.; Szczepankiewicz, A.; Szymanski, M.; Tabassam, U.; Takahashi, J.; Tambave, G. J.; Tanaka, N.; Tarhini, M.; Tariq, M.; Tarzila, M. G.; Tauro, A.; Tejeda Muñoz, G.; Telesca, A.; Terasaki, K.; Terrevoli, C.; Teyssier, B.; Thäder, J.; Thakur, D.; Thomas, D.; Tieulent, R.; Timmins, A. R.; Toia, A.; Trogolo, S.; Trombetta, G.; Trubnikov, V.; Trzaska, W. H.; Tsuji, T.; Tumkin, A.; Turrisi, R.; Tveter, T. S.; Ullaland, K.; Uras, A.; Usai, G. L.; Utrobicic, A.; Vala, M.; Valencia Palomo, L.; Vallero, S.; Van Der Maarel, J.; Van Hoorne, J. W.; van Leeuwen, M.; Vanat, T.; Vande Vyvre, P.; Varga, D.; Vargas, A.; Vargyas, M.; Varma, R.; Vasileiou, M.; Vasiliev, A.; Vauthier, A.; Vechernin, V.; Veen, A. M.; Veldhoen, M.; Velure, A.; Vercellin, E.; Vergara Limón, S.; Vernet, R.; Verweij, M.; Vickovic, L.; Viesti, G.; Viinikainen, J.; Vilakazi, Z.; Villalobos Baillie, O.; Villatoro Tello, A.; Vinogradov, A.; Vinogradov, L.; Vinogradov, Y.; Virgili, T.; Vislavicius, V.; Viyogi, Y. P.; Vodopyanov, A.; Völkl, M. A.; Voloshin, K.; Voloshin, S. A.; Volpe, G.; von Haller, B.; Vorobyev, I.; Vranic, D.; Vrláková, J.; Vulpescu, B.; Wagner, B.; Wagner, J.; Wang, H.; Wang, M.; Watanabe, D.; Watanabe, Y.; Weber, M.; Weber, S. G.; Weiser, D. F.; Wessels, J. P.; Westerhoff, U.; Whitehead, A. M.; Wiechula, J.; Wikne, J.; Wilk, G.; Wilkinson, J.; Williams, M. C. S.; Windelband, B.; Winn, M.; Yang, H.; Yang, P.; Yano, S.; Yasin, Z.; Yin, Z.; Yokoyama, H.; Yoo, I.-K.; Yoon, J. H.; Yurchenko, V.; Yushmanov, I.; Zaborowska, A.; Zaccolo, V.; Zaman, A.; Zampolli, C.; Zanoli, H. J. C.; Zaporozhets, S.; Zardoshti, N.; Zarochentsev, A.; Závada, P.; Zaviyalov, N.; Zbroszczyk, H.; Zgura, I. 
S.; Zhalov, M.; Zhang, H.; Zhang, X.; Zhang, Y.; Zhang, C.; Zhang, Z.; Zhao, C.; Zhigareva, N.; Zhou, D.; Zhou, Y.; Zhou, Z.; Zhu, H.; Zhu, J.; Zichichi, A.; Zimmermann, A.; Zimmermann, M. B.; Zinovjev, G.; Zyzak, M.

    2016-05-01

    We present a Bayesian approach to particle identification (PID) within the ALICE experiment. The aim is to more effectively combine the particle identification capabilities of its various detectors. After a brief explanation of the adopted methodology and formalism, the performance of the Bayesian PID approach for charged pions, kaons and protons in the central barrel of ALICE is studied. PID is performed via measurements of specific energy loss (dE/dx) and time of flight. PID efficiencies and misidentification probabilities are extracted and compared with Monte Carlo simulations using high-purity samples of identified particles in the decay channels K0S → π-π+, φ → K-K+, and Λ → pπ- in p-Pb collisions at √sNN = 5.02 TeV. In order to thoroughly assess the validity of the Bayesian approach, this methodology was used to obtain corrected pT spectra of pions, kaons, protons, and D0 mesons in pp collisions at √s = 7 TeV. In all cases, the results using Bayesian PID were found to be consistent with previous measurements performed by ALICE using a standard PID approach. For the measurement of D0 → K-π+, it was found that a Bayesian PID approach gave a higher signal-to-background ratio and a similar or larger statistical significance when compared with standard PID selections, despite a reduced identification efficiency. Finally, we present an exploratory study of the measurement of Λc+ → p K-π+ in pp collisions at √s = 7 TeV, using the Bayesian approach for the identification of its decay products.
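
    The combination rule itself is just Bayes' theorem over species hypotheses, with detector responses entering as likelihoods that multiply across detectors. A sketch with invented response models and numbers (the experiment's actual likelihoods come from detector calibration, not hand-picked Gaussians):

    ```python
    # Bayesian PID combination over species hypotheses; all numbers illustrative.
    import numpy as np

    species = ["pion", "kaon", "proton"]
    priors = np.array([0.7, 0.2, 0.1])           # expected abundances (made up)

    def likelihood(signal, expected, sigma):
        """Gaussian detector-response model for one measured signal."""
        return np.exp(-0.5 * ((signal - expected) / sigma) ** 2) / (
            sigma * np.sqrt(2 * np.pi))

    # measured dE/dx and time-of-flight for one track (arbitrary units)
    dedx_meas, tof_meas = 1.9, 3.72
    dedx_expect = np.array([2.0, 1.6, 1.1])      # expected dE/dx per species
    tof_expect = np.array([3.70, 3.85, 4.10])    # expected TOF per species

    # detectors treated as independent: likelihoods multiply
    L = (likelihood(dedx_meas, dedx_expect, 0.15)
         * likelihood(tof_meas, tof_expect, 0.08))
    posterior = priors * L / np.sum(priors * L)  # Bayes' theorem over species

    for s, p in zip(species, posterior):
        print(f"P({s} | signals) = {p:.3f}")
    ```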

  3. Π4U: A high performance computing framework for Bayesian uncertainty quantification of complex models

    NASA Astrophysics Data System (ADS)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.

    2015-03-01

    We present Π4U, an extensible framework for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.
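
    A compact, single-threaded sketch of the TMCMC idea: temper the likelihood from prior (β = 0) to posterior (β = 1) through reweighting, resampling, and one Metropolis move per sample at each stage. The fixed β schedule and toy model are illustrative; Π4U adapts the schedule and distributes the likelihood evaluations across a cluster.

    ```python
    # Minimal transitional MCMC (TMCMC); toy likelihood and uniform prior.
    import numpy as np

    rng = np.random.default_rng(0)
    dim, n = 2, 1000

    def log_like(t):
        return -0.5 * np.sum((t - 1.0) ** 2) / 0.1

    theta = rng.uniform(-3, 3, size=(n, dim))     # draws from the uniform prior
    logL = np.array([log_like(t) for t in theta])
    beta = 0.0
    while beta < 1.0:
        dbeta = min(0.2, 1.0 - beta)              # fixed schedule for simplicity
        beta += dbeta
        w = np.exp(dbeta * (logL - logL.max()))
        w /= w.sum()
        idx = rng.choice(n, size=n, p=w)          # resample by tempered weights
        theta, logL = theta[idx], logL[idx]
        cov = 0.04 * np.atleast_2d(np.cov(theta.T)) + 1e-9 * np.eye(dim)
        for i in range(n):                        # one MCMC move per sample
            prop = rng.multivariate_normal(theta[i], cov)
            if np.any(np.abs(prop) > 3.0):        # outside uniform prior support
                continue
            lp = log_like(prop)
            if np.log(rng.uniform()) < beta * (lp - logL[i]):
                theta[i], logL[i] = prop, lp
    ```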

  4. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation.

    PubMed

    Technow, Frank; Messina, Carlos D; Totir, L Radu; Cooper, Mark

    2015-01-01

    Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.
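
    Stripped of the crop-modeling detail, the ABC core is a rejection sampler: draw parameters from the prior, simulate, and keep draws whose simulated output falls within a tolerance of the observation. A toy sketch (the simulator below is a stand-in for a CGM driven by marker effects):

    ```python
    # Rejection ABC in miniature; simulator, prior and tolerance illustrative.
    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(theta):
        # stand-in for running the crop growth model at parameters theta
        return theta * 10.0 + rng.standard_normal()

    observed = 7.3
    tolerance = 0.5
    accepted = []
    while len(accepted) < 1000:
        theta = rng.normal(0.0, 1.0)             # draw from the prior
        if abs(simulate(theta) - observed) < tolerance:
            accepted.append(theta)               # keep: close enough to data
    posterior_mean = np.mean(accepted)           # ABC posterior summary
    ```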

  5. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation

    PubMed Central

    Technow, Frank; Messina, Carlos D.; Totir, L. Radu; Cooper, Mark

    2015-01-01

    Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics. PMID:26121133

  6. A Bayesian method for assessing multiscale species-habitat relationships

    USGS Publications Warehouse

    Stuber, Erica F.; Gruber, Lutz F.; Fontaine, Joseph J.

    2017-01-01

    Context: Scientists face several theoretical and methodological challenges in appropriately describing fundamental wildlife-habitat relationships in models. The spatial scales of habitat relationships are often unknown, and are expected to follow a multi-scale hierarchy. Typical frequentist or information theoretic approaches often suffer under collinearity in multi-scale studies, fail to converge when models are complex or represent an intractable computational burden when candidate model sets are large. Objectives: Our objective was to implement an automated, Bayesian method for inference on the spatial scales of habitat variables that best predict animal abundance. Methods: We introduce Bayesian latent indicator scale selection (BLISS), a Bayesian method to select spatial scales of predictors using latent scale indicator variables that are estimated with reversible-jump Markov chain Monte Carlo sampling. BLISS does not suffer from collinearity, and substantially reduces computation time of studies. We present a simulation study to validate our method and apply our method to a case study of land cover predictors for ring-necked pheasant (Phasianus colchicus) abundance in Nebraska, USA. Results: Our method returns accurate descriptions of the explanatory power of multiple spatial scales, and unbiased and precise parameter estimates under commonly encountered data limitations including spatial scale autocorrelation, effect size, and sample size. BLISS outperforms commonly used model selection methods including stepwise and AIC, and reduces runtime by 90%. Conclusions: Given the pervasiveness of scale-dependency in ecology, and the implications of mismatches between the scales of analyses and ecological processes, identifying the spatial scales over which species are integrating habitat information is an important step in understanding species-habitat relationships. BLISS is a widely applicable method for identifying important spatial scales, propagating scale uncertainty, and testing hypotheses of scaling relationships.
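
    A toy analogue of the scale-selection idea, with the latent scale indicator handled by brute-force enumeration in a conjugate linear model rather than by reversible-jump MCMC (candidate radii and data are invented for illustration):

    ```python
    # Posterior over a discrete spatial-scale indicator via model evidence.
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(0)
    n, scales = 150, [100, 500, 1000]        # candidate buffer radii (m), made up

    # one habitat covariate measured at each candidate scale; 500 m is "true"
    X = {s: rng.standard_normal(n) for s in scales}
    y = 1.5 * X[500] + rng.standard_normal(n)

    def log_evidence(x, y, prior_var=10.0, noise_var=1.0):
        # marginal likelihood of y = b*x + e with b ~ N(0, prior_var)
        cov = prior_var * np.outer(x, x) + noise_var * np.eye(len(y))
        return multivariate_normal(mean=np.zeros(len(y)), cov=cov).logpdf(y)

    logp = np.array([log_evidence(X[s], y) for s in scales])
    post = np.exp(logp - logp.max())
    post /= post.sum()
    print(dict(zip(scales, post.round(3))))  # mass concentrates on 500 m
    ```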

  7. Specialist and generalist symbionts show counterintuitive levels of genetic diversity and discordant demographic histories along the Florida Reef Tract

    NASA Astrophysics Data System (ADS)

    Titus, Benjamin M.; Daly, Marymegan

    2017-03-01

    Specialist and generalist life histories are expected to result in contrasting levels of genetic diversity at the population level, and symbioses are expected to lead to patterns that reflect a shared biogeographic history and co-diversification. We test these assumptions using mtDNA sequencing and a comparative phylogeographic approach for six co-occurring crustacean species that are symbiotic with sea anemones on western Atlantic coral reefs, yet vary in their host specificities: four are host specialists and two are host generalists. We first conducted species discovery analyses to delimit cryptic lineages, followed by classic population genetic diversity analyses for each delimited taxon, and then reconstructed the demographic history for each taxon using traditional summary statistics, Bayesian skyline plots, and approximate Bayesian computation to test for signatures of recent and concerted population expansion. The genetic diversity values recovered here contravene the expectations of the specialist-generalist variation hypothesis and classic population genetics theory; all specialist lineages had greater genetic diversity than generalists. Demography suggests recent population expansions in all taxa, although Bayesian skyline plots and approximate Bayesian computation suggest the timing and magnitude of these events were idiosyncratic. These results do not meet the a priori expectation of concordance among symbiotic taxa and suggest that intrinsic aspects of species biology may contribute more to phylogeographic history than extrinsic forces that shape whole communities. The recovery of two cryptic specialist lineages adds an additional layer of biodiversity to this symbiosis and contributes to an emerging pattern of cryptic speciation in the specialist taxa. Our results underscore the differences in the evolutionary processes acting on marine systems from the terrestrial processes that often drive theory. Finally, we continue to highlight the Florida Reef Tract as an important biodiversity hotspot.

  8. Bayesian neural adjustment of inhibitory control predicts emergence of problem stimulant use.

    PubMed

    Harlé, Katia M; Stewart, Jennifer L; Zhang, Shunan; Tapert, Susan F; Yu, Angela J; Paulus, Martin P

    2015-11-01

    Bayesian ideal observer models quantify individuals' context- and experience-dependent beliefs and expectations about their environment, which provides a powerful approach (i) to link basic behavioural mechanisms to neural processing; and (ii) to generate clinical predictors for patient populations. Here, we focus on (ii) and determine whether individual differences in the neural representation of the need to stop in an inhibitory task can predict the development of problem use (i.e. abuse or dependence) in individuals experimenting with stimulants. One hundred and fifty-seven non-dependent occasional stimulant users, aged 18-24, completed a stop-signal task while undergoing functional magnetic resonance imaging. These individuals were prospectively followed for 3 years and evaluated for stimulant use and abuse/dependence symptoms. At follow-up, 38 occasional stimulant users met criteria for a stimulant use disorder (problem stimulant users), while 50 had discontinued use (desisted stimulant users). We found that those individuals who showed greater neural responses associated with Bayesian prediction errors, i.e. the difference between actual and expected need to stop on a given trial, in right medial prefrontal cortex/anterior cingulate cortex, caudate, anterior insula, and thalamus were more likely to exhibit problem use 3 years later. Importantly, these computationally based neural predictors outperformed clinical measures and non-model based neural variables in predicting clinical status. In conclusion, young adults who show exaggerated brain processing underlying whether to 'stop' or to 'go' are more likely to develop stimulant abuse. Thus, Bayesian cognitive models provide both a computational explanation and potential predictive biomarkers of belief processing deficits in individuals at risk for stimulant addiction.
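
    The model-derived regressor at the heart of the analysis is an expected "need to stop" updated trial by trial. A toy Beta-Bernoulli version of that quantity (the study's ideal observer is more elaborate, with trial-history dependence):

    ```python
    # Trial-wise belief P(stop) and unsigned Bayesian prediction error.
    import numpy as np

    rng = np.random.default_rng(0)
    trials = rng.uniform(size=300) < 0.25     # True = stop trial (25% base rate)
    a, b = 1.0, 1.0                           # uniform Beta prior on P(stop)
    prediction_errors = []
    for is_stop in trials:
        expected = a / (a + b)                # current belief: expected P(stop)
        prediction_errors.append(abs(float(is_stop) - expected))
        a += is_stop                          # Bayesian update of the belief
        b += not is_stop
    ```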

  9. Massive parallelization of serial inference algorithms for a complex generalized linear model

    PubMed Central

    Suchard, Marc A.; Simpson, Shawn E.; Zorych, Ivan; Ryan, Patrick; Madigan, David

    2014-01-01

    Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety. PMID:25328363
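
    The algorithmic core being parallelized is cyclic coordinate descent, where each coordinate has a cheap closed-form update. A serial sketch for a ridge-penalized linear model (the paper's setting is a conditioned logistic model on sparse data, with the per-coordinate work mapped onto GPU threads):

    ```python
    # Cyclic coordinate descent for 0.5*||y - Xw||^2 + 0.5*lam*||w||^2.
    import numpy as np

    rng = np.random.default_rng(0)
    n, p, lam = 1000, 50, 1.0
    X = rng.standard_normal((n, p))
    y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

    w = np.zeros(p)
    r = y - X @ w                              # residual kept in sync with w
    for _ in range(20):                        # sweeps over all coordinates
        for j in range(p):
            xj = X[:, j]
            rho = xj @ r + (xj @ xj) * w[j]    # partial residual correlation
            w_new = rho / (xj @ xj + lam)      # closed-form coordinate minimizer
            r += xj * (w[j] - w_new)           # cheap residual update
            w[j] = w_new
    ```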

  10. The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective.

    PubMed

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
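
    As a minimal example of "estimation with quantified uncertainty" in the Bayesian sense: the posterior for a rate under a Beta-Binomial model, with its credible interval read off directly (counts illustrative):

    ```python
    # Posterior and 95% credible interval for a rate; uniform Beta(1,1) prior.
    from scipy import stats

    successes, n = 27, 80
    posterior = stats.beta(1 + successes, 1 + n - successes)
    print("posterior mean:", posterior.mean())
    print("95% credible interval:", posterior.interval(0.95))
    ```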

  11. Simultaneous Force Regression and Movement Classification of Fingers via Surface EMG within a Unified Bayesian Framework.

    PubMed

    Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer

    2018-01-01

    This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
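
    Schematically, the mixture-of-experts likelihood couples the two tasks: expert components regress force while the gating probabilities classify the movement (our notation, not the paper's):

    ```latex
    p(y \mid \mathbf{x}) \;=\; \sum_{k=1}^{K}
      \underbrace{g_k(\mathbf{x})}_{\text{gate}}\;
      \mathcal{N}\!\left(y \mid \mathbf{w}_k^{\top}\mathbf{x},\ \sigma_k^2\right),
    \qquad
    g_k(\mathbf{x}) \;=\;
      \frac{\exp\!\left(\mathbf{v}_k^{\top}\mathbf{x}\right)}
           {\sum_{j}\exp\!\left(\mathbf{v}_j^{\top}\mathbf{x}\right)}.
    ```

    The gate responsibilities over the K components provide the movement classification as a by-product, which is why no extra computation is needed for that step.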

  12. Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background

    NASA Astrophysics Data System (ADS)

    McEwen, J. D.; Feeney, S. M.; Peiris, H. V.; Wiaux, Y.; Ringeval, C.; Bouchet, F. R.

    2017-12-01

    Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high-energy scales. We develop a framework for cosmic string inference from observations of the CMB made over the celestial sphere, performing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension Gμ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations, we demonstrate the application of our framework and evaluate its performance. The method is sensitive to Gμ ∼ 5 × 10⁻⁷ for Nambu-Goto string simulations that include an integrated Sachs-Wolfe contribution only and do not include any recombination effects, before any parameters of the analysis are optimized. The sensitivity of the method compares favourably with other techniques applied to the same simulations.

  13. Bayesian Design of Superiority Clinical Trials for Recurrent Events Data with Applications to Bleeding and Transfusion Events in Myelodysplastic Syndrome

    PubMed Central

    Chen, Ming-Hui; Zeng, Donglin; Hu, Kuolung; Jia, Catherine

    2014-01-01

    Summary: In many biomedical studies, patients may experience the same type of recurrent event repeatedly over time, such as bleeding, multiple infections and disease. In this article, we propose a Bayesian design for a pivotal clinical trial in which lower risk myelodysplastic syndromes (MDS) patients are treated with MDS disease modifying therapies. One of the key study objectives is to demonstrate the investigational product (treatment) effect on reduction of platelet transfusion and bleeding events while receiving MDS therapies. In this context, we propose a new Bayesian approach for the design of superiority clinical trials using recurrent events frailty regression models. Historical recurrent events data from an already completed phase 2 trial are incorporated into the Bayesian design via the partial borrowing power prior of Ibrahim et al. (2012, Biometrics 68, 578–586). An efficient Gibbs sampling algorithm, a predictive data generation algorithm, and a simulation-based algorithm are developed for sampling from the fitting posterior distribution, generating the predictive recurrent events data, and computing various design quantities such as the type I error rate and power, respectively. An extensive simulation study is conducted to compare the proposed method to the existing frequentist methods and to investigate various operating characteristics of the proposed design. PMID:25041037
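
    The borrowing mechanism is the power prior, which raises the historical-data likelihood to a discounting exponent (our notation; roughly speaking, the partial borrowing variant applies the discount only to parameters shared with the historical trial):

    ```latex
    \pi(\theta \mid D_0, a_0) \;\propto\; L(\theta \mid D_0)^{\,a_0}\,\pi_0(\theta),
    \qquad 0 \le a_0 \le 1,
    ```

    so that a_0 = 0 discards the completed phase 2 data entirely and a_0 = 1 pools it fully with the new trial.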

  14. Receptive Field Inference with Localized Priors

    PubMed Central

    Park, Mijung; Pillow, Jonathan W.

    2011-01-01

    The linear receptive field describes a mapping from sensory stimuli to a one-dimensional variable governing a neuron's spike response. However, traditional receptive field estimators such as the spike-triggered average converge slowly and often require large amounts of data. Bayesian methods seek to overcome this problem by biasing estimates towards solutions that are more likely a priori, typically those with small, smooth, or sparse coefficients. Here we introduce a novel Bayesian receptive field estimator designed to incorporate locality, a powerful form of prior information about receptive field structure. The key to our approach is a hierarchical receptive field model that flexibly adapts to localized structure in both spacetime and spatiotemporal frequency, using an inference method known as empirical Bayes. We refer to our method as automatic locality determination (ALD), and show that it can accurately recover various types of smooth, sparse, and localized receptive fields. We apply ALD to neural data from retinal ganglion cells and V1 simple cells, and find it achieves error rates several times lower than standard estimators. Thus, estimates of comparable accuracy can be achieved with substantially less data. Finally, we introduce a computationally efficient Markov Chain Monte Carlo (MCMC) algorithm for fully Bayesian inference under the ALD prior, yielding accurate Bayesian confidence intervals for small or noisy datasets. PMID:22046110
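
    Empirical Bayes in miniature: for a linear-Gaussian encoding model the evidence is available in closed form, so prior hyperparameters can be fit by maximizing it. The sketch below does this for a single ridge variance; ALD instead parameterizes the prior covariance with locality hyperparameters. Sizes and values are illustrative.

    ```python
    # Evidence maximization over a ridge prior variance, then the MAP filter.
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(0)
    n, p, noise = 200, 30, 0.5
    X = rng.standard_normal((n, p))
    w_true = np.exp(-0.5 * ((np.arange(p) - 15) / 3.0) ** 2)  # localized filter
    y = X @ w_true + noise * rng.standard_normal(n)

    def log_evidence(alpha):
        # marginal likelihood of y = Xw + e with w ~ N(0, alpha*I)
        cov = alpha * (X @ X.T) + noise ** 2 * np.eye(n)
        return multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

    alphas = np.logspace(-3, 1, 40)
    best = alphas[np.argmax([log_evidence(a) for a in alphas])]

    # MAP receptive field under the selected prior (ridge solution)
    w_map = np.linalg.solve(X.T @ X + (noise ** 2 / best) * np.eye(p), X.T @ y)
    ```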

  15. Comparison between the basic least squares and the Bayesian approach for elastic constants identification

    NASA Astrophysics Data System (ADS)

    Gogu, C.; Haftka, R.; LeRiche, R.; Molimard, J.; Vautrin, A.; Sankar, B.

    2008-11-01

    The basic formulation of the least squares method, based on the L2 norm of the misfit, is still widely used today for identifying elastic material properties from experimental data. An alternative statistical approach is the Bayesian method. We seek here situations with significant differences between the material properties found by the two methods. For a simple three bar truss example we illustrate three such situations in which the Bayesian approach leads to more accurate results: different magnitude of the measurements, different uncertainty in the measurements and correlation among measurements. When all three effects add up, the Bayesian approach can have a large advantage. We then compared the two methods for identification of elastic constants from plate vibration natural frequencies.
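
    One of the three situations in miniature: with correlated measurement errors, ordinary least squares ignores the correlation while the Bayesian posterior mean (Gaussian prior, correlated Gaussian noise) accounts for it. The linearized model and numbers below are invented for illustration:

    ```python
    # Basic least squares vs Bayesian posterior mean under correlated noise.
    import numpy as np

    rng = np.random.default_rng(0)
    G = rng.standard_normal((30, 2))         # linearized model, 2 elastic constants
    theta_true = np.array([70.0, 0.3])
    Sigma = 0.04 * (0.6 * np.ones((30, 30)) + 0.4 * np.eye(30))  # correlated errors
    y = G @ theta_true + rng.multivariate_normal(np.zeros(30), Sigma)

    theta_ls = np.linalg.lstsq(G, y, rcond=None)[0]   # basic least squares

    P0 = np.diag([100.0 ** 2, 1.0 ** 2])     # broad zero-mean Gaussian prior
    Sinv = np.linalg.inv(Sigma)
    post_cov = np.linalg.inv(G.T @ Sinv @ G + np.linalg.inv(P0))
    theta_bayes = post_cov @ (G.T @ Sinv @ y)  # posterior mean uses correlations
    ```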

  16. Efficient reconstruction method for ground layer adaptive optics with mixed natural and laser guide stars.

    PubMed

    Wagner, Roland; Helin, Tapio; Obereder, Andreas; Ramlau, Ronny

    2016-02-20

    The imaging quality of modern ground-based telescopes such as the planned European Extremely Large Telescope is affected by atmospheric turbulence. In consequence, they heavily depend on stable and high-performance adaptive optics (AO) systems. Using measurements of incoming light from guide stars, an AO system compensates for the effects of turbulence by adjusting so-called deformable mirror(s) (DMs) in real time. In this paper, we introduce a novel reconstruction method for ground layer adaptive optics. In the literature, a common approach to this problem is to use Bayesian inference in order to model the specific noise structure appearing due to spot elongation. This approach leads to large coupled systems with high computational effort. Recently, fast solvers of linear order, i.e., with computational complexity O(n), where n is the number of DM actuators, have emerged. However, the quality of such methods typically degrades in low flux conditions. Our key contribution is to achieve the high quality of the standard Bayesian approach while at the same time maintaining the linear order speed of the recent solvers. Our method is based on performing a separate preprocessing step before applying the cumulative reconstructor (CuReD). The efficiency and performance of the new reconstructor are demonstrated using the OCTOPUS, the official end-to-end simulation environment of the ESO for extremely large telescopes. For more specific simulations we also use the MOST toolbox.

  17. Nonlinear and non-Gaussian Bayesian based handwriting beautification

    NASA Astrophysics Data System (ADS)

    Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua

    2013-03-01

    A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, the format and size of the handwriting image are first normalized, and a typeface from the computer system is then applied to optimize the visual effect of the handwriting. Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters to translate, rotate and scale the typeface are controlled by the state equation, and the matching optimization between the handwriting and the transformed typeface is governed by the measurement equation. Finally, the new typeface, which is transformed from the original one and achieves the best match under the nonlinear and non-Gaussian optimization, is the beautification result of the handwriting. Experimental results demonstrate that the proposed framework provides a creative handwriting beautification methodology to improve visual acceptance.
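
    For state estimation in a nonlinear, non-Gaussian dynamic model of this kind, a bootstrap particle filter is the standard tool; the paper does not state which sampler it uses, so the sketch below is a generic 1-D illustration of the state-equation/measurement-equation structure it describes:

    ```python
    # Bootstrap particle filter for a nonlinear, non-Gaussian state space model.
    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):                        # state transition (nonlinear)
        return 0.9 * x + 0.2 * np.sin(x)

    def obs_loglike(y, x):           # heavy-tailed (non-Gaussian) measurement
        return -np.log(1.0 + (y - x) ** 2)

    n_particles, T = 500, 60
    true_x, ys = 0.0, []
    for _ in range(T):               # simulate synthetic observations
        true_x = f(true_x) + 0.3 * rng.standard_normal()
        ys.append(true_x + 0.3 * rng.standard_normal())

    particles = rng.standard_normal(n_particles)
    estimates = []
    for y in ys:
        particles = f(particles) + 0.3 * rng.standard_normal(n_particles)
        logw = obs_loglike(y, particles)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        estimates.append(np.sum(w * particles))   # filtered state estimate
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
    ```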

  18. A High Performance Computing Study of a Scalable FISST-Based Approach to Multi-Target, Multi-Sensor Tracking

    NASA Astrophysics Data System (ADS)

    Hussein, I.; Wilkins, M.; Roscoe, C.; Faber, W.; Chakravorty, S.; Schumacher, P.

    2016-09-01

    Finite Set Statistics (FISST) is a rigorous Bayesian multi-hypothesis management tool for the joint detection, classification and tracking of multi-sensor, multi-object systems. Implicit within the approach are solutions to the data association and target label-tracking problems. The full FISST filtering equations, however, are intractable. While FISST-based methods such as the PHD and CPHD filters are tractable, they require heavy moment approximations to the full FISST equations that result in a significant loss of information contained in the collected data. In this paper, we review Smart Sampling Markov Chain Monte Carlo (SSMCMC) that enables FISST to be tractable while avoiding moment approximations. We study the effect of tuning key SSMCMC parameters on tracking quality and computation time. The study is performed on a representative space object catalog with varying numbers of RSOs. The solution is implemented in the Scala computing language at the Maui High Performance Computing Center (MHPCC) facility.

  19. Modeling the Space Debris Environment with MASTER-2009 and ORDEM2010

    NASA Technical Reports Server (NTRS)

    Flegel, S.; Gelhaus, J.; Wiedemann, C.; Mockel, M.; Vorsmann, P.; Krisko, P.; Xu, Y. -L.; Horstman, M. F.; Opiela, J. N.; Matney, M.

    2010-01-01

    Spacecraft analysis using ORDEM2010 uses a high-fidelity population model to compute risk to on-orbit assets. The ORDEM2010 GUI allows visualization of spacecraft flux in 2-D and 1-D. The population was produced using a Bayesian statistical approach with measured and modeled environment data. Validation for sizes < 1 mm was performed using Shuttle window and radiator impact measurements. Validation for sizes > 1 mm is ongoing.

  20. Bayesian networks of age estimation and classification based on dental evidence: A study on the third molar mineralization.

    PubMed

    Sironi, Emanuele; Pinchi, Vilma; Pradella, Francesco; Focardi, Martina; Bozza, Silvia; Taroni, Franco

    2018-04-01

    Not only does the Bayesian approach offer a rational and logical environment for evidence evaluation in a forensic framework, but it also allows scientists to coherently deal with uncertainty related to a collection of multiple items of evidence, due to its flexible nature. Such flexibility might come at the expense of elevated computational complexity, which can be handled by using specific probabilistic graphical tools, namely Bayesian networks. In the current work, such probabilistic tools are used for evaluating dental evidence related to the development of third molars. A set of relevant properties characterizing the graphical models is discussed, and Bayesian networks are implemented to deal with the inferential process underlying the estimation procedure, as well as to provide age estimates. Such properties include operationality, flexibility, coherence, transparency and sensitivity. A data sample composed of Italian subjects was employed for the analysis; results were in agreement with previous studies in terms of point estimate and age classification. The influence of the prior probability elicitation on the Bayesian estimates and classifications was also analyzed. Findings also supported the opportunity to take into consideration multiple teeth in the evaluative procedure, since this can be shown to result in an increased robustness towards the prior probability elicitation process, as well as in more favorable outcomes from a forensic perspective.
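
    A two-node fragment of such a network, with all probabilities invented for illustration (they are not values from the study): a prior over age classes combined with stage-given-age likelihoods yields the posterior used for classification.

    ```python
    # Discrete Bayesian update: age class given an observed third-molar stage.
    import numpy as np

    age_classes = ["<18", ">=18"]
    prior = np.array([0.5, 0.5])             # prior elicitation matters (see text)
    # rows: age class; columns: two of several molar stages, "D" and "H"
    likelihood = np.array([[0.70, 0.05],     # P(stage | <18), illustrative
                           [0.30, 0.60]])    # P(stage | >=18), illustrative

    stage = 1                                # observed: stage "H" (fully mature)
    post = prior * likelihood[:, stage]
    post /= post.sum()
    print(dict(zip(age_classes, post.round(3))))
    ```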

  1. A Bayesian approach to reliability and confidence

    NASA Technical Reports Server (NTRS)

    Barnes, Ron

    1989-01-01

    The historical evolution of NASA's interest in quantitative measures of reliability assessment is outlined. The introduction of some quantitative methodologies into the Vehicle Reliability Branch of the Safety, Reliability and Quality Assurance (SR and QA) Division at Johnson Space Center (JSC) was noted along with the development of the Extended Orbiter Duration--Weakest Link study which will utilize quantitative tools for a Bayesian statistical analysis. Extending the earlier work of NASA sponsor, Richard Heydorn, researchers were able to produce a consistent Bayesian estimate for the reliability of a component and hence by a simple extension for a system of components in some cases where the rate of failure is not constant but varies over time. Mechanical systems in general have this property since the reliability usually decreases markedly as the parts degrade over time. While they have been able to reduce the Bayesian estimator to a simple closed form for a large class of such systems, the form for the most general case needs to be attacked by the computer. Once a table is generated for this form, researchers will have a numerical form for the general solution. With this, the corresponding probability statements about the reliability of a system can be made in the most general setting. Note that the utilization of uniform Bayesian priors represents a worst case scenario in the sense that as researchers incorporate more expert opinion into the model, they will be able to improve the strength of the probability calculations.
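
    The uniform-prior ("worst case") update mentioned above is conjugate and essentially one line long: with s successes in n demonstrations, the posterior on component reliability R is Beta(s + 1, n − s + 1), from which the probability statements follow (counts illustrative):

    ```python
    # Reliability posterior under a uniform Beta(1,1) prior.
    from scipy import stats

    n, s = 50, 48                                 # demonstrations, successes
    posterior = stats.beta(s + 1, n - s + 1)
    print("P(R > 0.90 | data) =", posterior.sf(0.90))
    print("posterior mean reliability:", posterior.mean())
    ```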

  2. Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model

    NASA Astrophysics Data System (ADS)

    Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.

    2014-02-01

    Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
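
    The mechanism in miniature: at each proposed parameter value, run the stochastic simulator several times, fit a parametric (here Gaussian) distribution to the simulated output, and use its density at the observation as the likelihood inside a standard Metropolis step. The toy simulator below stands in for FORMIND, and the re-simulation at every proposal makes the acceptance rule noisy, in a pseudo-marginal spirit:

    ```python
    # Parametric (synthetic) likelihood approximation inside Metropolis MCMC.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    def simulator(theta, n_rep=20):
        # stochastic model output (e.g., a simulated forest summary statistic)
        return theta * 2.0 + rng.standard_normal(n_rep)

    def approx_loglike(theta, observed):
        sims = simulator(theta)
        return norm(loc=sims.mean(), scale=sims.std(ddof=1)).logpdf(observed)

    observed = 4.1
    theta, ll = 0.0, approx_loglike(0.0, observed)
    chain = []
    for _ in range(2000):                    # flat prior assumed for simplicity
        prop = theta + 0.3 * rng.standard_normal()
        ll_prop = approx_loglike(prop, observed)
        if np.log(rng.uniform()) < ll_prop - ll:
            theta, ll = prop, ll_prop
        chain.append(theta)
    ```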

  3. DATMAN: A reliability data analysis program using Bayesian updating

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Becker, M.; Feltus, M.A.

    1996-12-31

    Preventive maintenance (PM) techniques focus on the prevention of failures, in particular, system components that are important to plant functions. Reliability-centered maintenance (RCM) improves on the PM techniques by introducing a set of guidelines by which to evaluate the system functions. It also minimizes intrusive maintenance, labor, and equipment downtime without sacrificing system performance when its function is essential for plant safety. Both the PM and RCM approaches require that system reliability data be updated as more component failures and operation time are acquired. Systems reliability and the likelihood of component failures can be calculated by Bayesian statistical methods, which can update these data. The DATMAN computer code has been developed at Penn State to simplify the Bayesian analysis by performing tedious calculations needed for RCM reliability analysis. DATMAN reads data for updating, fits a distribution that best fits the data, and calculates component reliability. DATMAN provides a user-friendly interface menu that allows the user to choose from several common prior and posterior distributions, insert new failure data, and visually select the distribution that matches the data most accurately.

  4. Bayesian spatiotemporal model of fMRI data using transfer functions.

    PubMed

    Quirós, Alicia; Diez, Raquel Montes; Wilson, Simon P

    2010-09-01

    This research describes a new Bayesian spatiotemporal model to analyse BOLD fMRI studies. In the temporal dimension, we describe the shape of the hemodynamic response function (HRF) with a transfer function model. The spatial continuity and local homogeneity of the evoked responses are modelled by a Gaussian Markov random field prior on the parameter indicating activations. The proposal constitutes an extension of the spatiotemporal model presented in a previous approach [Quirós, A., Montes Diez, R. and Gamerman, D., 2010. Bayesian spatiotemporal model of fMRI data, Neuroimage, 49: 442-456], offering more flexibility in the estimation of the HRF and computational advantages in the resulting MCMC algorithm. Simulations from the model are performed in order to ascertain the performance of the sampling scheme and the ability of the posterior to estimate model parameters, as well as to check the model sensitivity to the signal-to-noise ratio. Results are shown on synthetic data and on a real data set from a block-design fMRI experiment. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  5. Estimating Bayesian Phylogenetic Information Content

    PubMed Central

    Lewis, Paul O.; Chen, Ming-Hui; Kuo, Lynn; Lewis, Louise A.; Fučíková, Karolina; Neupane, Suman; Wang, Yu-Bo; Shi, Daoyuan

    2016-01-01

    Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information content. If the data have no information relevant to ranking tree topologies beyond the information supplied by the prior, the posterior and prior will be identical. Information in data discourages consideration of some hypotheses allowed by the prior, resulting in a posterior distribution that is more concentrated (has lower entropy) than the prior. We focus on measuring information about tree topology using marginal posterior distributions of tree topologies. We show that both the accuracy and the computational efficiency of topological information content estimation improve with use of the conditional clade distribution, which also allows topological information content to be partitioned by clade. We explore two important applications of our method: providing a compelling definition of saturation and detecting conflict among data partitions that can negatively affect analyses of concatenated data. [Bayesian; concatenation; conditional clade distribution; entropy; information; phylogenetics; saturation.] PMID:27155008
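
    A minimal sketch of the entropy-based measure, assuming a uniform prior over a small topology space and a hypothetical MCMC sample of topologies; the paper's conditional clade distribution would replace the raw sample frequencies used here:

      import numpy as np
      from collections import Counter

      def entropy(p):
          p = np.asarray(p, dtype=float)
          p = p[p > 0]
          return -(p * np.log2(p)).sum()

      # 15 possible unrooted topologies for 5 taxa, uniform prior
      n_topologies = 15
      prior_entropy = np.log2(n_topologies)

      # Hypothetical MCMC sample of topology labels from the posterior
      posterior_sample = ["t1"] * 70 + ["t2"] * 25 + ["t3"] * 5
      counts = Counter(posterior_sample)
      freqs = np.array(list(counts.values())) / len(posterior_sample)

      # Information content = reduction in entropy from prior to posterior
      info = prior_entropy - entropy(freqs)
      print(f"information: {info:.2f} of {prior_entropy:.2f} bits")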

  6. Genome-wide regression and prediction with the BGLR statistical package.

    PubMed

    Pérez, Paulino; de los Campos, Gustavo

    2014-10-01

    Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.
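
    BGLR itself is an R package; the following is a Python sketch of the core computation the abstract describes, a Gibbs sampler with scalar (single-site) updates for a Bayesian ridge-type regression with conjugate variance updates, on simulated large-p-with-small-n data:

      import numpy as np

      rng = np.random.default_rng(7)

      # Simulated large-p-with-small-n data (p > n)
      n, p = 100, 500
      X = rng.normal(size=(n, p))
      beta_true = np.zeros(p)
      beta_true[:10] = 0.5
      y = X @ beta_true + rng.normal(size=n)

      b = np.zeros(p)
      sigma_e2, sigma_b2 = 1.0, 0.01
      xtx = (X ** 2).sum(axis=0)
      resid = y - X @ b
      post_mean = np.zeros(p)
      n_iter, burn = 1000, 200

      for it in range(n_iter):
          for j in range(p):                    # scalar updates
              resid += X[:, j] * b[j]           # remove effect j
              c_j = xtx[j] + sigma_e2 / sigma_b2
              b[j] = rng.normal(X[:, j] @ resid / c_j,
                                np.sqrt(sigma_e2 / c_j))
              resid -= X[:, j] * b[j]           # restore residual
          # conjugate inverse-gamma updates for variance components
          sigma_b2 = 1.0 / rng.gamma(1.0 + p / 2, 1.0 / (1.0 + b @ b / 2))
          sigma_e2 = 1.0 / rng.gamma(1.0 + n / 2,
                                     1.0 / (1.0 + resid @ resid / 2))
          if it >= burn:
              post_mean += b / (n_iter - burn)

      print("mean abs error:", np.abs(post_mean - beta_true).mean())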

  7. Efficient Posterior Probability Mapping Using Savage-Dickey Ratios

    PubMed Central

    Penny, William D.; Ridgway, Gerard R.

    2013-01-01

    Statistical Parametric Mapping (SPM) is the dominant paradigm for mass-univariate analysis of neuroimaging data. More recently, a Bayesian approach termed Posterior Probability Mapping (PPM) has been proposed as an alternative. PPM offers two advantages: (i) inferences can be made about effect size thus lending a precise physiological meaning to activated regions, (ii) regions can be declared inactive. This latter facility is most parsimoniously provided by PPMs based on Bayesian model comparisons. To date these comparisons have been implemented by an Independent Model Optimization (IMO) procedure which separately fits null and alternative models. This paper proposes a more computationally efficient procedure based on Savage-Dickey approximations to the Bayes factor, and Taylor-series approximations to the voxel-wise posterior covariance matrices. Simulations show the accuracy of this Savage-Dickey-Taylor (SDT) method to be comparable to that of IMO. Results on fMRI data show excellent agreement between SDT and IMO for second-level models, and reasonable agreement for first-level models. This Savage-Dickey test is a Bayesian analogue of the classical SPM-F and allows users to implement model comparison in a truly interactive manner. PMID:23533640
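
    A minimal illustration of the Savage-Dickey identity, assuming a conjugate normal model with known variance so that the posterior density at the null value is available in closed form (the paper's Taylor-series approximations are unnecessary in this toy case):

      import numpy as np
      from scipy import stats

      # Normal data with known variance; effect-size parameter theta.
      # Null model: theta = 0. Alternative: theta ~ N(0, tau2) prior.
      rng = np.random.default_rng(0)
      sigma2, tau2, n = 1.0, 1.0, 20
      y = rng.normal(0.15, np.sqrt(sigma2), size=n)   # weak true effect

      # Conjugate posterior for theta under the alternative
      post_var = 1.0 / (n / sigma2 + 1.0 / tau2)
      post_mean = post_var * (y.sum() / sigma2)

      # Savage-Dickey: BF_01 = posterior density at 0 / prior density at 0
      bf01 = (stats.norm.pdf(0.0, post_mean, np.sqrt(post_var))
              / stats.norm.pdf(0.0, 0.0, np.sqrt(tau2)))
      print("BF01 (evidence for the null):", bf01)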

  8. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors.

    PubMed

    Ojo, Oluwatobi Blessing; Lougue, Siaka; Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. Bayesian approaches are becoming popular in data analysis, but most applications of Bayesian inference are limited to non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. Identical regression models are fitted using classical and Bayesian approaches, the latter with both non-informative and informative priors, using South African General Household Survey (GHS) data for the year 2014. For the Bayesian model with informative prior, GHS data for the years 2011 to 2013 are used to construct priors for the 2014 model.
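
    A toy sketch of building an informative prior from earlier survey waves, assuming a simple beta-binomial prevalence model with illustrative counts (the paper fits full regression models; the power-prior-style weight kappa is an added assumption):

      from scipy import stats

      # Hypothetical counts: TB cases among respondents in earlier waves
      prior_cases, prior_n = 420, 90000     # pooled 2011-2013 (illustrative)
      kappa = 0.1                           # down-weight historical data

      # Informative Beta prior built from the past waves
      a0 = 1.0 + kappa * prior_cases
      b0 = 1.0 + kappa * (prior_n - prior_cases)

      # 2014 survey data (illustrative numbers)
      cases, n = 160, 30000
      posterior = stats.beta(a0 + cases, b0 + n - cases)

      print("posterior mean prevalence:", posterior.mean())
      print("95% credible interval:", posterior.interval(0.95))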

  9. A Bayesian Attractor Model for Perceptual Decision Making

    PubMed Central

    Bitzer, Sebastian; Bruineberg, Jelle; Kiebel, Stefan J.

    2015-01-01

    Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel account of how perceptual decisions are made, combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence, which we relate to recent experimental evidence for confidence computations in perceptual decision tasks. PMID:26267143

  10. Exact calculation of distributions on integers, with application to sequence alignment.

    PubMed

    Newberg, Lee A; Lawrence, Charles E

    2009-01-01

    Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
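
    A minimal example of computing an exact score distribution on integers by dynamic programming, here the distribution of a sum of independent per-position score contributions obtained by repeated convolution; the scoring scheme is hypothetical:

      import numpy as np

      # Per-position distributions over integer score contributions,
      # e.g., match = +1 with prob 0.25, mismatch = -1 with prob 0.75.
      # Index 0 of each array corresponds to the minimum score (-1).
      per_position = [np.array([0.75, 0.0, 0.25])] * 50   # 50 positions

      # Exact distribution of the total score by repeated convolution;
      # this is the dynamic-programming recursion in distribution space.
      dist = np.array([1.0])
      for step in per_position:
          dist = np.convolve(dist, step)

      offset = -1 * len(per_position)       # score value of dist[0]
      scores = np.arange(offset, offset + len(dist))

      # Exact tail probability, e.g., P(score >= 0)
      print("P(score >= 0) =", dist[scores >= 0].sum())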

  11. Predictive uncertainty analysis of plume distribution for geological carbon sequestration using sparse-grid Bayesian method

    NASA Astrophysics Data System (ADS)

    Shi, X.; Zhang, G.

    2013-12-01

    Because of the extensive computational burden, parametric uncertainty analyses are rarely conducted for process-based multi-phase models of geological carbon sequestration (GCS). Predictive uncertainty analysis for CO2 plume migration in realistic GCS models is difficult not only because of the spatial distribution of the caprock and reservoir (i.e., heterogeneous model parameters), but also because the GCS parameter estimation problem has multiple local minima, owing to the complex nonlinear multi-phase (gas and aqueous), multi-component (water, CO2, salt) transport equations. The geological model built by Doughty and Pruess (2004) for the Frio pilot site (Texas) was selected and assumed to represent the 'true' system, composed of seven different facies (geological units) distributed among 10 layers. We chose to calibrate the permeabilities of these facies. Pressure and gas saturation values from this true model were then extracted and used as observations for subsequent model calibration. Random noise was added to the observations to approximate realistic field conditions. Each simulation of the model lasts about 2 hours. In this study, we develop a new approach that improves the computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid stochastic collocation method. This surrogate response-surface global optimization algorithm is first used to calibrate the model parameters; the prediction uncertainty of the CO2 plume position arising from the propagation of parametric uncertainty is then quantified in numerical experiments and compared to the actual plume from the 'true' model. The results show that the approach is computationally efficient for multi-modal optimization and prediction uncertainty quantification with computationally expensive simulation models. Both our inverse methodology and our findings are broadly applicable to GCS in heterogeneous storage formations.

  12. Bayesian Inference: with ecological applications

    USGS Publications Warehouse

    Link, William A.; Barker, Richard J.

    2010-01-01

    This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference, with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management, and environmental studies, as well as students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.

  13. Incorporating approximation error in surrogate based Bayesian inversion

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Zeng, L.; Li, W.; Wu, L.

    2015-12-01

    There is increasing interest in applying surrogates in Bayesian inverse modeling to reduce repeated evaluations of the original model and thereby save computational cost. However, the approximation error of the surrogate model is usually overlooked, partly because it is difficult to evaluate for many surrogates. Previous studies have shown that directly combining surrogates with Bayesian methods (e.g., Markov chain Monte Carlo, MCMC) may lead to biased estimates when the surrogate cannot emulate a highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner, but the computational cost remains high because a relatively large number of original model simulations is still required. In this study, we illustrate the importance of incorporating approximation error in Bayesian inverse modeling. A Gaussian process (GP) is chosen to construct the surrogate because its approximation error is convenient to evaluate. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is properly incorporated into the Bayesian framework, promising results can be obtained even when the surrogate is used directly, with no further original model simulations required.
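
    A minimal sketch of the central idea, assuming a toy one-parameter model: the GP surrogate's predictive variance is added to the observation-noise variance in the likelihood, so the likelihood automatically widens where the surrogate is uncertain:

      import numpy as np

      def model(theta):                 # 'expensive' original model (toy)
          return np.sin(3.0 * theta)

      # Build a GP surrogate from a few design runs of the original model
      X = np.linspace(-2, 2, 8)
      Y = model(X)
      ell, tau2, jitter = 0.7, 1.0, 1e-8

      def k(a, b):                      # squared-exponential kernel
          return tau2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

      Kinv = np.linalg.inv(k(X, X) + jitter * np.eye(len(X)))

      def gp_predict(x):
          ks = k(np.atleast_1d(x), X)
          mean = ks @ Kinv @ Y
          var = tau2 - np.einsum('ij,jk,ik->i', ks, Kinv, ks)
          return mean[0], max(var[0], 0.0)

      # Surrogate error variance is ADDED to the observation-noise
      # variance, widening the likelihood where the surrogate is unsure.
      y_obs, sigma2_obs = 0.6, 0.05**2

      def loglik(theta):
          mu, var_gp = gp_predict(theta)
          var = sigma2_obs + var_gp      # incorporate approximation error
          return -0.5 * ((y_obs - mu)**2 / var + np.log(2 * np.pi * var))

      print(loglik(0.2), loglik(1.9))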

  14. A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.

    PubMed

    Wang, Tingting; Chen, Yi-Ping Phoebe; Bowman, Phil J; Goddard, Michael E; Hayes, Ben J

    2016-09-21

    Bayesian mixture models in which the effects of SNPs are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Markov chain Monte Carlo (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm followed by a limited number of MCMC iterations, without the requirement for burn-in. To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breed that was not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC, while computational time was reduced to as little as 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300K SNPs in a case/control design, from the Wellcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic prediction and genetic architecture inference, while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform as well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, but up to 17 times faster. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets.

  15. Daniel Goodman’s empirical approach to Bayesian statistics

    USGS Publications Warehouse

    Gerrodette, Tim; Ward, Eric; Taylor, Rebecca L.; Schwarz, Lisa K.; Eguchi, Tomoharu; Wade, Paul; Himes Boor, Gina

    2016-01-01

    Bayesian statistics, in contrast to classical statistics, uses probability to represent uncertainty about the state of knowledge. Bayesian statistics has often been associated with the idea that knowledge is subjective and that a probability distribution represents a personal degree of belief. Dr. Daniel Goodman considered this viewpoint problematic for issues of public policy. He sought to ground his Bayesian approach in data, and advocated the construction of a prior as an empirical histogram of “similar” cases. In this way, the posterior distribution that results from a Bayesian analysis combines comparable previous data with case-specific current data, using Bayes’ formula. Goodman championed such a data-based approach, but he acknowledged that it was difficult in practice. Goodman argued that, if based on a true representation of our knowledge and uncertainty, risk assessment and decision-making could be an exact science, despite the uncertainties. In his view, Bayesian statistics is a critical component of this science because a Bayesian analysis produces the probabilities of future outcomes. Indeed, Goodman maintained that the Bayesian machinery, following the rules of conditional probability, offered the best legitimate inference from available data. We give an example of an informative prior in a recent study of Steller sea lion spatial use patterns in Alaska.

  16. Numerical study on the sequential Bayesian approach for radioactive materials detection

    NASA Astrophysics Data System (ADS)

    Qingpei, Xiang; Dongfeng, Tian; Jianyu, Zhu; Fanhua, Hao; Ge, Ding; Jun, Zeng

    2013-01-01

    A new detection method, based on the sequential Bayesian approach proposed by Candy et al., offers new horizons for radioactive detection research. Compared with commonly adopted detection methods based on classical statistical theory, the sequential Bayesian approach offers shorter verification times when analyzing spectra that contain low total counts, especially for complex radionuclide compositions. In this paper, a simulation platform implementing the sequential Bayesian methodology was developed. Event sequences of γ-rays associated with the true parameters of a LaBr3(Ce) detector were obtained from an event-sequence generator based on Monte Carlo sampling to study the performance of the sequential Bayesian approach. The numerical experimental results are in accordance with those of Candy. Moreover, the relationship between the detection model and the event generator, represented respectively by the expected detection rate (Am) and the tested detection rate (Gm) parameters, is investigated. To achieve optimal performance for this processor, the interval of the tested detection rate as a function of the expected detection rate is also presented.
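
    A simplified two-hypothesis sketch of sequential Bayesian detection for Poisson counts, with illustrative background and source rates; the Candy-style processor described in the paper handles full event sequences and more detailed detector models:

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)

      # Counts per time bin; H1: background + source, H0: background only
      rate_bg, rate_src = 4.0, 1.5      # illustrative detector rates
      counts = rng.poisson(rate_bg + rate_src, size=40)  # source present

      log_bf = 0.0
      for t, c in enumerate(counts, start=1):
          # Sequential update: accumulate this bin's log-likelihood ratio
          log_bf += (stats.poisson.logpmf(c, rate_bg + rate_src)
                     - stats.poisson.logpmf(c, rate_bg))
          if abs(log_bf) > np.log(100.0):   # decide at 100:1 odds
              verdict = "source present" if log_bf > 0 else "background only"
              print(f"decision after {t} bins: {verdict} "
                    f"(log BF = {log_bf:.2f})")
              break
      else:
          print(f"no decision after {len(counts)} bins "
                f"(log BF = {log_bf:.2f})")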

  17. Automating approximate Bayesian computation by local linear regression.

    PubMed

    Thornton, Kevin R

    2009-07-07

    In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation" (ABC) methods based on summary statistics have become increasingly popular. A particular flavor of ABC that uses linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. Its advantages are: 1. The code is standalone and fully documented. 2. The program automatically processes multiple data sets and creates unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data and the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and is therefore a general tool for processing the results of any simulation. 6. The code is open-source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and of testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local linear regression.
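
    A minimal sketch of the local linear-regression ABC procedure that ABCreg automates (a Beaumont-style adjustment, unweighted here for brevity; ABCreg additionally offers transformations), on a toy normal model:

      import numpy as np

      rng = np.random.default_rng(11)

      def simulate(theta):                  # toy model: Normal(theta, 1)
          x = rng.normal(theta, 1.0, size=50)
          return np.array([x.mean(), x.std()])   # summary statistics

      s_obs = np.array([1.0, 1.0])          # observed summaries

      # 1. Rejection step: simulate from the prior, keep closest runs
      thetas = rng.uniform(-5, 5, size=20000)    # flat prior
      S = np.array([simulate(t) for t in thetas])
      d = np.linalg.norm(S - s_obs, axis=1)
      keep = d <= np.quantile(d, 0.01)      # tolerance = 1% quantile

      # 2. Local linear regression of theta on the summaries
      A = np.column_stack([np.ones(keep.sum()), S[keep] - s_obs])
      coef, *_ = np.linalg.lstsq(A, thetas[keep], rcond=None)

      # 3. Adjust accepted draws toward s_obs along the fitted regression
      theta_adj = thetas[keep] - (S[keep] - s_obs) @ coef[1:]
      print("posterior mean (adjusted):", theta_adj.mean())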

  18. Sparse-grid, reduced-basis Bayesian inversion: Nonaffine-parametric nonlinear equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Peng, E-mail: peng@ices.utexas.edu; Schwab, Christoph, E-mail: christoph.schwab@sam.math.ethz.ch

    2016-07-01

    We extend the reduced basis (RB) accelerated Bayesian inversion methods for affine-parametric, linear operator equations which are considered in [16,17] to non-affine, nonlinear parametric operator equations. We generalize the analysis of sparsity of parametric forward solution maps in [20] and of Bayesian inversion in [48,49] to the fully discrete setting, including Petrov–Galerkin high-fidelity (“HiFi”) discretization of the forward maps. We develop adaptive, stochastic collocation based reduction methods for the efficient computation of reduced bases on the parametric solution manifold. The nonaffinity and nonlinearity with respect to (w.r.t.) the distributed, uncertain parameters and the unknown solution is collocated; specifically, by the so-called Empirical Interpolation Method (EIM). For the corresponding Bayesian inversion problems, computational efficiency is enhanced in two ways: first, expectations w.r.t. the posterior are computed by adaptive quadratures with dimension-independent convergence rates proposed in [49]; the present work generalizes [49] to account for the impact of the PG discretization in the forward maps on the convergence rates of the Quantities of Interest (QoI for short). Second, we propose to perform the Bayesian estimation only w.r.t. a parsimonious, RB approximation of the posterior density. Based on the approximation results in [49], the infinite-dimensional parametric, deterministic forward map and operator admit N-term RB and EIM approximations which converge at rates which depend only on the sparsity of the parametric forward map. In several numerical experiments, the proposed algorithms exhibit dimension-independent convergence rates which equal, at least, the currently known rate estimates for N-term approximation. We propose to accelerate Bayesian estimation by first offline construction of reduced basis surrogates of the Bayesian posterior density. The parsimonious surrogates can then be employed for online data assimilation and for Bayesian estimation. They also open a perspective for optimal experimental design.

  19. Prior approval: the growth of Bayesian methods in psychology.

    PubMed

    Andrews, Mark; Baguley, Thom

    2013-02-01

    Within the last few years, Bayesian methods of data analysis in psychology have proliferated. In this paper, we briefly review the history of the Bayesian approach to statistics, and consider the implications that Bayesian methods have for the theory and practice of data analysis in psychology.

  20. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    DTIC Science & Technology

    2017-09-01

    This dissertation explores the efficacy of statistical post-processing methods downstream of dynamical model components, using a hierarchical multivariate Bayesian approach to ensemble model output statistics. Keywords: Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction.

  1. A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.

    ERIC Educational Resources Information Center

    Glas, Cees A. W.; Meijer, Rob R.

    A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…

  2. A Simulation Study Comparison of Bayesian Estimation with Conventional Methods for Estimating Unknown Change Points

    ERIC Educational Resources Information Center

    Wang, Lijuan; McArdle, John J.

    2008-01-01

    The main purpose of this research is to evaluate the performance of a Bayesian approach for estimating unknown change points using Monte Carlo simulations. The univariate and bivariate unknown change point mixed models were presented and the basic idea of the Bayesian approach for estimating the models was discussed. The performance of Bayesian…

  3. Bayesian networks improve causal environmental assessments for evidence-based policy

    EPA Pesticide Factsheets

    Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on valued ecological resources.

  4. Model Selection in Historical Research Using Approximate Bayesian Computation

    PubMed Central

    Rubio-Campillo, Xavier

    2016-01-01

    Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study: This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
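
    A minimal sketch of ABC model choice via acceptance counts, assuming two hypothetical competing models and equal model priors; under these assumptions the ratio of acceptances approximates the Bayes factor:

      import numpy as np

      rng = np.random.default_rng(2)

      # Two competing 'combat-law' style models of an outcome statistic
      def model_linear(a):   return a * rng.normal(1.0, 0.3)
      def model_squared(a):  return a**2 * rng.normal(1.0, 0.3)

      obs, eps, n = 4.0, 0.4, 100000
      accept = {"linear": 0, "squared": 0}

      # ABC model choice: draw a model and parameter from their priors,
      # accept when the simulated statistic falls within the tolerance.
      for _ in range(n):
          m = rng.choice(["linear", "squared"])   # equal model priors
          a = rng.uniform(0.0, 3.0)               # parameter prior
          sim = model_linear(a) if m == "linear" else model_squared(a)
          if abs(sim - obs) < eps:
              accept[m] += 1

      bf = accept["squared"] / max(accept["linear"], 1)
      print(accept, "approximate Bayes factor (squared vs linear):", bf)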

  5. The decisive future of inflation

    NASA Astrophysics Data System (ADS)

    Hardwick, Robert J.; Vennin, Vincent; Wands, David

    2018-05-01

    How much more will we learn about single-field inflationary models in the future? We address this question in the context of Bayesian design and information theory. We develop a novel method to compute the expected utility of deciding between models and apply it to a set of futuristic measurements. This necessarily requires one to evaluate the Bayesian evidence many thousands of times over, which is numerically challenging. We show how this can be done using a number of simplifying assumptions and discuss their validity. We also modify the form of the expected utility, as previously introduced in the literature in different contexts, in order to partition each possible future into either the rejection of models at the level of the maximum likelihood or the decision between models using Bayesian model comparison. We then quantify the ability of future experiments to constrain the reheating temperature and the scalar running. Our approach allows us to discuss possible strategies for maximising information from future cosmological surveys. In particular, our conclusions suggest that, in the context of inflationary model selection, a decrease in the measurement uncertainty of the scalar spectral index would be more decisive than a decrease in the uncertainty in the tensor-to-scalar ratio. We have incorporated our approach into a publicly available python class, foxi, that can be readily applied to any survey optimisation problem.

  6. Bayesian inversion of seismic and electromagnetic data for marine gas reservoir characterization using multi-chain Markov chain Monte Carlo sampling

    DOE PAGES

    Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; ...

    2017-10-17

    In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated; reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.

  8. Efficient Bayesian experimental design for contaminant source identification

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Zeng, L.

    2013-12-01

    In this study, an efficient full Bayesian approach is developed for optimal sampling-well location design and source parameter identification of groundwater contaminants. An information measure, the relative entropy, is employed to quantify the information gain from indirect concentration measurements in identifying unknown source parameters such as the release time, strength and location. In this approach, the sampling location that gives the maximum relative entropy is selected as the optimal one. Once the sampling location is determined, a Bayesian approach based on Markov chain Monte Carlo (MCMC) is used to estimate the unknown source parameters. In both the design and the estimation, the contaminant transport equation must be solved many times to evaluate the likelihood. To reduce the computational burden, an interpolation method based on an adaptive sparse grid is used to construct a surrogate for the contaminant transport. The approximate likelihood can be evaluated directly from the surrogate, which greatly accelerates the design and estimation process. The accuracy and efficiency of our approach are demonstrated through numerical case studies. Compared with traditional optimal design, which is based on a Gaussian linear assumption, the method developed in this study can cope with arbitrary nonlinearity. It can be used to assist in groundwater monitoring network design and in the identification of unknown contaminant sources. (Figure captions: contours of the expected information gain, with the optimal observation location at the maximum; posterior marginal probability densities of the unknown parameters at the designed location and at seven randomly chosen locations for comparison, with true values marked by vertical lines, showing that the unknown parameters are estimated better with the designed location.)
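
    A toy version of the design criterion, assuming a linearized, Gaussian source-response model for which the expected information gain (relative entropy) has a closed form; the sensitivity function below is hypothetical and stands in for the contaminant-transport surrogate:

      import numpy as np

      # Choose one observation location x for a linearized response
      # y = g(x) * theta + noise, with a Gaussian prior on theta.
      prior_var, noise_var = 4.0, 0.5**2

      def sensitivity(x):
          # Hypothetical sensitivity of the concentration to the source
          # strength theta at location x (stands in for the transport model)
          return np.exp(-0.5 * (x - 2.0) ** 2)

      candidates = np.linspace(0.0, 5.0, 51)

      # For a linear-Gaussian model the expected information gain is the
      # mutual information I(theta; y) = 0.5*ln(1 + g^2 * prior/noise).
      eig = 0.5 * np.log(1 + sensitivity(candidates)**2
                         * prior_var / noise_var)

      best = candidates[np.argmax(eig)]
      print("optimal sampling location:", best, "EIG:", eig.max(), "nats")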

  9. Atmospheric Tracer Inverse Modeling Using Markov Chain Monte Carlo (MCMC)

    NASA Astrophysics Data System (ADS)

    Kasibhatla, P.

    2004-12-01

    In recent years, there has been an increasing emphasis on the use of Bayesian statistical estimation techniques to characterize the temporal and spatial variability of atmospheric trace gas sources and sinks. The applications have been varied in terms of the particular species of interest, as well as in terms of the spatial and temporal resolution of the estimated fluxes. However, one common characteristic has been the use of relatively simple statistical models for describing the measurement and chemical transport model error statistics and prior source statistics. For example, multivariate normal probability distribution functions (pdfs) are commonly used to model these quantities, and inverse source estimates are derived for fixed values of the pdf parameters. While the advantage of this approach is that closed-form analytical solutions for the a posteriori pdfs of interest are available, it is worth exploring Bayesian analysis approaches which allow for a more general treatment of error and prior source statistics. Here, we present an application of the Markov chain Monte Carlo (MCMC) methodology to an atmospheric tracer inversion problem to demonstrate how more general statistical models for errors can be incorporated into the analysis in a relatively straightforward manner. The MCMC approach to Bayesian analysis, which has found wide application in a variety of fields, is a statistical simulation approach that involves computing moments of interest of the a posteriori pdf by efficiently sampling this pdf. The specific inverse problem that we focus on is the annual mean CO2 source/sink estimation problem considered by the TransCom3 project. TransCom3 was a collaborative effort involving various modeling groups that followed a common modeling and analysis protocol. As such, this problem provides a convenient case study to demonstrate the applicability of the MCMC methodology to atmospheric tracer source/sink estimation problems.
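
    A minimal sketch of the flexibility MCMC buys: a Metropolis sampler for a linear source-receptor model with a heavy-tailed (Student-t) error model that has no closed-form posterior; the Jacobian H and all values below are hypothetical:

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(4)

      # Linear source-receptor relationship: obs = H @ s + error, where
      # H is a (hypothetical) transport Jacobian and s are regional fluxes.
      H = rng.uniform(0.0, 1.0, size=(30, 3))
      s_true = np.array([2.0, -1.0, 0.5])
      obs = H @ s_true + stats.t.rvs(df=3, scale=0.2, size=30,
                                     random_state=1)

      def log_post(s):
          # Heavy-tailed (Student-t) error model instead of the usual
          # Gaussian: straightforward in MCMC, awkward analytically.
          r = obs - H @ s
          return (stats.t.logpdf(r, df=3, scale=0.2).sum()
                  + stats.norm.logpdf(s, 0.0, 5.0).sum())  # weak prior

      s, lp = np.zeros(3), log_post(np.zeros(3))
      chain = []
      for _ in range(20000):
          prop = s + rng.normal(0.0, 0.05, size=3)
          lp_prop = log_post(prop)
          if np.log(rng.uniform()) < lp_prop - lp:
              s, lp = prop, lp_prop
          chain.append(s.copy())

      print("posterior means:", np.mean(chain[5000:], axis=0))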

  10. A Bayesian Nonparametric Approach to Test Equating

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2009-01-01

    A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…

  11. Bayesian Networks Improve Causal Environmental Assessments for Evidence-Based Policy.

    PubMed

    Carriger, John F; Barron, Mace G; Newman, Michael C

    2016-12-20

    Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on valued ecological resources. These aspects are demonstrated through hypothetical problem scenarios that explore some major benefits of using Bayesian networks for reasoning and making inferences in evidence-based policy.

  12. Low-rank separated representation surrogates of high-dimensional stochastic functions: Application in Bayesian inference

    NASA Astrophysics Data System (ADS)

    Validi, AbdoulAhad

    2014-03-01

    This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternating least-squares regression with Tikhonov regularization, using a roughening matrix that computes the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation depends linearly on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated-representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation, by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples, including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.

  13. Bayesian pedigree inference with small numbers of single nucleotide polymorphisms via a factor-graph representation.

    PubMed

    Anderson, Eric C; Ng, Thomas C

    2016-02-01

    We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.

  14. The discounting model selector: Statistical software for delay discounting applications.

    PubMed

    Gilroy, Shawn P; Franck, Christopher T; Hantula, Donald A

    2017-05-01

    Original, open-source computer software was developed and validated against established delay discounting methods in the literature. The software executes approximate Bayesian model selection methods on user-supplied temporal discounting data and computes the effective delay 50 (ED50) from the best-performing model. The software was custom-designed to enable behavior analysts to conveniently apply recent statistical methods to temporal discounting data with the aid of a graphical user interface (GUI). Independent validation of the approximate Bayesian model selection methods indicated that the program provided results identical to those of the original source paper and its methods. Monte Carlo simulation (n = 50,000) confirmed that the true model was selected most often in each setting. Simulation code and data for this study were posted to an online repository for use by other researchers. The model selection approach was applied to three existing delay discounting data sets from the literature, in addition to the data from the source paper. Comparisons of model-selected ED50 were consistent with traditional indices of discounting. Conceptual issues related to the development and use of computer software by behavior analysts and the opportunities afforded by free and open-source software are discussed, and a review of possible expansions of this software is provided. © 2017 Society for the Experimental Analysis of Behavior.

  15. A novel Bayesian framework for discriminative feature extraction in Brain-Computer Interfaces.

    PubMed

    Suk, Heung-Il; Lee, Seong-Whan

    2013-02-01

    As there has been a paradigm shift in the learning load from a human subject to a computer, machine learning has been considered as a useful tool for Brain-Computer Interfaces (BCIs). In this paper, we propose a novel Bayesian framework for discriminative feature extraction for motor imagery classification in an EEG-based BCI in which the class-discriminative frequency bands and the corresponding spatial filters are optimized by means of the probabilistic and information-theoretic approaches. In our framework, the problem of simultaneous spatiospectral filter optimization is formulated as the estimation of an unknown posterior probability density function (pdf) that represents the probability that a single-trial EEG of predefined mental tasks can be discriminated in a state. In order to estimate the posterior pdf, we propose a particle-based approximation method by extending a factored-sampling technique with a diffusion process. An information-theoretic observation model is also devised to measure discriminative power of features between classes. From the viewpoint of classifier design, the proposed method naturally allows us to construct a spectrally weighted label decision rule by linearly combining the outputs from multiple classifiers. We demonstrate the feasibility and effectiveness of the proposed method by analyzing the results and its success on three public databases.

  16. Bayesian Nonlinear Assimilation of Eulerian and Lagrangian Coastal Flow Data

    DTIC Science & Technology

    2015-09-30

    Dr. Pierre F.J. Lermusiaux, Department of Mechanical Engineering, Center for Ocean Science and Engineering, Massachusetts Institute of Technology. The long-term goals are to develop and apply theory, schemes and computational systems for rigorous Bayesian nonlinear assimilation of Eulerian and Lagrangian coastal flow data, and to further develop and implement the GMM-DO schemes for robust Bayesian nonlinear estimation of coastal ocean fields, in both Eulerian and Lagrangian forms.

  17. Bayesian estimation of the discrete coefficient of determination.

    PubMed

    Chen, Ting; Braga-Neto, Ulisses M

    2016-12-01

    The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD. We derive analytically the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, exact expressions for its bias, variance, and root-mean-square (RMS) are given. The accuracy of both Bayesian CoD estimators with non-informative and informative priors, under fixed or random parameters, is studied via analytical and numerical approaches. We also demonstrate the application of the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma.

  18. A Bayesian approach to modelling the impact of hydrodynamic shear stress on biofilm deformation

    PubMed Central

    Wilkinson, Darren J.; Jayathilake, Pahala Gedara; Rushton, Steve P.; Bridgens, Ben; Li, Bowen; Zuliani, Paolo

    2018-01-01

    We investigate the feasibility of using a surrogate-based method to emulate the deformation and detachment behaviour of a biofilm in response to hydrodynamic shear stress. The influence of shear force, growth rate and viscoelastic parameters on the patterns of growth, structure and resulting shape of microbial biofilms was examined. We develop a statistical modelling approach to this problem, using a combination of Bayesian Poisson regression and dynamic linear models for the emulation. We observe that the hydrodynamic shear force affects biofilm deformation in line with some of the literature. Sensitivity results also showed that the expected number of shear events, shear flow, yield coefficient for heterotrophic bacteria and extracellular polymeric substance (EPS) stiffness per unit EPS mass are the four principal mechanisms governing bacterial detachment in this study. The sensitivity of the model parameters is temporally dynamic, emphasising the significance of conducting the sensitivity analysis across multiple time points. The surrogate models are shown to perform well, and produced a ≈480-fold increase in computational efficiency. We conclude that a surrogate-based approach is effective, and that the resulting biofilm structure is determined primarily by a balance between bacterial growth, viscoelastic parameters and applied shear stress. PMID:29649240

  19. Towards Real-time, On-board, Hardware-Supported Sensor and Software Health Management for Unmanned Aerial Systems

    NASA Technical Reports Server (NTRS)

    Schumann, Johann; Rozier, Kristin Y.; Reinbacher, Thomas; Mengshoel, Ole J.; Mbaya, Timmy; Ippolito, Corey

    2013-01-01

    Unmanned aerial systems (UASs) can only be deployed if they can effectively complete their missions and respond to failures and uncertain environmental conditions while maintaining safety with respect to other aircraft as well as humans and property on the ground. In this paper, we design a real-time, on-board system health management (SHM) capability to continuously monitor sensors, software, and hardware components for detection and diagnosis of failures and violations of safety or performance rules during the flight of a UAS. Our approach to SHM is three-pronged, providing: (1) real-time monitoring of sensor and/or software signals; (2) signal analysis, preprocessing, and advanced on-the-fly temporal and Bayesian probabilistic fault diagnosis; (3) an unobtrusive, lightweight, read-only, low-power realization using Field Programmable Gate Arrays (FPGAs) that avoids overburdening limited computing resources or costly re-certification of flight software due to instrumentation. Our implementation provides a novel approach of combining modular building blocks, integrating responsive runtime monitoring of temporal logic system safety requirements with model-based diagnosis and Bayesian network-based probabilistic analysis. We demonstrate this approach using actual data from the NASA Swift UAS, an experimental all-electric aircraft.

  20. GWASinlps: Nonlocal prior based iterative SNP selection tool for genome-wide association studies.

    PubMed

    Sanyal, Nilotpal; Lo, Min-Tzu; Kauppi, Karolina; Djurovic, Srdjan; Andreassen, Ole A; Johnson, Valen E; Chen, Chi-Hua

    2018-06-19

    Multiple marker analysis of genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultra-high dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to a large number of false positives, whereas Bayesian methods are computationally very expensive. Motivated to ameliorate these issues simultaneously, we consider the novel approach of using nonlocal priors in an iterative variable selection framework. We develop a variable selection method, named iterative nonlocal prior based selection for GWAS, or GWASinlps, that combines, in an iterative variable selection framework, the computational efficiency of the screen-and-select approach based on some association learning and the parsimonious uncertainty quantification provided by the use of nonlocal priors. The hallmark of our method is the introduction of a 'structured screen-and-select' strategy that considers hierarchical screening, based not only on response-predictor associations but also on response-response associations, and concatenates variable selection within that hierarchy. Extensive simulation studies with SNPs having realistic linkage disequilibrium structures demonstrate the advantages of our computationally efficient method compared to several frequentist and Bayesian variable selection methods, in terms of true positive rate, false discovery rate, mean squared error, and effect size estimation error. Further, we provide an empirical power analysis useful for study design. Finally, a real GWAS data application was considered with human height as the phenotype. An R package implementing the GWASinlps method is available at https://cran.r-project.org/web/packages/GWASinlps/index.html. Supplementary data are available at Bioinformatics online.

  1. Bayesian Modeling of Perceived Surface Slant from Actively-Generated and Passively-Observed Optic Flow

    PubMed Central

    Caudek, Corrado; Fantoni, Carlo; Domini, Fulvio

    2011-01-01

    We measured perceived depth from the optic flow (a) when showing a stationary physical or virtual object to observers who moved their head at a normal or slower speed, and (b) when simulating the same optic flow on a computer and presenting it to stationary observers. Our results show that perceived surface slant is systematically distorted, for both the active and the passive viewing of physical or virtual surfaces. These distortions are modulated by head translation speed, with perceived slant increasing directly with the local velocity gradient of the optic flow. This empirical result allows us to determine the relative merits of two alternative approaches aimed at explaining perceived surface slant in active vision: an “inverse optics” model that takes head motion information into account, and a probabilistic model that ignores extra-retinal signals. We compare these two approaches within the framework of the Bayesian theory. The “inverse optics” Bayesian model produces veridical slant estimates if the optic flow and the head translation velocity are measured with no error; because of the influence of a “prior” for flatness, the slant estimates become systematically biased as the measurement errors increase. The Bayesian model, which ignores the observer's motion, always produces distorted estimates of surface slant. Interestingly, the predictions of this second model, not those of the first one, are consistent with our empirical findings. The present results suggest that (a) in active vision perceived surface slant may be the product of probabilistic processes which do not guarantee the correct solution, and (b) extra-retinal signals may be mainly used for a better measurement of retinal information. PMID:21533197

  2. A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks.

    PubMed

    Zhou, Xiaobo; Wang, Xiaodong; Pal, Ranadip; Ivanov, Ivan; Bittner, Michael; Dougherty, Edward R

    2004-11-22

    We have hypothesized that the construction of transcriptional regulatory networks using a method that optimizes connectivity would lead to regulation consistent with biological expectations. A key expectation is that the hypothetical networks should produce a few very strong attractors, highly similar to the original observations, mimicking biological state stability and determinism. Another central expectation is that, since biological control is expected to be distributed and mutually reinforcing, interpretation of the observations should lead to a very small number of connection schemes. We propose a fully Bayesian approach to constructing probabilistic gene regulatory networks (PGRNs) that emphasizes network topology. The method computes the possible parent sets of each gene, the corresponding predictors and the associated probabilities based on a nonlinear perceptron model, using a reversible jump Markov chain Monte Carlo (MCMC) technique, and an MCMC method is employed to search the network configurations and find those with the highest Bayesian scores for constructing the PGRN. The Bayesian method has been used to construct a PGRN based on the observed behavior of a set of genes whose expression patterns vary across a set of melanoma samples exhibiting two very different phenotypes with respect to cell motility and invasiveness. Key biological features have been faithfully reflected in the model. Its steady-state distribution contains attractors that are either identical or very similar to the states observed in the data, and many of the attractors are singletons, which mimics the biological propensity to stably occupy a given state. Most interestingly, the connectivity rules for the best-scoring networks constituting the PGRN are remarkably similar, as would be expected for a network operating on a distributed basis, with strong interactions between the components.

  3. Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty

    PubMed Central

    Baele, Guy; Lemey, Philippe; Suchard, Marc A.

    2016-01-01

    Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428

  4. Bayesian inferences suggest that Amazon Yunga Natives diverged from Andeans less than 5000 ybp: implications for South American prehistory.

    PubMed

    Scliar, Marilia O; Gouveia, Mateus H; Benazzo, Andrea; Ghirotto, Silvia; Fagundes, Nelson J R; Leal, Thiago P; Magalhães, Wagner C S; Pereira, Latife; Rodrigues, Maira R; Soares-Souza, Giordano B; Cabrera, Lilia; Berg, Douglas E; Gilman, Robert H; Bertorelle, Giorgio; Tarazona-Santos, Eduardo

    2014-09-30

    Archaeology reports millennia of cultural contacts between the Peruvian Coast-Andes and the Amazon Yunga, a rainforest transitional region between the Andes and Lower Amazonia. To clarify the relationships between the cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach, and several statistical validations to infer a set of demographic parameters. We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from the Central Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of the Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computation, posterior predictive tests, and the analysis of pseudo-observed datasets. We present a case study in which model-based Bayesian approaches, combined with the necessary statistical validations, shed light on the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm whether the predominant Andean biological origin of the Shimaas is the rule rather than the exception.

  5. A Parallel and Incremental Approach for Data-Intensive Learning of Bayesian Networks.

    PubMed

    Yue, Kun; Fang, Qiyu; Wang, Xiaoling; Li, Jin; Liu, Weiyi

    2015-12-01

    The Bayesian network (BN) has been adopted as the underlying model for representing and inferring uncertain knowledge. As the basis of realistic applications centered on probabilistic inference, learning a BN from data is a critical subject of machine learning, artificial intelligence, and big data paradigms. It is now necessary to extend the classical methods for learning BNs to data-intensive computing or cloud environments. In this paper, we propose a parallel and incremental approach for data-intensive learning of BNs from massive, distributed, and dynamically changing data by extending the classical scoring-and-search algorithm and using MapReduce. First, we adopt the minimum description length as the scoring metric and give two-pass MapReduce-based algorithms for computing the required marginal probabilities and scoring candidate graphical models from sample data. Then, we give the corresponding strategy for extending the classical hill-climbing algorithm to obtain the optimal structure, as well as one for storing a BN as pairs. Further, in view of the dynamic characteristics of changing data, we introduce the concept of influence degree to measure how well the current BN fits new data, and then propose corresponding two-pass MapReduce-based algorithms for incremental learning of BNs. Experimental results show the efficiency, scalability, and effectiveness of our methods.
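
    To make the two-pass pattern concrete, here is a minimal, self-contained Python sketch of the map/reduce idea for scoring one candidate family: map tasks emit counts of child/parent configurations per data shard, a reduce step pools them, and an MDL-style score is computed from the pooled counts. The function names and the crude complexity penalty are illustrative assumptions, not the paper's implementation.

    ```python
    # Map/reduce-style sufficient statistics for an MDL score (illustration only).
    from collections import Counter
    from math import log

    def map_counts(shard, child, parents):
        # shard: list of dicts mapping variable name -> discrete value
        out = Counter()
        for row in shard:
            key = (row[child], tuple(row[v] for v in parents))
            out[key] += 1
        return out

    def reduce_counts(partials):
        total = Counter()
        for c in partials:
            total.update(c)
        return total

    def mdl_score(counts, n):
        # log-likelihood term from pooled counts, minus a crude complexity penalty
        parent_totals = Counter()
        for (_, pa), c in counts.items():
            parent_totals[pa] += c
        ll = sum(c * log(c / parent_totals[pa]) for (_, pa), c in counts.items())
        return ll - 0.5 * log(n) * len(counts)

    shards = [[{"A": 0, "B": 1}, {"A": 1, "B": 1}], [{"A": 0, "B": 0}]]
    counts = reduce_counts(map_counts(s, "B", ["A"]) for s in shards)
    print(mdl_score(counts, n=3))
    ```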

  6. Variations on Bayesian Prediction and Inference

    DTIC Science & Technology

    2016-05-09

    There are a number of statistical inference problems that are not generally formulated via a full probability model. For the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle.

  7. Calculation of Crystallographic Texture of BCC Steels During Cold Rolling

    NASA Astrophysics Data System (ADS)

    Das, Arpan

    2017-05-01

    BCC alloys commonly tend to develop strong fibre textures, often represented as isointensity diagrams in φ 1 sections or by fibre diagrams. The alpha fibre in bcc steels is generally characterised by a <110> crystallographic axis parallel to the rolling direction. The objective of the present research is to correlate carbon content, carbide dispersion, rolling reduction, Euler angles (ϕ) (when φ 1 = 0° and φ 2 = 45° along the alpha fibre) and the resulting alpha fibre texture orientation intensity. Bayesian neural computation has been employed to correlate these variables and to compare the results comprehensively with an existing feed-forward neural network model. Other researchers have already reported an excellent match to the measured texture data within the bounding box of the texture training data set using a feed-forward neural network model; their predictions outside the bounds of the training texture data, however, showed deviations from the expected values. Here, Bayesian computation has been applied in the same setting to confirm that the predictions are reasonable in the context of basic metallurgical principles; it matched the data outside the bounds of the training texture set better than the reported feed-forward neural network. Bayesian computation puts error bars on predicted values and allows the significance of each individual parameter to be estimated. Additionally, Bayesian computation makes it possible to estimate the isolated influence of a particular variable, such as carbon concentration, which cannot in practice be varied independently. This shows the ability of the Bayesian neural network to examine new phenomena in situations where the data cannot be accessed through experiments.

  8. Uncertainty quantification for nuclear density functional theory and information content of new measurements

    DOE PAGES

    McDonnell, J. D.; Schunck, N.; Higdon, D.; ...

    2015-03-24

    Statistical tools of uncertainty quantification can be used to assess the information content of measured observables with respect to present-day theoretical models, to estimate model errors and thereby improve predictive capability, to extrapolate beyond the regions reached by experiment, and to provide meaningful input to applications and planned measurements. To showcase new opportunities offered by such tools, we make a rigorous analysis of theoretical statistical uncertainties in nuclear density functional theory using Bayesian inference methods. By considering the recent mass measurements from the Canadian Penning Trap at Argonne National Laboratory, we demonstrate how the Bayesian analysis and a direct least-squares optimization, combined with high-performance computing, can be used to assess the information content of the new data with respect to a model based on the Skyrme energy density functional approach. Employing the posterior probability distribution computed with a Gaussian process emulator, we apply the Bayesian framework to propagate theoretical statistical uncertainties in predictions of nuclear masses, two-neutron dripline, and fission barriers. Overall, we find that the new mass measurements do not impose a constraint that is strong enough to lead to significant changes in the model parameters. In addition, the example discussed in this study sets the stage for quantifying and maximizing the impact of new measurements with respect to current modeling and guiding future experimental efforts, thus enhancing the experiment-theory cycle in the scientific method.

  9. Bayesian bivariate meta-analysis of diagnostic test studies with interpretable priors.

    PubMed

    Guo, Jingyi; Riebler, Andrea; Rue, Håvard

    2017-08-30

    In a bivariate meta-analysis, the number of diagnostic studies involved is often very low, so that frequentist methods may run into problems. Using Bayesian inference is particularly attractive, as informative priors that add a small amount of information can stabilise the analysis without overwhelming the data. However, Bayesian analysis is often computationally demanding, and the selection of the prior for the covariance matrix of the bivariate structure is crucial with little data. The integrated nested Laplace approximations method provides an efficient solution to the computational issues by avoiding any sampling, but the important question of priors remains. We explore the penalised complexity (PC) prior framework for specifying informative priors for the variance parameters and the correlation parameter. PC priors facilitate model interpretation and hyperparameter specification, as expert knowledge can be incorporated intuitively. We conduct a simulation study to compare the properties and behaviour of differently defined PC priors to priors currently used in the field. The simulation study shows that the PC prior seems beneficial for the variance parameters. The use of PC priors for the correlation parameter results in more precise estimates when specified in a sensible neighbourhood around the truth. To investigate the usage of PC priors in practice, we reanalyse a meta-analysis using the telomerase marker for the diagnosis of bladder cancer and compare the results with those obtained by other commonly used modelling approaches. Copyright © 2017 John Wiley & Sons, Ltd.
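
    For a single variance parameter, the PC prior reduces to an exponential density on the standard deviation, with its rate fixed by an interpretable probability statement; a minimal sketch follows, where the values of U and alpha are arbitrary illustrations rather than recommendations.

    ```python
    # PC prior on a standard deviation: exponential with rate set from
    # the user statement P(sigma > U) = alpha.
    from math import exp, log

    def pc_prior_sd(sigma, U, alpha):
        lam = -log(alpha) / U          # rate so that P(sigma > U) = alpha
        return lam * exp(-lam * sigma)

    # e.g. encode the belief "sigma exceeds 2 with probability 0.05"
    for s in (0.1, 0.5, 1.0, 2.0):
        print(s, pc_prior_sd(s, U=2.0, alpha=0.05))
    ```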

  10. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method

    PubMed Central

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-01-01

    Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, that is, to construct the search space. Redundant regulations are then removed using the recursive optimization (RO) algorithm, thereby reducing the false positive rate. Next, the network structure profiles are decomposed without loss into a set of cliques, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results demonstrate the soundness of the algorithm design and its outstanding performance in reconstructing GRNs. PMID:29113310

  11. Uncertainty quantification for nuclear density functional theory and information content of new measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McDonnell, J. D.; Schunck, N.; Higdon, D.

    2015-03-24

    Statistical tools of uncertainty quantification can be used to assess the information content of measured observables with respect to present-day theoretical models, to estimate model errors and thereby improve predictive capability, to extrapolate beyond the regions reached by experiment, and to provide meaningful input to applications and planned measurements. To showcase new opportunities offered by such tools, we make a rigorous analysis of theoretical statistical uncertainties in nuclear density functional theory using Bayesian inference methods. By considering the recent mass measurements from the Canadian Penning Trap at Argonne National Laboratory, we demonstrate how the Bayesian analysis and a direct least-squares optimization, combined with high-performance computing, can be used to assess the information content of the new data with respect to a model based on the Skyrme energy density functional approach. Employing the posterior probability distribution computed with a Gaussian process emulator, we apply the Bayesian framework to propagate theoretical statistical uncertainties in predictions of nuclear masses, two-neutron dripline, and fission barriers. Overall, we find that the new mass measurements do not impose a constraint that is strong enough to lead to significant changes in the model parameters. As a result, the example discussed in this study sets the stage for quantifying and maximizing the impact of new measurements with respect to current modeling and guiding future experimental efforts, thus enhancing the experiment-theory cycle in the scientific method.

  12. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

    PubMed

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-10-06

    Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, that is, to construct the search space. Redundant regulations are then removed using the recursive optimization (RO) algorithm, thereby reducing the false positive rate. Next, the network structure profiles are decomposed without loss into a set of cliques, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results demonstrate the soundness of the algorithm design and its outstanding performance in reconstructing GRNs.

  13. Multilocus approach to clarify species status and the divergence history of the Bemisia tabaci (Hemiptera: Aleyrodidae) species complex.

    PubMed

    Hsieh, Chia-Hung; Ko, Chiun-Cheng; Chung, Cheng-Han; Wang, Hurng-Yi

    2014-07-01

    The sweet potato whitefly, Bemisia tabaci, is a highly differentiated species complex. Although the complex consists of several morphologically indistinguishable entities and has repeatedly invaded all continents, with important associated economic losses, its phylogenetic relationships, species status, and evolutionary history are still debated. We sequenced and analyzed one mitochondrial and three single-copy nuclear genes from 9 of the 12 genetic groups of B. tabaci and 5 closely related species. Bayesian species delimitation was applied to investigate the speciation events of B. tabaci. The species statuses of the different genetic groups were strongly supported under different prior settings and phylogenetic scenarios. Divergence histories were estimated by a multispecies coalescence approach implemented in (*)BEAST. Based on the mitochondrial locus, B. tabaci originated 6.47 million years ago (MYA); based on nuclear loci, however, the estimate was 1.25 MYA. According to the method of approximate Bayesian computation, this difference is probably due to different degrees of migration among loci; i.e., although the mitochondrial locus had differentiated, gene flow at nuclear loci was still possible, a scenario similar to a parapatric mode of speciation. This is the first study in whiteflies using multilocus data and incorporating Bayesian coalescence approaches, both of which provide a more biologically realistic framework for delimiting species status and delineating the divergence history of B. tabaci. Our study illustrates that gene flow during species divergence should not be overlooked and has a great impact on divergence time estimation. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Spatial Guilds in the Serengeti Food Web Revealed by a Bayesian Group Model

    PubMed Central

    Baskerville, Edward B.; Dobson, Andy P.; Bedford, Trevor; Allesina, Stefano; Anderson, T. Michael; Pascual, Mercedes

    2011-01-01

    Food webs, networks of feeding relationships in an ecosystem, provide fundamental insights into mechanisms that determine ecosystem stability and persistence. A standard approach in food-web analysis, and network analysis in general, has been to identify compartments, or modules, defined by many links within compartments and few links between them. This approach can identify large habitat boundaries in the network but may fail to identify other important structures. Empirical analyses of food webs have been further limited by low-resolution data for primary producers. In this paper, we present a Bayesian computational method for identifying group structure using a flexible definition that can describe both functional trophic roles and standard compartments. We apply this method to a newly compiled plant-mammal food web from the Serengeti ecosystem that includes high taxonomic resolution at the plant level, allowing a simultaneous examination of the signature of both habitat and trophic roles in network structure. We find that groups at the plant level reflect habitat structure, coupled at higher trophic levels by groups of herbivores, which are in turn coupled by carnivore groups. Thus the group structure of the Serengeti web represents a mixture of trophic guild structure and spatial pattern, in contrast to the standard compartments typically identified. The network topology supports recent ideas on spatial coupling and energy channels in ecosystems that have been proposed as important for persistence. Furthermore, our Bayesian approach provides a powerful, flexible framework for the study of network structure, and we believe it will prove instrumental in a variety of biological contexts. PMID:22219719

  15. A Bayesian Approach to More Stable Estimates of Group-Level Effects in Contextual Studies.

    PubMed

    Zitzmann, Steffen; Lüdtke, Oliver; Robitzsch, Alexander

    2015-01-01

    Multilevel analyses are often used to estimate the effects of group-level constructs. However, when using aggregated individual data (e.g., student ratings) to assess a group-level construct (e.g., classroom climate), the observed group mean might not provide a reliable measure of the unobserved latent group mean. In the present article, we propose a Bayesian approach that can be used to estimate a multilevel latent covariate model, which corrects for the unreliable assessment of the latent group mean when estimating the group-level effect. A simulation study was conducted to evaluate the choice of different priors for the group-level variance of the predictor variable and to compare the Bayesian approach with the maximum likelihood approach implemented in the software Mplus. Results showed that, under problematic conditions (i.e., small number of groups, predictor variable with a small ICC), the Bayesian approach produced more accurate estimates of the group-level effect than the maximum likelihood approach did.
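
    A toy numerical illustration of the underlying reliability problem (not the article's multilevel latent covariate model): with few ratings per group, the observed group mean is a noisy measure of the latent group mean, and a shrinkage (partial pooling) estimate typically has lower error. All numbers below are invented, and the moment-based weight is a crude stand-in for a full Bayesian treatment.

    ```python
    # Partial pooling of noisy group means toward the grand mean (illustration).
    import numpy as np

    rng = np.random.default_rng(6)
    n_groups, n_per = 20, 5
    true_means = rng.normal(0.0, 1.0, n_groups)               # latent group means
    ratings = true_means[:, None] + rng.normal(0.0, 2.0, (n_groups, n_per))

    obs_means = ratings.mean(axis=1)
    grand = obs_means.mean()
    var_between = max(obs_means.var() - 4.0 / n_per, 1e-6)    # crude moment estimate
    w = var_between / (var_between + 4.0 / n_per)             # reliability weight
    shrunk = grand + w * (obs_means - grand)

    # Shrinkage usually reduces mean squared error relative to raw group means:
    print(np.mean((obs_means - true_means) ** 2), np.mean((shrunk - true_means) ** 2))
    ```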

  16. Joint Model and Parameter Dimension Reduction for Bayesian Inversion Applied to an Ice Sheet Flow Problem

    NASA Astrophysics Data System (ADS)

    Ghattas, O.; Petra, N.; Cui, T.; Marzouk, Y.; Benjamin, P.; Willcox, K.

    2016-12-01

    Model-based projections of the dynamics of the polar ice sheets play a central role in anticipating future sea level rise. However, a number of mathematical and computational challenges pose significant barriers to improving the predictability of these models. One such challenge is caused by the unknown model parameters (e.g., in the basal boundary conditions) that must be inferred from heterogeneous observational data, leading to an ill-posed inverse problem and the need to quantify uncertainties in its solution. In this talk we discuss the problem of estimating the uncertainty in the solution of (large-scale) ice sheet inverse problems within the framework of Bayesian inference. Computing the general solution of the inverse problem (i.e., the posterior probability density) is intractable with current methods on today's computers, due to the expense of solving the forward model (3D full Stokes flow with nonlinear rheology) and the high dimensionality of the uncertain parameters (which are discretizations of the basal sliding coefficient field). To overcome these twin computational challenges, it is essential to exploit problem structure (e.g., sensitivity of the data to parameters, the smoothing property of the forward model, and correlations in the prior). To this end, we present a data-informed approach that identifies low-dimensional structure in both parameter space and the forward model state space. This approach exploits the fact that the observations inform only a low-dimensional parameter space and allows us to construct a parameter-reduced posterior. Sampling this parameter-reduced posterior still requires multiple evaluations of the forward problem; therefore, we also aim to identify a low-dimensional state space to reduce the computational cost. Accordingly, we apply a proper orthogonal decomposition (POD) approach to approximate the state using a low-dimensional manifold constructed using "snapshots" from the parameter-reduced posterior, and the discrete empirical interpolation method (DEIM) to approximate the nonlinearity in the forward problem. We show that, using only a limited number of forward solves, the resulting subspaces lead to an efficient method for exploring the high-dimensional posterior.
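
    As a small illustration of the POD step (toy dimensions, not the Stokes model): collect forward-model states as columns of a snapshot matrix, take its SVD, and keep the leading left singular vectors as a reduced basis.

    ```python
    # Minimal POD sketch: low-rank basis from snapshots via the SVD.
    import numpy as np

    rng = np.random.default_rng(0)
    # toy snapshot matrix with low-rank structure plus noise: each column is
    # one forward-model state ("snapshot") from a parameter sample
    snapshots = rng.standard_normal((1000, 5)) @ rng.standard_normal((5, 30))
    snapshots += 0.01 * rng.standard_normal(snapshots.shape)

    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 0.99)) + 1   # smallest basis with 99% energy
    basis = U[:, :r]

    x = snapshots[:, 0]                          # reduce/reconstruct one state
    x_hat = basis @ (basis.T @ x)
    print(r, np.linalg.norm(x - x_hat) / np.linalg.norm(x))
    ```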

  17. Bayesian generalized linear mixed modeling of Tuberculosis using informative priors

    PubMed Central

    Woldegerima, Woldegebriel Assefa

    2017-01-01

    TB is rated as one of the world's deadliest diseases, and South Africa ranks 9th among the 22 countries hardest hit by TB. Although much research has been carried out on this subject, this paper goes a step further by incorporating past knowledge into the model, using a Bayesian approach with an informative prior. The Bayesian approach is becoming popular in data analysis. However, most applications of Bayesian inference are limited to situations with non-informative priors, where there is no solid external information about the distribution of the parameter of interest. The main aim of this study is to profile people living with TB in South Africa. In this paper, identical regression models are fitted using the classical approach and the Bayesian approach with both non-informative and informative priors, using the South Africa General Household Survey (GHS) data for the year 2014. For the Bayesian model with an informative prior, the GHS datasets for the years 2011 to 2013 are used to set up priors for the 2014 model. PMID:28257437

  18. CytoBayesJ: software tools for Bayesian analysis of cytogenetic radiation dosimetry data.

    PubMed

    Ainsbury, Elizabeth A; Vinnikov, Volodymyr; Puig, Pedro; Maznyk, Nataliya; Rothkamm, Kai; Lloyd, David C

    2013-08-30

    A number of authors have suggested that a Bayesian approach may be most appropriate for analysis of cytogenetic radiation dosimetry data. In the Bayesian framework, probability of an event is described in terms of previous expectations and uncertainty. Previously existing, or prior, information is used in combination with experimental results to infer probabilities or the likelihood that a hypothesis is true. It has been shown that the Bayesian approach increases both the accuracy and quality assurance of radiation dose estimates. New software entitled CytoBayesJ has been developed with the aim of bringing Bayesian analysis to cytogenetic biodosimetry laboratory practice. CytoBayesJ takes a number of Bayesian or 'Bayesian like' methods that have been proposed in the literature and presents them to the user in the form of simple user-friendly tools, including testing for the most appropriate model for distribution of chromosome aberrations and calculations of posterior probability distributions. The individual tools are described in detail and relevant examples of the use of the methods and the corresponding CytoBayesJ software tools are given. In this way, the suitability of the Bayesian approach to biological radiation dosimetry is highlighted and its wider application encouraged by providing a user-friendly software interface and manual in English and Russian. Copyright © 2013 Elsevier B.V. All rights reserved.
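
    As a hedged illustration of the kind of posterior calculation such tools automate (the curve coefficients and counts below are invented, not calibrated values): a grid posterior for absorbed dose given a dicentric chromosome count, assuming a Poisson yield with a linear-quadratic dose response.

    ```python
    # Grid posterior for dose from a dicentric count (toy numbers, flat prior).
    import numpy as np

    alpha, beta, c = 0.03, 0.06, 0.001     # hypothetical yield-per-cell curve
    cells, dicentrics = 500, 25            # hypothetical scored cells and count

    dose = np.linspace(0.0, 5.0, 501)      # dose grid in Gy
    lam = cells * (c + alpha * dose + beta * dose**2)
    log_lik = dicentrics * np.log(lam) - lam      # Poisson log-likelihood (no const)
    post = np.exp(log_lik - log_lik.max())        # flat prior on the grid
    post /= np.trapz(post, dose)

    mean = np.trapz(dose * post, dose)
    print(f"posterior mean dose ~ {mean:.2f} Gy")
    ```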

  19. The soft computing-based approach to investigate allergic diseases: a systematic review.

    PubMed

    Tartarisco, Gennaro; Tonacci, Alessandro; Minciullo, Paola Lucia; Billeci, Lucia; Pioggia, Giovanni; Incorvaia, Cristoforo; Gangemi, Sebastiano

    2017-01-01

    Early recognition of inflammatory markers and their relation to asthma, adverse drug reactions, allergic rhinitis, atopic dermatitis and other allergic diseases is an important goal in allergy. The vast majority of studies in the literature are based on classic statistical methods; however, developments in computational techniques such as soft computing-based approaches hold new promise in this field. The aim of this manuscript is to systematically review the main soft computing-based techniques, such as artificial neural networks, support vector machines, Bayesian networks and fuzzy logic, and to investigate their performance in the field of allergic diseases. The review was conducted following PRISMA guidelines and the protocol was registered within the PROSPERO database (CRD42016038894). The search was performed on PubMed and ScienceDirect, covering the period from September 1, 1990 through April 19, 2016. The review included 27 studies related to allergic diseases and soft computing performance. We observed promising results, with an overall accuracy of 86.5%, mainly focused on asthmatic disease. The review reveals that soft computing-based approaches are suitable for big data analysis and can be very powerful, especially when dealing with uncertainty and poorly characterized parameters. Furthermore, they can provide valuable support in case of lack of data and entangled cause-effect relationships, which make it difficult to assess the evolution of disease. Although most works deal with asthma, we believe the soft computing approach could be a real breakthrough and foster new insights into other allergic diseases as well.

  20. Contrast enhancement in EIT imaging of the brain.

    PubMed

    Nissinen, A; Kaipio, J P; Vauhkonen, M; Kolehmainen, V

    2016-01-01

    We consider electrical impedance tomography (EIT) imaging of the brain. The brain is surrounded by the poorly conducting skull, whose conductivity is low compared to that of the brain. The skull layer causes a partial shielding effect, which leads to weak sensitivity for imaging the brain tissue. In this paper we propose a method, based on the Bayesian approximation error approach, to enhance the contrast in brain imaging. With this approach, both the (uninteresting) geometry and the conductivity of the skull are embedded in the approximation error statistics, which leads to a computationally efficient algorithm that is able to detect features such as internal haemorrhage with significantly increased sensitivity and specificity. We evaluate the approach with simulations and phantom data.

  1. Bayesian inference and decision theory - A framework for decision making in natural resource management

    USGS Publications Warehouse

    Dorazio, R.M.; Johnson, F.A.

    2003-01-01

    Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.

  2. How to interpret the results of medical time series data analysis: Classical statistical approaches versus dynamic Bayesian network modeling.

    PubMed

    Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall

    2016-01-01

    Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to the analysis of medical time series data: (1) the classical statistical approach, represented by the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three models (Kaplan-Meier, Cox regression, and the dynamic Bayesian network). Our analysis also discusses several other aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach (1) is much more flexible in terms of modeling effort and (2) offers individualized risk assessment, which is more cumbersome for classical statistical approaches.

  3. Adversarial risk analysis with incomplete information: a level-k approach.

    PubMed

    Rothschild, Casey; McLay, Laura; Guikema, Seth

    2012-07-01

    This article proposes, develops, and illustrates the application of level-k game theory to adversarial risk analysis. Level-k reasoning, which assumes that players play strategically but have bounded rationality, is useful for operationalizing a Bayesian approach to adversarial risk analysis. It can be applied in a broad class of settings, including settings with asynchronous play and partial but incomplete revelation of early moves. Its computational and elicitation requirements are modest. We illustrate the approach with an application to a simple defend-attack model in which the defender's countermeasures are revealed with a probability less than one to the attacker before he decides on how or whether to attack. © 2011 Society for Risk Analysis.
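
    A minimal sketch of level-k play in a matrix game (invented payoffs, not the article's defend-attack model): level-0 is modeled as uniform randomization, and each higher level best responds to the level below.

    ```python
    # Level-k reasoning in a two-player zero-sum matrix game (illustration only).
    import numpy as np

    row_pay = np.array([[4.0, 0.0], [1.0, 2.0]])   # row's payoff[row action, col action]
    col_pay = -row_pay.T                            # col's payoff[col action, row action]

    def level_k(k, my_pay, their_pay):
        """Return a level-k player's strategy as a probability vector."""
        n = my_pay.shape[0]
        if k == 0:
            return np.full(n, 1.0 / n)              # level-0: uniform randomization
        their_dist = level_k(k - 1, their_pay, my_pay)
        best = int(np.argmax(my_pay @ their_dist))  # best response to level-(k-1)
        return np.eye(n)[best]

    for k in (1, 2, 3):
        print(k, level_k(k, row_pay, col_pay))
    ```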

  4. Approaches in highly parameterized inversion: bgaPEST, a Bayesian geostatistical approach implementation with PEST: documentation and instructions

    USGS Publications Warehouse

    Fienen, Michael N.; D'Oria, Marco; Doherty, John E.; Hunt, Randall J.

    2013-01-01

    The application bgaPEST is a highly parameterized inversion software package implementing the Bayesian Geostatistical Approach in a framework compatible with the parameter estimation suite PEST. Highly parameterized inversion refers to cases in which parameters are distributed in space or time and are correlated with one another. The Bayesian aspect of bgaPEST is related to Bayesian probability theory, in which prior information about parameters is formally revised on the basis of the calibration dataset used for the inversion. Conceptually, this approach formalizes the conditionality of estimated parameters on the specific data and model available. The geostatistical component of the method refers to the way in which prior information about the parameters is used: a geostatistical autocorrelation function enforces structure on the parameters to avoid overfitting and unrealistic results. The Bayesian Geostatistical Approach is designed to provide the smoothest solution that is consistent with the data. Optionally, users can specify a level of fit or estimate a balance between fit and model complexity informed by the data. Groundwater and surface-water applications are used as examples in this text, but the possible uses of bgaPEST extend to any distributed-parameter application.

  5. The Bayesian approach to reporting GSR analysis results: some first-hand experiences

    NASA Astrophysics Data System (ADS)

    Charles, Sebastien; Nys, Bart

    2010-06-01

    The use of Bayesian principles in the reporting of forensic findings has been a matter of interest for some years. Recently, the GSR community, too, has begun to explore the advantages of this approach for writing reports. Since last year, our GSR group has been adapting its reporting procedures to the use of Bayesian principles. The police and magistrates find the reports more directly accessible and useful in their part of the criminal investigation. In the lab, we find that applying Bayesian principles eliminates unnecessary analyses and thus frees up time on the instruments.

  6. Bayesian statistics in medicine: a 25 year review.

    PubMed

    Ashby, Deborah

    2006-11-15

    This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.

  7. Bayesian calibration of coarse-grained forces: Efficiently addressing transferability

    NASA Astrophysics Data System (ADS)

    Patrone, Paul N.; Rosch, Thomas W.; Phelan, Frederick R.

    2016-04-01

    Generating and calibrating forces that are transferable across a range of state-points remains a challenging task in coarse-grained (CG) molecular dynamics. In this work, we present a coarse-graining workflow, inspired by ideas from uncertainty quantification and numerical analysis, to address this problem. The key idea behind our approach is to introduce a Bayesian correction algorithm that uses functional derivatives of CG simulations to rapidly and inexpensively recalibrate initial estimates f0 of forces anchored by standard methods such as force-matching. Taking density-temperature relationships as a running example, we demonstrate that this algorithm, in concert with various interpolation schemes, can be used to efficiently compute physically reasonable force curves on a fine grid of state-points. Importantly, we show that our workflow is robust to several choices available to the modeler, including the interpolation schemes and tools used to construct f0. In a related vein, we also demonstrate that our approach can speed up coarse-graining by reducing the number of atomistic simulations needed as inputs to standard methods for generating CG forces.

  8. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

    PubMed

    Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

    2011-01-01

    Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets come the inherent challenges of new methods of statistical analysis and modeling. Considering that a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and with Bayesian logistic regression with stochastic search variable selection.

  9. Estimating the probability for major gene Alzheimer disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Farrer, L.A.; Cupples, L.A.

    1994-02-01

    Alzheimer disease (AD) is a neuropsychiatric illness caused by multiple etiologies. Prediction of whether AD is genetically based in a given family is problematic because of censoring bias among unaffected relatives as a consequence of the late onset of the disorder, diagnostic uncertainties, heterogeneity, and limited information in a single family. The authors have developed a method based on Bayesian probability to compute values for a continuous variable that ranks AD families as having a major gene form of AD (MGAD). In addition, they have compared the Bayesian method with a maximum-likelihood approach. These methods incorporate sex- and age-adjusted risk estimates and allow for phenocopies and familial clustering of age of onset. Agreement is high between the two approaches for ranking families as MGAD (Spearman rank [r] = .92). When either method is used, the numerical outcomes are sensitive to assumptions about the gene frequency and the cumulative incidence of the disease in the population. Consequently, risk estimates should be used cautiously for counseling purposes; however, there are numerous valid applications of these procedures in genetic and epidemiological studies. 41 refs., 4 figs., 3 tabs.

  10. Bayesian network interface for assisting radiology interpretation and education

    NASA Astrophysics Data System (ADS)

    Duda, Jeffrey; Botzolakis, Emmanuel; Chen, Po-Hao; Mohan, Suyash; Nasrallah, Ilya; Rauschecker, Andreas; Rudie, Jeffrey; Bryan, R. Nick; Gee, James; Cook, Tessa

    2018-03-01

    In this work, we present the use of Bayesian networks for radiologist decision support during clinical interpretation. This computational approach has the advantage of avoiding incorrect diagnoses that result from known human cognitive biases such as anchoring bias, framing effect, availability bias, and premature closure. To integrate Bayesian networks into clinical practice, we developed an open-source web application that provides diagnostic support for a variety of radiology disease entities (e.g., basal ganglia diseases, bone lesions). The Clinical tool presents the user with a set of buttons representing clinical and imaging features of interest. These buttons are used to set the value for each observed feature. As features are identified, the conditional probabilities for each possible diagnosis are updated in real time. Additionally, using sensitivity analysis, the interface may be set to inform the user which remaining imaging features provide maximum discriminatory information to choose the most likely diagnosis. The Case Submission tools allow the user to submit a validated case and the associated imaging features to a database, which can then be used for future tuning/testing of the Bayesian networks. These submitted cases are then reviewed by an assigned expert using the provided QC tool. The Research tool presents users with cases with previously labeled features and a chosen diagnosis, for the purpose of performance evaluation. Similarly, the Education page presents cases with known features, but provides real time feedback on feature selection.
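
    A simplified sketch of the core update such an interface performs (a naive-Bayes factorization stands in for the full Bayesian network, and every probability below is invented): each observed feature multiplies the prior over diagnoses by its likelihood, and renormalizing gives the running posterior.

    ```python
    # Feature-by-feature posterior update over diagnoses (illustration only).
    import numpy as np

    diagnoses = ["disease_A", "disease_B", "disease_C"]
    prior = np.array([0.5, 0.3, 0.2])
    # p_feature[f][d] = P(feature f present | diagnosis d), invented values
    p_feature = {
        "calcification": np.array([0.8, 0.2, 0.5]),
        "ring_enhancement": np.array([0.1, 0.7, 0.4]),
    }

    def update(prior, observed):
        """observed: dict feature -> bool. Returns posterior over diagnoses."""
        post = prior.copy()
        for f, present in observed.items():
            lik = p_feature[f] if present else 1.0 - p_feature[f]
            post *= lik
        return post / post.sum()

    print(update(prior, {"calcification": True}))
    print(update(prior, {"calcification": True, "ring_enhancement": False}))
    ```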

  11. Markov Chain Monte Carlo Inference of Parametric Dictionaries for Sparse Bayesian Approximations

    PubMed Central

    Chaspari, Theodora; Tsiartas, Andreas; Tsilifis, Panagiotis; Narayanan, Shrikanth

    2016-01-01

    Parametric dictionaries can increase the ability of sparse representations to meaningfully capture and interpret the underlying signal information, such as encountered in biomedical problems. Given a mapping function from the atom parameter space to the actual atoms, we propose a sparse Bayesian framework for learning the atom parameters, because of its ability to provide full posterior estimates, take uncertainty into account and generalize to unseen data. Inference is performed with Markov chain Monte Carlo, which uses block sampling to generate the variables of the Bayesian problem. Since the parameterization of dictionary atoms results in posteriors that cannot be analytically computed, we use a Metropolis-Hastings-within-Gibbs framework, according to which variables with closed-form posteriors are generated with the Gibbs sampler, while the remaining ones are generated with Metropolis-Hastings steps from appropriate candidate-generating densities. We further show that the corresponding Markov chain is uniformly ergodic, ensuring its convergence to a stationary distribution independently of the initial state. Results on synthetic data and real biomedical signals indicate that our approach offers advantages in terms of signal reconstruction compared to previously proposed Steepest Descent and Equiangular Tight Frame methods. This paper demonstrates the ability of Bayesian learning to generate parametric dictionaries that can reliably represent the exemplar data and provides the foundation towards inferring the entire variable set of the sparse approximation problem for signal denoising, adaptation and other applications. PMID:28649173
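
    The Metropolis-Hastings-within-Gibbs pattern itself is easy to show on a toy joint density (a generic sketch, not the dictionary-learning posterior): one coordinate has a closed-form Gaussian conditional and is Gibbs-sampled, while the other is updated by a random-walk Metropolis step.

    ```python
    # Metropolis-within-Gibbs on p(theta, z) ~ exp(-0.5*(z-theta)^2 - 0.1*theta^4).
    import numpy as np

    rng = np.random.default_rng(1)

    def log_cond_theta(theta, z):
        # unnormalized log conditional for the coordinate lacking a closed form
        return -0.5 * (theta - z) ** 2 - 0.1 * theta**4

    samples, theta, z = [], 0.0, 0.0
    for _ in range(5000):
        # Gibbs step: z | theta is Gaussian in this toy model
        z = rng.normal(loc=theta, scale=1.0)
        # Metropolis step: symmetric random-walk proposal for theta | z
        prop = theta + rng.normal(scale=0.8)
        if np.log(rng.uniform()) < log_cond_theta(prop, z) - log_cond_theta(theta, z):
            theta = prop
        samples.append((theta, z))

    print(np.mean([t for t, _ in samples[1000:]]))   # posterior mean estimate
    ```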

  12. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R.; Buenrostro-Mariscal, Raymundo

    2017-01-01

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning and, by approximating the probability distributions through optimization, tends to be faster than Markov chain Monte Carlo methods. For this reason, in this paper, we propose a new variational Bayes version of the Bayesian genomic model with G×E, using half-t priors on each standard deviation (SD) term to guarantee highly noninformative priors and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new variational Bayes approximation, and compared its predictions and implementation time with those of a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. PMID:28391241

  13. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Montesinos-López, José Cricelio; Luna-Vázquez, Francisco Javier; Salinas-Ruiz, Josafhat; Herrera-Morales, José R; Buenrostro-Mariscal, Raymundo

    2017-06-07

    There are Bayesian and non-Bayesian genomic models that take into account G×E interactions. However, the computational cost of implementing Bayesian models is high, and becomes almost impossible when the number of genotypes, environments, and traits is very large, while, in non-Bayesian models, there are often important and unsolved convergence problems. The variational Bayes method is popular in machine learning and, by approximating the probability distributions through optimization, tends to be faster than Markov chain Monte Carlo methods. For this reason, in this paper, we propose a new variational Bayes version of the Bayesian genomic model with G×E, using half-t priors on each standard deviation (SD) term to guarantee highly noninformative priors and posterior inferences that are not sensitive to the choice of hyper-parameters. We show the complete theoretical derivation of the full conditional and variational posterior distributions, and their implementations. We used eight experimental genomic maize and wheat data sets to illustrate the new variational Bayes approximation, and compared its predictions and implementation time with those of a standard Bayesian genomic model with G×E. Results indicated that prediction accuracies are slightly higher in the standard Bayesian model with G×E than in its variational counterpart, but, in terms of computation time, the variational Bayes genomic model with G×E is, in general, 10 times faster than the conventional Bayesian genomic model with G×E. For this reason, the proposed model may be a useful tool for researchers who need to predict and select genotypes in several environments. Copyright © 2017 Montesinos-López et al.
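
    To show the kind of optimization that replaces sampling, here is the standard textbook coordinate-ascent variational inference (CAVI) for a toy normal-gamma model with unknown mean and precision; it is a generic illustration under a mean-field factorization, not the genomic G×E model itself.

    ```python
    # CAVI for x_i ~ N(mu, 1/tau), mu ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0).
    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(loc=3.0, scale=1.5, size=200)
    N, xbar = x.size, x.mean()

    mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0        # prior hyperparameters
    E_tau = a0 / b0                                # initialize E[precision]
    for _ in range(50):                            # coordinate-ascent updates
        # q(mu) = N(mu_N, 1/lam_N)
        mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
        lam_N = (lam0 + N) * E_tau
        var_mu = 1.0 / lam_N
        # q(tau) = Gamma(a_N, b_N)
        a_N = a0 + 0.5 * (N + 1)
        b_N = b0 + 0.5 * (np.sum((x - mu_N) ** 2) + N * var_mu
                          + lam0 * ((mu_N - mu0) ** 2 + var_mu))
        E_tau = a_N / b_N

    print(f"posterior mean ~ {mu_N:.3f}, E[precision] ~ {E_tau:.3f}")
    ```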

  14. Bayesian Exploratory Factor Analysis

    PubMed Central

    Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James J.; Piatek, Rémi

    2014-01-01

    This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study confirms the validity of the approach. The method is used to produce interpretable low dimensional aggregates from a high dimensional set of psychological measurements. PMID:25431517

  15. Bayesian Decision Support

    NASA Astrophysics Data System (ADS)

    Berliner, M.

    2017-12-01

    Bayesian statistical decision theory offers a natural framework for decision-policy making in the presence of uncertainty. Key advantages of the approach include efficient incorporation of information and observations. However, in complicated settings it is very difficult, perhaps essentially impossible, to formalize the mathematical inputs needed in the approach. Nevertheless, using the approach as a template is useful for decision support; that is, for organizing and communicating our analyses. Bayesian hierarchical modeling is valuable for quantifying and managing uncertainty in such cases. I review some aspects of the idea, emphasizing statistical model development and use in the context of sea-level rise.

  16. Is probabilistic bias analysis approximately Bayesian?

    PubMed Central

    MacLehose, Richard F.; Gustafson, Paul

    2011-01-01

    Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well. PMID:22157311
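
    The iterative sampling the abstract refers to is easy to sketch (invented 2x2 counts and prior ranges; nondifferential misclassification is assumed for brevity, although the paper's focus is the differential case): draw sensitivity and specificity from priors, back-correct the observed table, and accumulate the corrected odds ratios.

    ```python
    # Probabilistic bias analysis for exposure misclassification (illustration).
    import numpy as np

    rng = np.random.default_rng(3)
    # observed (misclassified) exposed/unexposed counts among cases and controls
    a, b, c, d = 120.0, 80.0, 90.0, 110.0

    ors = []
    for _ in range(10000):
        se = rng.uniform(0.75, 0.95)     # prior for sensitivity
        sp = rng.uniform(0.85, 0.99)     # prior for specificity
        # corrected counts from: observed = Se*true_pos + (1 - Sp)*true_neg
        A = (a - (1 - sp) * (a + b)) / (se + sp - 1)
        C = (c - (1 - sp) * (c + d)) / (se + sp - 1)
        B, D = (a + b) - A, (c + d) - C
        if min(A, B, C, D) > 0:          # keep only admissible corrections
            ors.append((A * D) / (B * C))

    print(np.percentile(ors, [2.5, 50, 97.5]))   # corrected OR distribution
    ```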

  17. A novel approach for choosing summary statistics in approximate Bayesian computation.

    PubMed

    Aeschbacher, Simon; Beaumont, Mark A; Futschik, Andreas

    2012-11-01

    The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ_anc = 4N_e u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L2-loss performs best. Applying that method to the ibex data, we estimate θ_anc ≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10^-4 and 3.5 × 10^-3 per locus per generation. The proportion of males with access to matings is estimated as ω ≈ 0.21, which is in good agreement with recent independent estimates.
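
    For readers unfamiliar with the ABC mechanics described above, a bare-bones rejection sampler looks like this; a Gaussian toy model, a placeholder tolerance, and ad hoc summary statistics stand in for the demographic model and the boosting-chosen summaries.

    ```python
    # ABC rejection: keep prior draws whose simulated summaries match the data.
    import numpy as np

    rng = np.random.default_rng(4)
    obs = rng.normal(loc=2.0, scale=1.0, size=100)
    s_obs = np.array([obs.mean(), obs.std()])      # chosen summary statistics

    accepted = []
    for _ in range(50000):
        theta = rng.uniform(-5.0, 5.0)             # draw from the prior
        sim = rng.normal(loc=theta, scale=1.0, size=100)
        s_sim = np.array([sim.mean(), sim.std()])
        if np.linalg.norm(s_sim - s_obs) < 0.2:    # tolerance on summary distance
            accepted.append(theta)

    print(len(accepted), np.mean(accepted))        # approximate posterior sample
    ```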

  18. Adaptive MCMC in Bayesian phylogenetics: an application to analyzing partitioned data in BEAST.

    PubMed

    Baele, Guy; Lemey, Philippe; Rambaut, Andrew; Suchard, Marc A

    2017-06-15

    Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, in both the central processing unit (CPU) and graphics processing unit (GPU) markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. Here we propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves by >14-fold. Our implementation is part of the BEAST code base, a widely used open-source software package to perform Bayesian phylogenetic inference. guy.baele@kuleuven.be. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
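
    A minimal sketch of an adaptive multivariate transition kernel in the spirit described here (Haario-style adaptive Metropolis on a toy two-parameter target; this is not the BEAST implementation):

        import numpy as np

        rng = np.random.default_rng(2)

        def log_target(x):
            # Stand-in for a partitioned-model log posterior: correlated Gaussian.
            cov = np.array([[1.0, 0.9], [0.9, 1.0]])
            return -0.5 * x @ np.linalg.solve(cov, x)

        # Proposal covariance is learned from the chain's own history, so the
        # kernel adapts to correlations between parameters.
        d, n_steps = 2, 5000
        x = np.zeros(d)
        chain = np.empty((n_steps, d))
        eps = 1e-6 * np.eye(d)

        for t in range(n_steps):
            if t < 500:
                prop_cov = 0.1 * np.eye(d)                 # fixed warm-up kernel
            else:
                prop_cov = (2.38**2 / d) * np.cov(chain[:t].T) + eps
            prop = rng.multivariate_normal(x, prop_cov)
            if np.log(rng.uniform()) < log_target(prop) - log_target(x):
                x = prop
            chain[t] = x

        print(chain[1000:].mean(axis=0), np.corrcoef(chain[1000:].T)[0, 1])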

  19. A Novel Approach for Choosing Summary Statistics in Approximate Bayesian Computation

    PubMed Central

    Aeschbacher, Simon; Beaumont, Mark A.; Futschik, Andreas

    2012-01-01

    The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ_anc = 4N_e·u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L2-loss performs best. Applying that method to the ibex data, we estimate θ̂_anc ≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10⁻⁴ and 3.5 × 10⁻³ per locus per generation. The proportion of males with access to matings is estimated as ω̂ ≈ 0.21, which is in good agreement with recent independent estimates. PMID:22960215

  20. Quantifying the uncertainty in heritability.

    PubMed

    Furlotte, Nicholas A; Heckerman, David; Lippert, Christoph

    2014-05-01

    The use of mixed models to determine narrow-sense heritability and related quantities such as SNP heritability has received much recent attention. Less attention has been paid to the inherent variability in these estimates. One approach for quantifying variability in estimates of heritability is a frequentist approach, in which heritability is estimated using maximum likelihood and its variance is quantified through an asymptotic normal approximation. An alternative approach is to quantify the uncertainty in heritability through its Bayesian posterior distribution. In this paper, we develop the latter approach, make it computationally efficient and compare it to the frequentist approach. We show theoretically that, for a sufficiently large sample size and intermediate values of heritability, the two approaches provide similar results. Using the Atherosclerosis Risk in Communities cohort, we show empirically that the two approaches can give different results and that the variance/uncertainty can remain large.
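
    The Bayesian alternative the authors develop can be illustrated, under strong simplifications, by a grid posterior over heritability in a toy mixed model; the eigendecomposition trick below is a standard device for cheap likelihood evaluations and is not taken from the paper:

        import numpy as np

        rng = np.random.default_rng(3)

        # Toy kinship matrix and phenotype (illustrative only).
        n = 200
        Z = rng.normal(size=(n, 500))
        K = Z @ Z.T / 500                     # genomic relationship matrix
        g = rng.multivariate_normal(np.zeros(n), 0.6 * K)
        y = g + rng.normal(scale=np.sqrt(0.4), size=n)
        y = y - y.mean()

        # Flat prior on h2; total variance profiled at its conditional MLE.
        # After one O(n^3) eigendecomposition, each grid point is O(n).
        evals, evecs = np.linalg.eigh(K)
        yt = evecs.T @ y
        grid = np.linspace(0.01, 0.99, 99)
        loglik = np.empty_like(grid)
        for i, h2 in enumerate(grid):
            v = h2 * evals + (1 - h2)         # eigenvalues of h2*K + (1-h2)*I
            s2 = np.mean(yt**2 / v)           # profiled total variance
            loglik[i] = -0.5 * (np.log(v).sum() + n * np.log(s2))
        post = np.exp(loglik - loglik.max())
        post /= post.sum()
        print(grid[np.argmax(post)], (grid * post).sum())   # mode and mean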

  1. The Psychology of Bayesian Reasoning

    DTIC Science & Technology

    2014-10-21

    The psychology of Bayesian reasoning David R. Mandel* Socio-Cognitive Systems Section, Defence Research and Development Canada and Department...belief revision, subjective probability, human judgment, psychological methods. Most psychological research on Bayesian reasoning since the 1970s has...attention to some important problems with the conventional approach to studying Bayesian reasoning in psychology that has been dominant since the

  2. Bayesian Just-So Stories in Psychology and Neuroscience

    ERIC Educational Resources Information Center

    Bowers, Jeffrey S.; Davis, Colin J.

    2012-01-01

    According to Bayesian theories in psychology and neuroscience, minds and brains are (near) optimal in solving a wide range of tasks. We challenge this view and argue that more traditional, non-Bayesian approaches are more promising. We make 3 main arguments. First, we show that the empirical evidence for Bayesian theories in psychology is weak.…

  3. Teaching Bayesian Statistics in a Health Research Methodology Program

    ERIC Educational Resources Information Center

    Pullenayegum, Eleanor M.; Thabane, Lehana

    2009-01-01

    Despite the appeal of Bayesian methods in health research, they are not widely used. This is partly due to a lack of courses in Bayesian methods at an appropriate level for non-statisticians in health research. Teaching such a course can be challenging because most statisticians have been taught Bayesian methods using a mathematical approach, and…

  4. Efficient Bayesian hierarchical functional data analysis with basis function approximations using Gaussian-Wishart processes.

    PubMed

    Yang, Jingjing; Cox, Dennis D; Lee, Jong Soo; Ren, Peng; Choi, Taeryon

    2017-12-01

    Functional data are defined as realizations of random functions (mostly smooth functions) varying over a continuum, which are usually collected on discretized grids with measurement errors. In order to accurately smooth noisy functional observations and deal with the issue of high-dimensional observation grids, we propose a novel Bayesian method based on the Bayesian hierarchical model with a Gaussian-Wishart process prior and basis function representations. We first derive an induced model for the basis-function coefficients of the functional data, and then use this model to conduct posterior inference through Markov chain Monte Carlo methods. Compared to standard Bayesian inference, which suffers from a serious computational burden and instability in analyzing high-dimensional functional data, our method greatly improves computational scalability and stability, while inheriting the advantage of simultaneously smoothing raw observations and estimating the mean-covariance functions in a nonparametric way. In addition, our method can naturally handle functional data observed on random or uncommon grids. Simulation and real-data studies demonstrate that our method produces similar results to those obtainable by standard Bayesian inference with low-dimensional common grids, while efficiently smoothing and estimating functional data with random and high-dimensional observation grids when standard Bayesian inference fails. In conclusion, our method can efficiently smooth and estimate high-dimensional functional data, providing one way to resolve the curse of dimensionality for Bayesian functional data analysis with Gaussian-Wishart processes. © 2017, The International Biometric Society.

  5. One dimensional models of temperature and composition in the transition zone from a bayesian inversion of surface waves

    NASA Astrophysics Data System (ADS)

    Drilleau, M.; Beucler, E.; Mocquet, A.; Verhoeven, O.; Burgos, G.; Capdeville, Y.; Montagner, J.

    2011-12-01

    The transition zone plays a key role in the dynamics of the Earth's mantle, especially for the exchanges between the upper and the lower mantle. Phase transitions, convective motions, and hot upwelling and/or cold downwelling materials may make the 400 to 1000 km depth range very anisotropic and heterogeneous, both thermally and chemically. A classical procedure to infer the thermal state and the composition is to interpret 3D velocity perturbation models in terms of temperature and mineralogical composition, with respect to a global 1D model. However, the strength of heterogeneity and anisotropy can be so high that the concept of a one-dimensional reference seismic model might be called into question for this depth range. Some recent studies prefer to directly invert seismic travel times and normal-mode catalogues in terms of temperature and composition. A Bayesian approach allows one to go beyond the classical computation of the best-fit model by providing a quantitative measure of model uncertainty. We implement a nonlinear inverse approach (Markov chain Monte Carlo) to interpret seismic data in terms of temperature, anisotropy and composition. Two different data sets are used and compared: surface-wave waveforms and phase velocities (fundamental mode and the first overtones). A guideline of this method is to let the resolution power of the data govern the spatial resolution of the model. Up to now, the model parameters are the temperature field and the mineralogical composition; other important effects, such as macroscopic anisotropy, will be taken into account in the near future. In order to reduce the computing time of the Monte Carlo procedure, polynomial Bézier curves are used for the parameterization. This choice allows for smoothly varying models and first-order discontinuities. Our Bayesian algorithm is tested with standard circular synthetic experiments and with more realistic simulations including 3D wave propagation effects (SEM). The test results demonstrate the ability of this approach to match three-component waveforms and address the question of the mean radial interpretation of a 3D model. The method is also tested using real datasets, such as along the Vanuatu-California path.

  6. Uncertainty Quantification given Discontinuous Climate Model Response and a Limited Number of Model Runs

    NASA Astrophysics Data System (ADS)

    Sargsyan, K.; Safta, C.; Debusschere, B.; Najm, H.

    2010-12-01

    Uncertainty quantification in complex climate models is challenged by the sparsity of available climate model predictions due to the high computational cost of model runs. Another feature that prevents classical uncertainty analysis from being readily applicable is bifurcative behavior in climate model response with respect to certain input parameters. A typical example is the Atlantic Meridional Overturning Circulation. The predicted maximum overturning stream function exhibits discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. We outline a methodology for uncertainty quantification given discontinuous model response and a limited number of model runs. Our approach is two-fold. First we detect the discontinuity with Bayesian inference, thus obtaining a probabilistic representation of the discontinuity curve shape and location for arbitrarily distributed input parameter values. Then, we construct spectral representations of uncertainty, using Polynomial Chaos (PC) expansions on either side of the discontinuity curve, leading to an averaged-PC representation of the forward model that allows efficient uncertainty quantification. The approach is enabled by a Rosenblatt transformation that maps each side of the discontinuity to regular domains where desirable orthogonality properties for the spectral bases hold. We obtain PC modes by either orthogonal projection or Bayesian inference, and argue for a hybrid approach that targets a balance between the accuracy provided by the orthogonal projection and the flexibility provided by the Bayesian inference - where the latter allows obtaining reasonable expansions without extra forward model runs. The model output, and its associated uncertainty at specific design points, are then computed by taking an ensemble average over PC expansions corresponding to possible realizations of the discontinuity curve. The methodology is tested on synthetic examples of discontinuous model data with adjustable sharpness and structure. This work was supported by the Sandia National Laboratories Seniors’ Council LDRD (Laboratory Directed Research and Development) program. Sandia National Laboratories is a multi-program laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Company, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

  7. A Sparse Bayesian Approach for Forward-Looking Superresolution Radar Imaging

    PubMed Central

    Zhang, Yin; Zhang, Yongchao; Huang, Yulin; Yang, Jianyu

    2017-01-01

    This paper presents a sparse superresolution approach for high cross-range resolution imaging of forward-looking scanning radar based on the Bayesian criterion. First, a novel forward-looking signal model is established as the product of the measurement matrix and the cross-range target distribution, which is more accurate than the conventional convolution model. Then, based on the Bayesian criterion, the widely-used sparse regularization is considered as the penalty term to recover the target distribution. The derivation of the cost function is described, and finally, an iterative expression for minimizing this function is presented. In addition, this paper discusses how to estimate the single parameter of the Gaussian noise. With the advantage of a more accurate model, the proposed sparse Bayesian approach enjoys a lower model error. Meanwhile, when compared with conventional superresolution methods, the proposed approach shows high cross-range resolution and small location error. The superresolution results for the simulated point target, scene data, and real measured data are presented to demonstrate the superior performance of the proposed approach. PMID:28604583

  8. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    DTIC Science & Technology

    2016-01-05

    The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of...which adds specificity to the model and can make nonlinear data more manageable. Early results show that the...

  9. Bayesian Inference for Functional Dynamics Exploring in fMRI Data.

    PubMed

    Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing

    2016-01-01

    This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
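
    A minimal sketch of the "magnitude change point" idea on a synthetic series (a single mean shift, known variance, flat priors; far simpler than the BMCPM, BCCPM and DBVPM models reviewed above):

        import numpy as np

        rng = np.random.default_rng(4)

        # Noisy series with one mean shift; posterior over the shift location
        # with segment means integrated out analytically under flat priors.
        n, tau_true = 120, 70
        y = np.concatenate([rng.normal(0.0, 1.0, tau_true),
                            rng.normal(1.2, 1.0, n - tau_true)])

        def seg_loglik(seg):
            # Marginal log-likelihood of one segment with its mean integrated
            # out under a flat prior: -RSS/2 - log(m)/2 up to a constant.
            m = len(seg)
            return -0.5 * ((seg - seg.mean())**2).sum() - 0.5 * np.log(m)

        logpost = np.array([seg_loglik(y[:t]) + seg_loglik(y[t:])
                            for t in range(2, n - 1)])
        post = np.exp(logpost - logpost.max())
        post /= post.sum()
        print(2 + np.argmax(post))        # most probable change location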

  10. Sampling-free Bayesian inversion with adaptive hierarchical tensor representations

    NASA Astrophysics Data System (ADS)

    Eigel, Martin; Marschall, Manuel; Schneider, Reinhold

    2018-03-01

    A sampling-free approach to Bayesian inversion with an explicit polynomial representation of the parameter densities is developed, based on an affine-parametric representation of a linear forward model. This becomes feasible due to the complete treatment in function spaces, which requires an efficient model reduction technique for numerical computations. The advocated perspective yields the crucial benefit that error bounds can be derived for all occurring approximations, leading to provable convergence subject to the discretization parameters. Moreover, it enables a fully adaptive a posteriori control with automatic problem-dependent adjustments of the employed discretizations. The method is discussed in the context of modern hierarchical tensor representations, which are used for the evaluation of a random PDE (the forward model) and the subsequent high-dimensional quadrature of the log-likelihood, alleviating the ‘curse of dimensionality’. Numerical experiments demonstrate the performance and confirm the theoretical results.

  11. Inference of missing data and chemical model parameters using experimental statistics

    NASA Astrophysics Data System (ADS)

    Casey, Tiernan; Najm, Habib

    2017-11-01

    A method for determining the joint parameter density of Arrhenius rate expressions through the inference of missing experimental data is presented. This approach proposes noisy hypothetical data sets from target experiments and accepts those which agree with the reported statistics, in the form of nominal parameter values and their associated uncertainties. The data exploration procedure is formalized using Bayesian inference, employing maximum entropy and approximate Bayesian computation methods to arrive at a joint density on data and parameters. The method is demonstrated in the context of reactions in the H2-O2 system for predictive modeling of combustion systems of interest. Work supported by the US DOE BES CSGB. Sandia National Labs is a multimission lab managed and operated by Nat. Technology and Eng'g Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell Intl, for the US DOE NCSA under contract DE-NA-0003525.

  12. Tracking student progress in a game-like physics learning environment with a Monte Carlo Bayesian knowledge tracing model

    NASA Astrophysics Data System (ADS)

    Gweon, Gey-Hong; Lee, Hee-Sun; Dorsey, Chad; Tinker, Robert; Finzer, William; Damelin, Daniel

    2015-03-01

    In tracking student learning in online learning systems, the Bayesian knowledge tracing (BKT) model is a popular model. However, the model has well-known problems, such as the identifiability problem and the empirical degeneracy problem. Understanding of these problems remains limited, and solutions to them remain subjective. Here, we analyze the log data from an online physics learning program with our new model, a Monte Carlo BKT model. With this approach, we are able to perform a completely unbiased analysis, which can then be used for classifying student learning patterns and performances. Furthermore, a theoretical analysis of the BKT model and our computational work shed new light on the nature of the aforementioned problems. This material is based upon work supported by the National Science Foundation under Grant REC-1147621 and REC-1435470.
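
    For concreteness, here is the standard BKT forward update (the textbook recursion with illustrative parameter values; this is not the authors' Monte Carlo estimation scheme):

        # Minimal forward update for standard Bayesian knowledge tracing (BKT).
        p_init, p_learn, p_guess, p_slip = 0.2, 0.15, 0.25, 0.1

        def bkt_update(p_known, correct):
            """Posterior that the skill is known after one response, then transit."""
            if correct:
                like_known, like_unknown = 1 - p_slip, p_guess
            else:
                like_known, like_unknown = p_slip, 1 - p_guess
            post = p_known * like_known / (
                p_known * like_known + (1 - p_known) * like_unknown)
            return post + (1 - post) * p_learn   # learning between items

        p = p_init
        for correct in [0, 1, 1, 0, 1, 1, 1]:    # a toy response sequence
            p = bkt_update(p, correct)
            print(round(p, 3))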

  13. A Bayesian Approach for Image Segmentation with Shape Priors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, Hang; Yang, Qing; Parvin, Bahram

    2008-06-20

    Color and texture have been widely used in image segmentation; however, their performance is often hindered by scene ambiguities, overlapping objects, or missing parts. In this paper, we propose an interactive image segmentation approach with shape prior models within a Bayesian framework. Interactive features, through mouse strokes, reduce ambiguities, and the incorporation of shape priors enhances the quality of the segmentation where color and/or texture are not solely adequate. The novelties of our approach are in (i) formulating the segmentation problem in a well-defined Bayesian framework with multiple shape priors, (ii) efficiently estimating parameters of the Bayesian model, and (iii) multi-object segmentation through user-specified priors. We demonstrate the effectiveness of our method on a set of natural and synthetic images.

  14. Editorial: Bayesian benefits for child psychology and psychiatry researchers.

    PubMed

    Oldehinkel, Albertine J

    2016-09-01

    For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has placed it well within reach now. Because Bayesian analyses require a more refined definition of hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparking the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.

  15. cosmoabc: Likelihood-free inference for cosmology

    NASA Astrophysics Data System (ADS)

    Ishida, Emille E. O.; Vitenti, Sandro D. P.; Penna-Lima, Mariana; Trindade, Arlindo M.; Cisewski, Jessi; de Souza, Rafael; Cameron, Ewan; Busti, Vinicius C.

    2015-05-01

    Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogs. cosmoabc is a Python Approximate Bayesian Computation (ABC) sampler featuring a Population Monte Carlo variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code can be coupled to an external simulator to allow incorporation of arbitrary distance and prior functions. When coupled with the numcosmo library, it has been used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy clusters number counts without computing the likelihood function.
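
    The core loop that such samplers build on can be sketched generically (this is plain rejection ABC in numpy, not the cosmoabc API or its Population Monte Carlo variant):

        import numpy as np

        rng = np.random.default_rng(5)

        # "Observed catalog" and its summary statistic.
        obs = rng.poisson(4.0, size=100)
        obs_summary = obs.mean()

        def simulate(theta):
            # Forward simulation of a mock catalog for parameter theta.
            return rng.poisson(theta, size=100)

        # Rejection ABC: keep prior draws whose simulated summary lands
        # within a tolerance of the observed summary.
        n_draws, tol = 20000, 0.1
        theta_prior = rng.uniform(0.0, 10.0, n_draws)       # flat prior
        accepted = np.array([th for th in theta_prior
                             if abs(simulate(th).mean() - obs_summary) < tol])

        print(accepted.mean(), np.percentile(accepted, [2.5, 97.5]))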

  16. Quantitative trait nucleotide analysis using Bayesian model selection.

    PubMed

    Blangero, John; Goring, Harald H H; Kent, Jack W; Williams, Jeff T; Peterson, Charles P; Almasy, Laura; Dyer, Thomas D

    2005-10-01

    Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.

  17. Bayesian Mediation Analysis

    ERIC Educational Resources Information Center

    Yuan, Ying; MacKinnon, David P.

    2009-01-01

    In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
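
    A minimal sketch of the Bayesian view of the mediated effect a*b, using flat priors and normal approximations to the posteriors of the two path coefficients (simulated data; this is an illustration of the idea, not the authors' estimation procedure):

        import numpy as np

        rng = np.random.default_rng(6)

        # Simulated mediation data: x -> m (path a), m -> y given x (path b).
        n = 300
        x = rng.normal(size=n)
        m = 0.5 * x + rng.normal(size=n)             # true a = 0.5
        y = 0.4 * m + 0.2 * x + rng.normal(size=n)   # true b = 0.4

        def coef_and_se(pred, resp, idx):
            # OLS coefficient and standard error; with flat priors the
            # posterior of the coefficient is approximately N(beta, se^2).
            X = np.column_stack([np.ones(len(resp)), pred])
            XtX_inv = np.linalg.inv(X.T @ X)
            beta = XtX_inv @ X.T @ resp
            resid = resp - X @ beta
            s2 = resid @ resid / (len(resp) - X.shape[1])
            return beta[idx], np.sqrt(s2 * XtX_inv[idx, idx])

        a_hat, a_se = coef_and_se(x, m, 1)
        b_hat, b_se = coef_and_se(np.column_stack([m, x]), y, 1)

        # Posterior of the mediated effect via products of posterior draws.
        draws = rng.normal(a_hat, a_se, 20000) * rng.normal(b_hat, b_se, 20000)
        print(np.percentile(draws, [2.5, 50, 97.5]))  # credible interval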

  18. BayMeth: improved DNA methylation quantification for affinity capture sequencing data using a flexible Bayesian approach

    PubMed Central

    2014-01-01

    Affinity capture of DNA methylation combined with high-throughput sequencing strikes a good balance between the high cost of whole genome bisulfite sequencing and the low coverage of methylation arrays. We present BayMeth, an empirical Bayes approach that uses a fully methylated control sample to transform observed read counts into regional methylation levels. In our model, inefficient capture can readily be distinguished from low methylation levels. BayMeth improves on existing methods, allows explicit modeling of copy number variation, and offers computationally efficient analytical mean and variance estimators. BayMeth is available in the Repitools Bioconductor package. PMID:24517713

  19. Thirty Years After Marr's Vision: Levels of Analysis in Cognitive Science.

    PubMed

    Peebles, David; Cooper, Richard P

    2015-04-01

    Thirty years after the publication of Marr's seminal book Vision (Marr, 1982) the papers in this topic consider the contemporary status of his influential conception of three distinct levels of analysis for information-processing systems, and in particular the role of the algorithmic and representational level with its cognitive-level concepts. This level has (either implicitly or explicitly) been downplayed or eliminated both by reductionist neuroscience approaches from below that seek to account for behavior from the implementation level and by Bayesian approaches from above that seek to account for behavior in purely computational-level terms. Copyright © 2015 Cognitive Science Society, Inc.

  20. On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang

    2015-02-01

    The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of the linear model of coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with the existing multiple BTGP and BTMGP in test cases and a multiphase flow computer experiment in a full-scale regenerator of a carbon capture unit. The use of the BTMGP with the LMC cross-covariance helped to predict the computer experiments relatively better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with the LMC cross-covariance function.

  1. Assessment methodology for computer-based instructional simulations.

    PubMed

    Koenig, Alan; Iseli, Markus; Wainess, Richard; Lee, John J

    2013-10-01

    Computer-based instructional simulations are becoming more and more ubiquitous, particularly in military and medical domains. As the technology that drives these simulations grows ever more sophisticated, the underlying pedagogical models for how instruction, assessment, and feedback are implemented within these systems must evolve accordingly. In this article, we review some of the existing educational approaches to medical simulations, and present pedagogical methodologies that have been used in the design and development of games and simulations at the University of California, Los Angeles, Center for Research on Evaluation, Standards, and Student Testing. In particular, we present a methodology for how automated assessments of computer-based simulations can be implemented using ontologies and Bayesian networks, and discuss their advantages and design considerations for pedagogical use. Reprint & Copyright © 2013 Association of Military Surgeons of the U.S.

  2. Bayesian data analysis for newcomers.

    PubMed

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
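
    The kind of directly interpretable output the article describes can be shown with the simplest possible example, a beta-binomial model for a proportion (illustrative counts; the full posterior replaces a single p-value):

        from scipy import stats

        # Beta(1, 1) prior on a proportion, updated by 7 successes in 20 trials.
        successes, trials = 7, 20
        posterior = stats.beta(1 + successes, 1 + trials - successes)

        print(posterior.mean())           # posterior mean of the proportion
        print(posterior.interval(0.95))   # 95% credible interval
        print(posterior.cdf(0.5))         # P(proportion < 0.5 | data)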

  3. Predictability of extreme weather events for NE U.S.: improvement of the numerical prediction using a Bayesian regression approach

    NASA Astrophysics Data System (ADS)

    Yang, J.; Astitha, M.; Anagnostou, E. N.; Hartman, B.; Kallos, G. B.

    2015-12-01

    Weather prediction accuracy has become very important for the Northeast U.S. given the devastating effects of extreme weather events in the recent years. Weather forecasting systems are used towards building strategies to prevent catastrophic losses for human lives and the environment. Concurrently, weather forecast tools and techniques have evolved with improved forecast skill as numerical prediction techniques are strengthened by increased super-computing resources. In this study, we examine the combination of two state-of-the-science atmospheric models (WRF and RAMS/ICLAMS) by utilizing a Bayesian regression approach to improve the prediction of extreme weather events for NE U.S. The basic concept behind the Bayesian regression approach is to take advantage of the strengths of two atmospheric modeling systems and, similar to the multi-model ensemble approach, limit their weaknesses which are related to systematic and random errors in the numerical prediction of physical processes. The first part of this study is focused on retrospective simulations of seventeen storms that affected the region in the period 2004-2013. Optimal variances are estimated by minimizing the root mean square error and are applied to out-of-sample weather events. The applicability and usefulness of this approach are demonstrated by conducting an error analysis based on in-situ observations from meteorological stations of the National Weather Service (NWS) for wind speed and wind direction, and NCEP Stage IV radar data, mosaicked from the regional multi-sensor for precipitation. The preliminary results indicate a significant improvement in the statistical metrics of the modeled-observed pairs for meteorological variables using various combinations of the sixteen events as predictors of the seventeenth. This presentation will illustrate the implemented methodology and the obtained results for wind speed, wind direction and precipitation, as well as set the research steps that will be followed in the future.

  4. Bayesian analysis of caustic-crossing microlensing events

    NASA Astrophysics Data System (ADS)

    Cassan, A.; Horne, K.; Kains, N.; Tsapras, Y.; Browne, P.

    2010-06-01

    Aims: Caustic-crossing binary-lens microlensing events are important anomalous events because they are capable of detecting an extrasolar planet companion orbiting the lens star. Fast and robust modelling methods are thus of prime interest in helping to decide whether a planet is detected by an event. Cassan introduced a new set of parameters to model binary-lens events, which are closely related to properties of the light curve. In this work, we explain how Bayesian priors can be added to this framework, and investigate interesting options. Methods: We develop a mathematical formulation that allows us to compute analytically the priors on the new parameters, given some previous knowledge about other physical quantities. We explicitly compute the priors for a number of interesting cases, and show how this can be implemented in a fully Bayesian Markov chain Monte Carlo algorithm. Results: Using Bayesian priors can accelerate microlens-fitting codes by reducing the time spent considering physically implausible models, and helps us to discriminate between alternative models based on the physical plausibility of their parameters.

  5. Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters

    PubMed Central

    Gilet, Estelle; Diard, Julien; Bessière, Pierre

    2011-01-01

    In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Being a model of both perception and action processes, the purpose of this model is to study the interaction of these processes. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments. PMID:21674043

  6. Integrating System Dynamics and Bayesian Networks with Application to Counter-IED Scenarios

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarman, Kenneth D.; Brothers, Alan J.; Whitney, Paul D.

    2010-06-06

    The practice of choosing a single modeling paradigm for predictive analysis can limit the scope and relevance of predictions and their utility to decision-making processes. Considering multiple modeling methods simultaneously may improve this situation, but a better solution provides a framework for directly integrating different, potentially complementary modeling paradigms to enable more comprehensive modeling and predictions, and thus better-informed decisions. The primary challenges of this kind of model integration are to bridge language and conceptual gaps between modeling paradigms, and to determine whether natural and useful linkages can be made in a formal mathematical manner. To address these challenges in the context of two specific modeling paradigms, we explore mathematical and computational options for linking System Dynamics (SD) and Bayesian network (BN) models and incorporating data into the integrated models. We demonstrate that integrated SD/BN models can naturally be described as either state space equations or Dynamic Bayes Nets, which enables the use of many existing computational methods for simulation and data integration. To demonstrate, we apply our model integration approach to techno-social models of insurgent-led attacks and security force counter-measures centered on improvised explosive devices.

  7. Probabilistic Models and Generative Neural Networks: Towards an Unified Framework for Modeling Normal and Impaired Neurocognitive Functions

    PubMed Central

    Testolin, Alberto; Zorzi, Marco

    2016-01-01

    Connectionist models can be characterized within the more general framework of probabilistic graphical models, which allow one to efficiently describe complex statistical distributions involving a large number of interacting variables. This integration allows building more realistic computational models of cognitive functions, which more faithfully reflect the underlying neural mechanisms while at the same time providing a useful bridge to higher-level descriptions in terms of Bayesian computations. Here we discuss a powerful class of graphical models that can be implemented as stochastic, generative neural networks. These models overcome many limitations associated with classic connectionist models, for example by exploiting unsupervised learning in hierarchical architectures (deep networks) and by taking into account top-down, predictive processing supported by feedback loops. We review some recent cognitive models based on generative networks, and we point out promising research directions to investigate neuropsychological disorders within this approach. Though further efforts are required in order to fill the gap between structured Bayesian models and more realistic, biophysical models of neuronal dynamics, we argue that generative neural networks have the potential to bridge these levels of analysis, thereby improving our understanding of the neural bases of cognition and of pathologies caused by brain damage. PMID:27468262

  8. Bayesian Ising approximation for learning dictionaries of multispike timing patterns in premotor neurons

    NASA Astrophysics Data System (ADS)

    Hernandez Lahme, Damian; Sober, Samuel; Nemenman, Ilya

    Important questions in computational neuroscience are whether, how much, and how information is encoded in the precise timing of neural action potentials. We recently demonstrated that, in the premotor cortex during vocal control in songbirds, spike timing is far more informative about upcoming behavior than is spike rate (Tang et al., 2014). However, identification of complete dictionaries that relate spike timing patterns with the controlled behavior remains an elusive problem. Here we present a computational approach to deciphering such codes for individual neurons in the songbird premotor area RA, an analog of mammalian primary motor cortex. Specifically, we analyze which multispike patterns of neural activity predict features of the upcoming vocalization, and hence are important codewords. We use the recently introduced Bayesian Ising Approximation, which properly accounts for the fact that many codewords overlap and hence are not independent. Our results show which complex, temporally precise multispike combinations are used by individual neurons to control acoustic features of the produced song, and that these code words are different across individual neurons and across different acoustic features. This work was supported, in part, by JSMF Grant 220020321, NSF Grant 1208126, NIH Grant NS084844 and NIH Grant 1 R01 EB022872.

  9. Hierarchical Bayesian Markov switching models with application to predicting spawning success of shovelnose sturgeon

    USGS Publications Warehouse

    Holan, S.H.; Davis, G.M.; Wildhaber, M.L.; DeLonay, A.J.; Papoulias, D.M.

    2009-01-01

    The timing of spawning in fish is tightly linked to environmental factors; however, these factors are not very well understood for many species. Specifically, little information is available to guide recruitment efforts for endangered species such as the sturgeon. Therefore, we propose a Bayesian hierarchical model for predicting the success of spawning of the shovelnose sturgeon which uses both biological and behavioural (longitudinal) data. In particular, we use data that were produced from a tracking study that was conducted in the Lower Missouri River. The data that were produced from this study consist of biological variables associated with readiness to spawn along with longitudinal behavioural data collected by using telemetry and archival data storage tags. These high frequency data are complex both biologically and in the underlying behavioural process. To accommodate such complexity we developed a hierarchical linear regression model that uses an eigenvalue predictor, derived from the transition probability matrix of a two-state Markov switching model with generalized auto-regressive conditional heteroscedastic dynamics. Finally, to minimize the computational burden that is associated with estimation of this model, a parallel computing approach is proposed. © Journal compilation 2009 Royal Statistical Society.

  10. Probabilistic Models and Generative Neural Networks: Towards an Unified Framework for Modeling Normal and Impaired Neurocognitive Functions.

    PubMed

    Testolin, Alberto; Zorzi, Marco

    2016-01-01

    Connectionist models can be characterized within the more general framework of probabilistic graphical models, which allow one to efficiently describe complex statistical distributions involving a large number of interacting variables. This integration allows building more realistic computational models of cognitive functions, which more faithfully reflect the underlying neural mechanisms while at the same time providing a useful bridge to higher-level descriptions in terms of Bayesian computations. Here we discuss a powerful class of graphical models that can be implemented as stochastic, generative neural networks. These models overcome many limitations associated with classic connectionist models, for example by exploiting unsupervised learning in hierarchical architectures (deep networks) and by taking into account top-down, predictive processing supported by feedback loops. We review some recent cognitive models based on generative networks, and we point out promising research directions to investigate neuropsychological disorders within this approach. Though further efforts are required in order to fill the gap between structured Bayesian models and more realistic, biophysical models of neuronal dynamics, we argue that generative neural networks have the potential to bridge these levels of analysis, thereby improving our understanding of the neural bases of cognition and of pathologies caused by brain damage.

  11. Adaptively Reevaluated Bayesian Localization (ARBL). A Novel Technique for Radiological Source Localization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, Erin A.; Robinson, Sean M.; Anderson, Kevin K.

    2015-01-19

    Here we present a novel technique for the localization of radiological sources in urban or rural environments from an aerial platform. The technique is based on a Bayesian approach to localization, in which measured count rates in a time series are compared with predicted count rates from a series of pre-calculated test sources to define likelihood. Furthermore, this technique is expanded by using a localized treatment with a limited field of view (FOV), coupled with a likelihood ratio reevaluation, allowing for real-time computation on commodity hardware for arbitrarily complex detector models and terrain. In particular, detectors with inherent asymmetry of response (such as those employing internal collimation or self-shielding for enhanced directional awareness) are leveraged by this approach to provide improved localization. Our results from the localization technique are shown for simulated flight data using monolithic as well as directionally-aware detector models, and the capability of the methodology to locate radioisotopes is estimated for several test cases. This localization technique is shown to facilitate urban search by allowing quick and adaptive estimates of source location, in many cases from a single flyover near a source. In particular, this method represents a significant advancement over earlier methods such as full-field Bayesian likelihood, which is not generally fast enough to allow for broad-field search in real time, and highest-net-counts estimation, which has a localization error that depends strongly on flight path and cannot generally operate without exhaustive search.
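
    A stripped-down version of the underlying likelihood comparison, on synthetic data (inverse-square detector response, Poisson counts, flat prior over a position grid; the FOV-limited treatment and likelihood-ratio reevaluation of ARBL are omitted):

        import numpy as np

        rng = np.random.default_rng(7)

        # Measurement points along a straight flight path (x, y in meters).
        path = np.column_stack([np.linspace(0, 100, 40), np.full(40, 30.0)])
        true_src, strength, bkg = np.array([60.0, 10.0]), 5000.0, 20.0

        def expected_counts(src):
            # Inverse-square response plus a fixed altitude-squared term.
            d2 = ((path - src)**2).sum(axis=1) + 100.0
            return bkg + strength / d2

        counts = rng.poisson(expected_counts(true_src))

        # Poisson log-likelihood over a grid of candidate positions; with a
        # flat prior the posterior is proportional to the likelihood.
        xs, ys = np.meshgrid(np.linspace(0, 100, 101), np.linspace(0, 60, 61))
        loglik = np.array([
            (counts * np.log(expected_counts(s)) - expected_counts(s)).sum()
            for s in np.column_stack([xs.ravel(), ys.ravel()])
        ]).reshape(xs.shape)
        best = np.unravel_index(np.argmax(loglik), loglik.shape)
        print(xs[best], ys[best])          # MAP position estimate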

  12. Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology.

    PubMed

    Hardcastle, Thomas J

    2016-01-15

    High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a 'large P, small n' setting are required at an increasing rate. The development of such methods is, in general, done on an ad hoc basis, requiring further development cycles and leading to a lack of standardization between analyses. We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. tjh48@cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)

    EPA Science Inventory

    We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...

  14. Bayesian approach for counting experiment statistics applied to a neutrino point source analysis

    NASA Astrophysics Data System (ADS)

    Bose, D.; Brayeur, L.; Casier, M.; de Vries, K. D.; Golup, G.; van Eijndhoven, N.

    2013-12-01

    In this paper we present a model-independent analysis method following Bayesian statistics to analyse data from a generic counting experiment and apply it to the search for neutrinos from point sources. We discuss a test statistic defined following a Bayesian framework that will be used in the search for a signal. In case no signal is found, we derive an upper limit without the introduction of approximations. The Bayesian approach allows us to obtain the full probability density function for both the background and the signal rate. As such, we have direct access to any signal upper limit. The upper limit derivation directly compares with a frequentist approach and is robust in the case of low-counting observations. Furthermore, it also allows one to account for previous upper limits obtained by other analyses via the concept of prior information, without the need for the ad hoc application of trial factors. To investigate the validity of the presented Bayesian approach, we have applied this method to the public IceCube 40-string configuration data for 10 nearby blazars and we have obtained a flux upper limit, which is in agreement with the upper limits determined via a frequentist approach. Furthermore, the upper limit obtained compares well with the previously published result of IceCube, using the same data set.
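
    The low-count upper-limit computation can be sketched with a grid posterior for a Poisson signal over a known background (illustrative counts; a flat prior on the signal rate is assumed here, whereas the paper also incorporates prior information from earlier analyses):

        import numpy as np
        from scipy import stats

        # n observed events, known expected background b, signal rate s >= 0.
        n_obs, bkg = 3, 2.7

        s_grid = np.linspace(0, 25, 5001)
        post = stats.poisson.pmf(n_obs, s_grid + bkg)   # likelihood x flat prior
        ds = s_grid[1] - s_grid[0]
        post /= post.sum() * ds                         # normalize on the grid

        cdf = np.cumsum(post) * ds
        upper90 = s_grid[np.searchsorted(cdf, 0.90)]
        print(upper90)                                  # 90% credible upper limit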

  15. [Bayesian geostatistical prediction of soil organic carbon contents of solonchak soils in northern Tarim Basin, Xinjiang, China].

    PubMed

    Wu, Wei Mo; Wang, Jia Qiang; Cao, Qi; Wu, Jia Ping

    2017-02-01

    Accurate prediction of soil organic carbon (SOC) distribution is crucial for soil resources utilization and conservation, climate change adaptation, and ecosystem health. In this study, we selected a 1300 m×1700 m solonchak sampling area in northern Tarim Basin, Xinjiang, China, and collected a total of 144 soil samples (5-10 cm). The objectives of this study were to build a Bayesian geostatistical model to predict SOC content, and to assess the performance of the Bayesian model for the prediction of SOC content by comparing with other three geostatistical approaches [ordinary kriging (OK), sequential Gaussian simulation (SGS), and inverse distance weighting (IDW)]. In the study area, soil organic carbon contents ranged from 1.59 to 9.30 g·kg -1 with a mean of 4.36 g·kg -1 and a standard deviation of 1.62 g·kg -1 . Sample semivariogram was best fitted by an exponential model with the ratio of nugget to sill being 0.57. By using the Bayesian geostatistical approach, we generated the SOC content map, and obtained the prediction variance, upper 95% and lower 95% of SOC contents, which were then used to evaluate the prediction uncertainty. The Bayesian geostatistical approach performed better than the OK, SGS and IDW approaches, demonstrating the advantages of the Bayesian approach in SOC prediction.

  16. A compressive sensing-based computational method for the inversion of wide-band ground penetrating radar data

    NASA Astrophysics Data System (ADS)

    Gelmini, A.; Gottardi, G.; Moriyama, T.

    2017-10-01

    This work presents an innovative computational approach for the inversion of wideband ground penetrating radar (GPR) data. The retrieval of the dielectric characteristics of sparse scatterers buried in a lossy soil is performed by combining a multi-task Bayesian compressive sensing (MT-BCS) solver and a frequency hopping (FH) strategy. The developed methodology is able to benefit from the regularization capabilities of the MT-BCS as well as to exploit the multi-chromatic informative content of GPR measurements. A set of numerical results is reported in order to assess the effectiveness of the proposed GPR inverse scattering technique, as well as to compare it to a simpler single-task implementation.

  17. Mental health assessment: Inference, explanation, and coherence.

    PubMed

    Thagard, Paul; Larocque, Laurette

    2018-06-01

    Mental health professionals such as psychiatrists and psychotherapists assess their patients by identifying disorders that explain their symptoms. This assessment requires an inference to the best explanation that compares different disorders with respect to how well they explain the available evidence. Such comparisons are captured by the theory of explanatory coherence that states 7 principles for evaluating competing hypotheses in the light of evidence. The computational model ECHO shows how explanatory coherence can be efficiently computed. We show the applicability of explanatory coherence to mental health assessment by modelling a case of psychiatric interviewing and a case of psychotherapeutic evaluation. We argue that this approach is more plausible than Bayesian inference and hermeneutic interpretation. © 2018 John Wiley & Sons, Ltd.

  18. Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy

    NASA Astrophysics Data System (ADS)

    Orzechowski, P.; Makal, Jaroslaw; Onisko, A.

    2005-02-01

    A computer-aided BPH diagnosis system based on a Bayesian network is described in the paper. First results are compared to a given statistical method. Different statistical methods have been used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in newly created systems, which are common in medicine but often lack full and reliable domain knowledge. The article presents the advantages of the computer-aided BPH diagnosis system in clinical practice for urologists.

  19. Cortical Hierarchies Perform Bayesian Causal Inference in Multisensory Perception

    PubMed Central

    Rohe, Tim; Noppeney, Uta

    2015-01-01

    To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world. PMID:25710328
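
    A minimal implementation of Bayesian Causal Inference for one audiovisual trial, in the style commonly used in this literature (after Kording et al. 2007; all parameter values illustrative):

        import numpy as np

        # Noisy auditory and visual location samples and their noise levels.
        x_a, x_v = 8.0, 2.0
        s_a, s_v, s_p = 4.0, 1.0, 10.0   # cue and spatial-prior std devs
        p_common = 0.5                   # prior probability of a common cause
        va, vv, vp = s_a**2, s_v**2, s_p**2

        # Likelihood of the cue pair under a common cause (source integrated
        # out against a zero-mean spatial prior) vs independent causes.
        denom = va * vv + va * vp + vv * vp
        like_c1 = (np.exp(-0.5 * ((x_a - x_v)**2 * vp + x_a**2 * vv
                                  + x_v**2 * va) / denom)
                   / (2 * np.pi * np.sqrt(denom)))
        like_c2 = (np.exp(-0.5 * x_a**2 / (va + vp))
                   / np.sqrt(2 * np.pi * (va + vp))
                   * np.exp(-0.5 * x_v**2 / (vv + vp))
                   / np.sqrt(2 * np.pi * (vv + vp)))

        post_c1 = (like_c1 * p_common
                   / (like_c1 * p_common + like_c2 * (1 - p_common)))

        # Fusion (common cause) and segregation estimates, then model averaging.
        fused = (x_a / va + x_v / vv) / (1 / va + 1 / vv + 1 / vp)
        seg_a = (x_a / va) / (1 / va + 1 / vp)
        print(post_c1, post_c1 * fused + (1 - post_c1) * seg_a)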

  20. Rediscovery of Good-Turing estimators via Bayesian nonparametrics.

    PubMed

    Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye

    2016-03-01

    The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two-parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
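
    The frequentist side of the comparison is easy to state in code: the classic Good-Turing estimate of the probability that the next observation is a previously unseen species is the fraction of singletons in the sample (toy data below):

        from collections import Counter

        # Good-Turing estimate of the "missing mass": n1 / n, where n1 is the
        # number of species seen exactly once among n observations.
        sample = ["a", "b", "a", "c", "d", "d", "e", "a", "f", "g"]
        freqs = Counter(sample)
        n = len(sample)
        n1 = sum(1 for count in freqs.values() if count == 1)

        missing_mass = n1 / n    # discovery probability for the next draw
        print(missing_mass)      # here 5 singletons / 10 draws = 0.5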

  1. Review of Reliability-Based Design Optimization Approach and Its Integration with Bayesian Method

    NASA Astrophysics Data System (ADS)

    Zhang, Xiangnan

    2018-03-01

    Many uncertain factors arise in practical engineering, such as the external load environment, material properties, geometrical shape, initial conditions and boundary conditions. Reliability methods measure the structural safety condition and determine the optimal combination of design parameters based on probability theory. Reliability-based design optimization (RBDO), which combines reliability theory with optimization, is the most commonly used approach to minimizing structural cost or other performance measures under uncertain variables. However, it cannot handle various kinds of incomplete information. The Bayesian approach is utilized to incorporate such incomplete information into the uncertainty quantification. In this paper, the RBDO approach and its integration with the Bayesian method are introduced.

  2. A FAST BAYESIAN METHOD FOR UPDATING AND FORECASTING HOURLY OZONE LEVELS

    EPA Science Inventory

    A Bayesian hierarchical space-time model is proposed by combining information from real-time ambient AIRNow air monitoring data, and output from a computer simulation model known as the Community Multi-scale Air Quality (Eta-CMAQ) forecast model. A model validation analysis shows...

  3. Efficient Dependency Computation for Dynamic Hybrid Bayesian Network in On-line System Health Management Applications

    DTIC Science & Technology

    2014-10-02

    intervals (Neil, Tailor, Marquez, Fenton, & Hear, 2007). This is cumbersome, error prone and usually inaccurate. Even though a universal framework...Science. Neil, M., Tailor, M., Marquez, D., Fenton, N., & Hear. (2007). Inference in Bayesian networks using dynamic discretisation. Statistics

  4. Uncertainty quantification in capacitive RF MEMS switches

    NASA Astrophysics Data System (ADS)

    Pax, Benjamin J.

    Development of radio frequency micro electrical-mechanical systems (RF MEMS) has led to novel approaches to implement electrical circuitry. The introduction of capacitive MEMS switches, in particular, has shown promise in low-loss, low-power devices. However, the promise of MEMS switches has not yet been completely realized. RF-MEMS switches are known to fail after only a few months of operation, and nominally similar designs show wide variability in lifetime. Modeling switch operation using nominal or as-designed parameters cannot predict the statistical spread in the number of cycles to failure, and probabilistic methods are necessary. A Bayesian framework for calibration, validation and prediction offers an integrated approach to quantifying the uncertainty in predictions of MEMS switch performance. The objective of this thesis is to use the Bayesian framework to predict the creep-related deflection of the PRISM RF-MEMS switch over several thousand hours of operation. The PRISM switch used in this thesis is the focus of research at Purdue's PRISM center, and is a capacitive contacting RF-MEMS switch. It employs a fixed-fixed nickel membrane which is electrostatically actuated by applying voltage between the membrane and a pull-down electrode. Creep plays a central role in the reliability of this switch. The focus of this thesis is on the creep model, which is calibrated against experimental data measured for a frog-leg varactor fabricated and characterized at Purdue University. Creep plasticity is modeled using plate element theory with electrostatic forces being generated using either parallel plate approximations where appropriate, or solving for the full 3D potential field. For the latter, structure-electrostatics interaction is determined through immersed boundary method. A probabilistic framework using generalized polynomial chaos (gPC) is used to create surrogate models to mitigate the costly full physics simulations, and Bayesian calibration and forward propagation of uncertainty are performed using this surrogate model. The first step in the analysis is Bayesian calibration of the creep related parameters. A computational model of the frog-leg varactor is created, and the computed creep deflection of the device over 800 hours is used to generate a surrogate model using a polynomial chaos expansion in Hermite polynomials. Parameters related to the creep phenomenon are calibrated using Bayesian calibration with experimental deflection data from the frog-leg device. The calibrated input distributions are subsequently propagated through a surrogate gPC model for the PRISM MEMS switch to produce probability density functions of the maximum membrane deflection of the membrane over several thousand hours. The assumptions related to the Bayesian calibration and forward propagation are analyzed to determine the sensitivity to these assumptions of the calibrated input distributions and propagated output distributions of the PRISM device. The work is an early step in understanding the role of geometric variability, model uncertainty, numerical errors and experimental uncertainties in the long-term performance of RF-MEMS.
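
    The two-step surrogate workflow described above can be illustrated with a toy problem. A sketch, assuming a hypothetical one-parameter stand-in for the expensive creep simulation: a Hermite polynomial chaos surrogate is fit by least squares and then used in a grid-based Bayesian calibration (not the thesis code, and far simpler than the 3D electrostatics-coupled model):

    ```python
    import numpy as np
    from numpy.polynomial import hermite_e as He

    rng = np.random.default_rng(0)

    # Hypothetical expensive model: creep deflection after time t for a
    # standardized creep parameter xi (stand-in for the full simulation).
    def expensive_model(xi, t=800.0):
        return 0.1 * np.exp(0.3 * xi) * np.log1p(t)

    # Step 1: gPC surrogate in probabilists' Hermite polynomials.
    deg = 4
    xi_train = rng.standard_normal(64)
    V = He.hermevander(xi_train, deg)             # Hermite design matrix
    coef, *_ = np.linalg.lstsq(V, expensive_model(xi_train), rcond=None)
    surrogate = lambda xi: He.hermeval(xi, coef)  # cheap replacement

    # Step 2: Bayesian calibration on a grid against noisy measurements.
    xi_true = 0.8
    y_obs = expensive_model(xi_true) + rng.normal(0, 0.02, size=5)

    xi_grid = np.linspace(-4, 4, 2001)
    log_prior = -0.5 * xi_grid**2                 # standard normal prior
    resid = y_obs[:, None] - surrogate(xi_grid)[None, :]
    log_like = -0.5 * np.sum(resid**2, axis=0) / 0.02**2
    post = np.exp(log_prior + log_like - np.max(log_prior + log_like))
    post /= np.trapz(post, xi_grid)

    print(f"posterior mean xi = {np.trapz(xi_grid * post, xi_grid):.3f} "
          f"(true {xi_true})")
    ```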

  5. A Bayesian Supertree Model for Genome-Wide Species Tree Reconstruction

    PubMed Central

    De Oliveira Martins, Leonardo; Mallo, Diego; Posada, David

    2016-01-01

    Current phylogenomic data sets highlight the need for species tree methods able to deal with several sources of gene tree/species tree incongruence. At the same time, we need to make the most of all available data. Most species tree methods deal with single processes of phylogenetic discordance, namely, gene duplication and loss, incomplete lineage sorting (ILS) or horizontal gene transfer. In this manuscript, we address the problem of species tree inference from multilocus, genome-wide data sets regardless of the presence of gene duplication and loss and ILS, and therefore without the need to identify orthologs or to use a single individual per species. We do this by extending the idea of Maximum Likelihood (ML) supertrees to a hierarchical Bayesian model where several sources of gene tree/species tree disagreement can be accounted for in a modular manner. We implemented this model in a computer program called guenomu whose inputs are posterior distributions of unrooted gene tree topologies for multiple gene families, and whose output is the posterior distribution of rooted species tree topologies. We conducted extensive simulations to evaluate the performance of our approach in comparison with other species tree approaches able to deal with more than one leaf from the same species. Our method ranked best under simulated data sets, in spite of ignoring branch lengths, and performed well on empirical data, as well as being fast enough to analyze relatively large data sets. Our Bayesian supertree method was also very successful in obtaining better estimates of gene trees, by reducing the uncertainty in their distributions. In addition, our results show that under complex simulation scenarios, gene tree parsimony is also a competitive approach once we consider its speed, in contrast to more sophisticated models. PMID:25281847

  6. Extracting Prior Distributions from a Large Dataset of In-Situ Measurements to Support SWOT-based Estimation of River Discharge

    NASA Astrophysics Data System (ADS)

    Hagemann, M.; Gleason, C. J.

    2017-12-01

    The upcoming (2021) Surface Water and Ocean Topography (SWOT) NASA satellite mission aims, in part, to estimate discharge on major rivers worldwide using reach-scale measurements of stream width, slope, and height. Current formalizations of channel and floodplain hydraulics are insufficient to fully constrain this problem mathematically, resulting in an infinitely large solution set for any set of satellite observations. Recent work has reformulated this problem in a Bayesian statistical setting, in which the likelihood distributions derive directly from hydraulic flow-law equations. When coupled with prior distributions on unknown flow-law parameters, this formulation probabilistically constrains the parameter space, and results in a computationally tractable description of discharge. Using a curated dataset of over 200,000 in-situ acoustic Doppler current profiler (ADCP) discharge measurements from over 10,000 USGS gaging stations throughout the United States, we developed empirical prior distributions for flow-law parameters that are not observable by SWOT, but that are required in order to estimate discharge. This analysis quantified prior uncertainties on quantities including cross-sectional area, at-a-station hydraulic geometry width exponent, and discharge variability, that are dependent on SWOT-observable variables including reach-scale statistics of width and height. When compared against discharge estimation approaches that do not use this prior information, the Bayesian approach using ADCP-derived priors demonstrated consistently improved performance across a range of performance metrics. This Bayesian approach formally transfers information from in-situ gaging stations to remote-sensed estimation of discharge, in which the desired quantities are not directly observable. Further investigation using large in-situ datasets is therefore a promising way forward in improving satellite-based estimates of river discharge.
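
    The prior-extraction step can be sketched as follows: fit an empirical lognormal prior to a synthetic stand-in for the ADCP cross-sectional areas, then combine it with a noisy reach-scale observation via a conjugate normal update on log(A). All numbers are illustrative, not the curated USGS data:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical stand-in for the curated ADCP dataset: cross-sectional
    # areas (m^2) from many gaged rivers.
    adcp_areas = rng.lognormal(mean=5.0, sigma=1.2, size=10_000)

    # Empirical prior: fit a lognormal by matching moments of log(A).
    log_a = np.log(adcp_areas)
    mu_prior, sd_prior = log_a.mean(), log_a.std()

    # A SWOT-like observation constrains log(A) only noisily; combine it
    # with the prior via the conjugate normal-normal update on log(A).
    log_a_obs, sd_obs = np.log(220.0), 0.5   # illustrative reach estimate
    w = 1 / sd_obs**2 + 1 / sd_prior**2
    mu_post = (log_a_obs / sd_obs**2 + mu_prior / sd_prior**2) / w
    sd_post = w**-0.5

    print(f"prior:     A ~ lognormal(mu={mu_prior:.2f}, sd={sd_prior:.2f})")
    print(f"posterior: median A = {np.exp(mu_post):.1f} m^2, "
          f"log-sd = {sd_post:.2f}")
    ```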

  7. BATS: a Bayesian user-friendly software for analyzing time series microarray experiments.

    PubMed

    Angelini, Claudia; Cutillo, Luisa; De Canditiis, Daniela; Mutarelli, Margherita; Pensky, Marianna

    2008-10-06

    Gene expression levels in a given cell can be influenced by different factors, namely pharmacological or medical treatments. The response to a given stimulus is usually different for different genes and may depend on time. One of the goals of modern molecular biology is the high-throughput identification of genes associated with a particular treatment or a biological process of interest. From a methodological and computational point of view, analyzing high-dimensional time course microarray data requires a very specific set of tools which are usually not included in standard software packages. Recently, the authors of this paper developed a fully Bayesian approach which allows one to identify differentially expressed genes in a 'one-sample' time-course microarray experiment, to rank them and to estimate their expression profiles. The method is based on explicit expressions for calculations and, hence, is very computationally efficient. The software package BATS (Bayesian Analysis of Time Series) presented here implements the methodology described above. It allows a user to automatically identify and rank differentially expressed genes and to estimate their expression profiles when at least 5-6 time points are available. The package has a user-friendly interface. BATS successfully manages various technical difficulties which arise in time-course microarray experiments, such as a small number of observations, non-uniform sampling intervals and replicated or missing data. BATS is a free user-friendly software package for the analysis of both simulated and real microarray time course experiments. The software, the user manual and a brief illustrative example are freely available online at the BATS website: http://www.na.iac.cnr.it/bats.

  8. Improving the Accuracy of Planet Occurrence Rates from Kepler Using Approximate Bayesian Computation

    NASA Astrophysics Data System (ADS)

    Hsu, Danley C.; Ford, Eric B.; Ragozzine, Darin; Morehead, Robert C.

    2018-05-01

    We present a new framework to characterize the occurrence rates of planet candidates identified by Kepler based on hierarchical Bayesian modeling, approximate Bayesian computing (ABC), and sequential importance sampling. For this study, we adopt a simple 2D grid in planet radius and orbital period as our model and apply our algorithm to estimate occurrence rates for Q1–Q16 planet candidates orbiting solar-type stars. We arrive at significantly increased planet occurrence rates for small planet candidates (Rp < 1.25 R⊕) at larger orbital periods (P > 80 days) compared to the rates estimated by the more common inverse detection efficiency method (IDEM). Our improved methodology estimates that the occurrence rate density of small planet candidates in the habitable zone of solar-type stars is 1.6 (+1.2/−0.5) per factor of 2 in planet radius and orbital period. Additionally, we observe a local minimum in the occurrence rate for strong planet candidates marginalized over orbital period between 1.5 and 2 R⊕ that is consistent with previous studies. For future improvements, the forward modeling approach of ABC is ideally suited to incorporating multiple populations, such as planets, astrophysical false positives, and pipeline false alarms, to provide accurate planet occurrence rates and uncertainties. Furthermore, ABC provides a practical statistical framework for answering complex questions (e.g., frequency of different planetary architectures) and providing sound uncertainties, even in the face of complex selection effects, observational biases, and follow-up strategies. In summary, ABC offers a powerful tool for accurately characterizing a wide variety of astrophysical populations.
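
    The ABC logic for a single radius-period bin can be sketched compactly. The example below uses plain rejection ABC rather than the paper's sequential importance sampling, and the stellar sample size, detection efficiency, observed count, and tolerance are all illustrative assumptions:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    n_stars = 50_000   # hypothetical target sample
    det_eff = 0.12     # mean detection efficiency in this bin (assumed)
    n_obs = 90         # observed candidate count in the bin (assumed)

    # Forward model: each star hosts a planet in the bin with probability
    # `rate`; each planet is detected with probability `det_eff`
    # (geometric, transit, and pipeline effects folded together).
    prior_draws = rng.uniform(0.0, 0.1, size=200_000)     # uniform prior
    n_planets = rng.binomial(n_stars, prior_draws)        # planets per draw
    sims = rng.binomial(n_planets, det_eff)               # detected counts

    # ABC rejection: keep rates whose simulated counts land within a
    # tolerance of the observed count.
    accepted = prior_draws[np.abs(sims - n_obs) <= 5]

    lo, med, hi = np.percentile(accepted, [16, 50, 84])
    print(f"occurrence rate: {med:.4f} (+{hi - med:.4f}/-{med - lo:.4f}), "
          f"{len(accepted)} accepted draws")
    ```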

  9. Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures.

    PubMed

    Stamatakis, Alexandros; Ott, Michael

    2008-12-27

    The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAXML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.

  10. Accurate Biomass Estimation via Bayesian Adaptive Sampling

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay

    2005-01-01

    The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) characterization of MISR Rahman model parameters conditioned upon MODIS landcover; c) a rigorous non-parametric Bayesian approach to analytic mixture model determination; and d) a unique U.S. asset for science product validation and verification.

  11. Bayesian cloud detection for MERIS, AATSR, and their combination

    NASA Astrophysics Data System (ADS)

    Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.

    2014-11-01

    A broad range of different Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud masks were designed to be numerically efficient and suited for the processing of large amounts of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient amounts of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and setting up new classification schemes based on manually classified scenes.
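
    A naive Bayesian mask of the kind described here reduces to per-class feature histograms combined with Bayes' rule under a feature-independence assumption. A sketch on synthetic two-feature "truth" data, with Laplace smoothing standing in for the Gaussian histogram smoothing; all distributions and values are invented for illustration:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    # Synthetic "truth" data: two features per pixel (e.g., a reflectance
    # and a brightness-temperature difference; purely illustrative).
    n = 20_000
    is_cloud = rng.random(n) < 0.4
    feats = np.where(is_cloud[:, None],
                     rng.normal([0.6, -2.0], [0.15, 1.5], (n, 2)),
                     rng.normal([0.2,  1.0], [0.10, 1.0], (n, 2)))

    edges = [np.linspace(-0.5, 1.5, 41), np.linspace(-8, 6, 41)]

    def class_hists(x, smooth=1.0):
        """Per-feature histograms with Laplace smoothing (a crude stand-in
        for the Gaussian smoothing discussed in the paper)."""
        return [np.histogram(x[:, j], bins=edges[j])[0] + smooth
                for j in range(x.shape[1])]

    h_cloud = [h / h.sum() for h in class_hists(feats[is_cloud])]
    h_clear = [h / h.sum() for h in class_hists(feats[~is_cloud])]
    p_cloud = is_cloud.mean()

    def naive_bayes_cloud_prob(pixel):
        """Naive Bayes: multiply per-feature likelihoods (assuming
        feature independence) and apply Bayes' rule."""
        idx = [np.clip(np.searchsorted(edges[j], pixel[j]) - 1,
                       0, len(edges[j]) - 2) for j in range(2)]
        lc = p_cloud * h_cloud[0][idx[0]] * h_cloud[1][idx[1]]
        lk = (1 - p_cloud) * h_clear[0][idx[0]] * h_clear[1][idx[1]]
        return lc / (lc + lk)

    print(f"P(cloud | [0.55, -1.0]) = {naive_bayes_cloud_prob([0.55, -1.0]):.2f}")
    ```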

  12. Bayesian cloud detection for MERIS, AATSR, and their combination

    NASA Astrophysics Data System (ADS)

    Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.

    2015-04-01

    A broad range of different Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud detection schemes were designed to be numerically efficient and suited for the processing of large volumes of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient amounts of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and setting up new classification schemes based on manually classified scenes.

  13. Bayesian demography 250 years after Bayes

    PubMed Central

    Bijak, Jakub; Bryant, John

    2016-01-01

    Bayesian statistics offers an alternative to classical (frequentist) statistics. It is distinguished by its use of probability distributions to describe uncertain quantities, which leads to elegant solutions to many difficult statistical problems. Although Bayesian demography, like Bayesian statistics more generally, is around 250 years old, only recently has it begun to flourish. The aim of this paper is to review the achievements of Bayesian demography, address some misconceptions, and make the case for wider use of Bayesian methods in population studies. We focus on three applications: demographic forecasts, limited data, and highly structured or complex models. The key advantages of Bayesian methods are the ability to integrate information from multiple sources and to describe uncertainty coherently. Bayesian methods also allow additional (prior) information to be included alongside the data sample. As such, Bayesian approaches are complementary to many traditional methods, which can be productively re-expressed in Bayesian terms. PMID:26902889

  14. Machine learning approach for automated screening of malaria parasite using light microscopic images.

    PubMed

    Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan

    2013-02-01

    The aim of this paper is to address the development of computer-assisted malaria parasite characterization and classification using a machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker-controlled watershed transformation, and subsequently a total of ninety-six features describing the shape, size and texture of erythrocytes are extracted for parasitemia-infected versus non-infected cells. Ninety-four features are found to be statistically significant in discriminating the six classes. Here, a feature selection-cum-classification scheme has been devised by combining the F-statistic with statistical learning techniques, i.e., Bayesian learning and the support vector machine (SVM), in order to provide higher classification accuracy using the best set of discriminating features. Results show that the Bayesian approach provides the highest accuracy, i.e., 84%, for malaria classification by selecting the 19 most significant features, while SVM achieves 83.5% with the 9 most significant features. Finally, the performance of these two classifiers under the feature selection framework has been compared for malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
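
    The feature selection-cum-classification scheme can be approximated in a few lines with scikit-learn: rank features by the F-statistic, then train the two classifiers on the top-ranked subset. Synthetic data stand in for the ninety-six morphometric features; the top-19/top-9 feature counts follow the abstract, everything else is assumed:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic stand-in for the 96 shape/size/texture features over six
    # infection-stage classes (purely illustrative, not the study data).
    X, y = make_classification(n_samples=600, n_features=96,
                               n_informative=20, n_classes=6,
                               n_clusters_per_class=1, random_state=0)

    for name, clf, k in [("Bayesian (GaussianNB)", GaussianNB(), 19),
                         ("SVM (RBF kernel)", SVC(kernel="rbf", C=1.0), 9)]:
        pipe = make_pipeline(StandardScaler(),
                             SelectKBest(f_classif, k=k),  # F-statistic ranking
                             clf)
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(f"{name:22s} top-{k:2d} features: CV accuracy = {acc:.3f}")
    ```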

  15. Soil moisture mapping using Sentinel 1 images: the proposed approach and its preliminary validation carried out in view of an operational product

    NASA Astrophysics Data System (ADS)

    Paloscia, S.; Pettinato, S.; Santi, E.; Pierdicca, N.; Pulvirenti, L.; Notarnicola, C.; Pace, G.; Reppucci, A.

    2011-11-01

    The main objective of this research is to develop, test and validate a soil moisture content (SMC) retrieval algorithm for the GMES Sentinel-1 characteristics, within the framework of an ESA project. The SMC product, to be generated from Sentinel-1 data, requires an algorithm able to run operationally in near-real-time and deliver the product to the GMES services within 3 hours of observation. Two complementary approaches have been proposed: an Artificial Neural Network (ANN), which represents the best compromise between retrieval accuracy and processing time, thus allowing compliance with the timeliness requirements; and a Bayesian multi-temporal approach, which increases retrieval accuracy, especially where little ancillary data are available, at the cost of computational efficiency, by taking advantage of the frequent revisit time achieved by Sentinel-1. The algorithm was validated in several test areas in Italy, the US and Australia, and finally in Spain with a 'blind' validation. The multi-temporal Bayesian algorithm was validated in Central Italy. The validation results are in all cases very much in line with the requirements. However, the blind validation results were penalized by the availability of only VV-polarization SAR images and low-resolution MODIS NDVI, although the RMS error is only slightly above 4%.

  16. The drift diffusion model as the choice rule in reinforcement learning.

    PubMed

    Pedersen, Mads Lund; Frank, Michael J; Biele, Guido

    2017-08-01

    Current reinforcement-learning models often assume simplified decision processes that do not fully reflect the dynamic complexities of choice processes. Conversely, sequential-sampling models of decision making account for both choice accuracy and response time, but assume that decisions are based on static decision values. To combine these two computational models of decision making and learning, we implemented reinforcement-learning models in which the drift diffusion model describes the choice process, thereby capturing both within- and across-trial dynamics. To exemplify the utility of this approach, we quantitatively fit data from a common reinforcement-learning paradigm using hierarchical Bayesian parameter estimation, and compared model variants to determine whether they could capture the effects of stimulant medication in adult patients with attention-deficit hyperactivity disorder (ADHD). The model with the best relative fit provided a good description of the learning process, choices, and response times. A parameter recovery experiment showed that the hierarchical Bayesian modeling approach enabled accurate estimation of the model parameters. The model approach described here, using simultaneous estimation of reinforcement-learning and drift diffusion model parameters, shows promise for revealing new insights into the cognitive and neural mechanisms of learning and decision making, as well as the alteration of such processes in clinical groups.
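
    A hypothetical sketch of the combined generative process: a delta-rule learner supplies trial-wise Q-values, and the drift rate of a simulated diffusion process is the scaled Q-value difference. All parameter values are assumed, and this is forward simulation only, not the hierarchical Bayesian estimation used in the paper:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    def simulate_ddm(v, a=2.0, t0=0.3, dt=0.001):
        """Euler simulation of one drift diffusion trial.
        v: drift rate; a: boundary separation; t0: non-decision time.
        Returns (choice, reaction time). Unit diffusion coefficient."""
        x, t = a / 2.0, 0.0                    # unbiased start point
        while 0.0 < x < a:
            x += v * dt + rng.normal(0, np.sqrt(dt))
            t += dt
        return (1 if x >= a else 0), t0 + t

    # RL-DDM: Q-learning supplies the trial-wise drift rate as the scaled
    # value difference between the two options (parameters are assumed).
    alpha, scale = 0.2, 3.0                    # learning rate, drift scaling
    p_reward = np.array([0.3, 0.7])            # true reward probabilities
    q = np.zeros(2)

    for trial in range(200):
        choice, rt = simulate_ddm(v=scale * (q[1] - q[0]))
        reward = float(rng.random() < p_reward[choice])
        q[choice] += alpha * (reward - q[choice])   # delta-rule update
        if trial % 50 == 0:
            print(f"trial {trial:3d}  Q={q.round(2)}  "
                  f"choice={choice}  rt={rt:.2f}s")
    ```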

  17. High-Dimensional Bayesian Geostatistics

    PubMed Central

    Banerjee, Sudipto

    2017-01-01

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is ~n floating-point operations (flops) per iteration, where n is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings. PMID:29391920

  18. The drift diffusion model as the choice rule in reinforcement learning

    PubMed Central

    Frank, Michael J.

    2017-01-01

    Current reinforcement-learning models often assume simplified decision processes that do not fully reflect the dynamic complexities of choice processes. Conversely, sequential-sampling models of decision making account for both choice accuracy and response time, but assume that decisions are based on static decision values. To combine these two computational models of decision making and learning, we implemented reinforcement-learning models in which the drift diffusion model describes the choice process, thereby capturing both within- and across-trial dynamics. To exemplify the utility of this approach, we quantitatively fit data from a common reinforcement-learning paradigm using hierarchical Bayesian parameter estimation, and compared model variants to determine whether they could capture the effects of stimulant medication in adult patients with attention-deficit hyperactivity disorder (ADHD). The model with the best relative fit provided a good description of the learning process, choices, and response times. A parameter recovery experiment showed that the hierarchical Bayesian modeling approach enabled accurate estimation of the model parameters. The model approach described here, using simultaneous estimation of reinforcement-learning and drift diffusion model parameters, shows promise for revealing new insights into the cognitive and neural mechanisms of learning and decision making, as well as the alteration of such processes in clinical groups. PMID:27966103

  19. High-Dimensional Bayesian Geostatistics.

    PubMed

    Banerjee, Sudipto

    2017-06-01

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is ~n floating-point operations (flops) per iteration, where n is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.

  20. Poly-Omic Prediction of Complex Traits: OmicKriging

    PubMed Central

    Wheeler, Heather E.; Aquino-Michaels, Keston; Gamazon, Eric R.; Trubetskoy, Vassily V.; Dolan, M. Eileen; Huang, R. Stephanie; Cox, Nancy J.; Im, Hae Kyung

    2014-01-01

    High-confidence prediction of complex traits such as disease risk or drug response is an ultimate goal of personalized medicine. Although genome-wide association studies have discovered thousands of well-replicated polymorphisms associated with a broad spectrum of complex traits, the combined predictive power of these associations for any given trait is generally too low to be of clinical relevance. We propose a novel systems approach to complex trait prediction, which leverages and integrates similarity in genetic, transcriptomic, or other omics-level data. We translate the omic similarity into phenotypic similarity using a method called Kriging, commonly used in geostatistics and machine learning. Our method called OmicKriging emphasizes the use of a wide variety of systems-level data, such as those increasingly made available by comprehensive surveys of the genome, transcriptome, and epigenome, for complex trait prediction. Furthermore, our OmicKriging framework allows easy integration of prior information on the function of subsets of omics-level data from heterogeneous sources without the sometimes heavy computational burden of Bayesian approaches. Using seven disease datasets from the Wellcome Trust Case Control Consortium (WTCCC), we show that OmicKriging allows simple integration of sparse and highly polygenic components yielding comparable performance at a fraction of the computing time of a recently published Bayesian sparse linear mixed model method. Using a cellular growth phenotype, we show that integrating mRNA and microRNA expression data substantially increases performance over either dataset alone. Using clinical statin response, we show improved prediction over existing methods. PMID:24799323
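
    The kriging predictor itself is compact. A sketch using a genetic relationship matrix as the omic similarity kernel, with a ridge-style noise-to-signal parameter lambda chosen arbitrarily (the actual package estimates such quantities from data; everything below is synthetic):

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    # Synthetic stand-in for an omic similarity matrix: a genetic
    # relationship matrix built from standardized genotypes.
    n, p = 300, 1_000
    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    Z = (G - G.mean(0)) / (G.std(0) + 1e-12)
    K = Z @ Z.T / p                          # similarity (kernel) matrix

    # Phenotype with a polygenic component plus noise.
    beta = rng.normal(0, 0.05, p)
    y = Z @ beta + rng.normal(0, 0.5, n)

    train, test = np.arange(250), np.arange(250, n)

    # Kriging predictor: y_test = K_ts (K_tt + lambda*I)^-1 (y_train - mean),
    # where lambda plays the role of the noise-to-signal ratio (assumed).
    lam = 0.5
    w = np.linalg.solve(K[np.ix_(train, train)] + lam * np.eye(len(train)),
                        y[train] - y[train].mean())
    y_hat = y[train].mean() + K[np.ix_(test, train)] @ w

    r = np.corrcoef(y_hat, y[test])[0, 1]
    print(f"kriging prediction correlation on held-out samples: r = {r:.2f}")
    ```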

  1. A General and Flexible Approach to Estimating the Social Relations Model Using Bayesian Methods

    ERIC Educational Resources Information Center

    Ludtke, Oliver; Robitzsch, Alexander; Kenny, David A.; Trautwein, Ulrich

    2013-01-01

    The social relations model (SRM) is a conceptual, methodological, and analytical approach that is widely used to examine dyadic behaviors and interpersonal perception within groups. This article introduces a general and flexible approach to estimating the parameters of the SRM that is based on Bayesian methods using Markov chain Monte Carlo…

  2. Quantifying the uncertainty in heritability

    PubMed Central

    Furlotte, Nicholas A; Heckerman, David; Lippert, Christoph

    2014-01-01

    The use of mixed models to determine narrow-sense heritability and related quantities such as SNP heritability has received much recent attention. Less attention has been paid to the inherent variability in these estimates. One approach for quantifying variability in estimates of heritability is a frequentist approach, in which heritability is estimated using maximum likelihood and its variance is quantified through an asymptotic normal approximation. An alternative approach is to quantify the uncertainty in heritability through its Bayesian posterior distribution. In this paper, we develop the latter approach, make it computationally efficient and compare it to the frequentist approach. We show theoretically that, for a sufficiently large sample size and intermediate values of heritability, the two approaches provide similar results. Using the Atherosclerosis Risk in Communities cohort, we show empirically that the two approaches can give different results and that the variance/uncertainty can remain large. PMID:24670270
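
    The Bayesian alternative can be sketched with a grid posterior over h2. Using the eigendecomposition of the kinship matrix, the marginal likelihood of y ~ N(0, s2(h2*K + (1-h2)*I)) with s2 integrated out under a 1/s2 prior has a closed form; all data below are synthetic and the flat prior on h2 is an assumption:

    ```python
    import numpy as np

    rng = np.random.default_rng(11)

    # Synthetic data: kinship K from random genotypes, phenotype with
    # true heritability h2 = 0.5 (all values illustrative).
    n, p, h2_true = 500, 2_000, 0.5
    Z = rng.normal(size=(n, p))
    Z = (Z - Z.mean(0)) / Z.std(0)
    K = Z @ Z.T / p
    g = rng.multivariate_normal(np.zeros(n), h2_true * K)
    y = g + rng.normal(0, np.sqrt(1 - h2_true), n)

    # With eigendecomposition K = U diag(lam) U', the marginal likelihood
    #   p(y | h2)  ∝  prod(d_i)^(-1/2) * (sum u_i^2 / d_i)^(-n/2),
    # where d_i = h2*lam_i + (1 - h2) and u = U'y (s2 integrated out).
    lam, U = np.linalg.eigh(K)
    u2 = (U.T @ y) ** 2

    h2_grid = np.linspace(0.001, 0.999, 999)
    d = h2_grid[:, None] * lam[None, :] + (1 - h2_grid[:, None])
    log_post = (-0.5 * np.sum(np.log(d), axis=1)
                - 0.5 * n * np.log(np.sum(u2[None, :] / d, axis=1)))
    log_post -= log_post.max()               # flat prior on h2
    post = np.exp(log_post)
    post /= post.sum()

    mean = np.sum(h2_grid * post)
    sd = np.sqrt(np.sum((h2_grid - mean) ** 2 * post))
    print(f"posterior: h2 = {mean:.2f} +/- {sd:.2f} (true {h2_true})")
    ```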

  3. Probabilistic Inference in General Graphical Models through Sampling in Stochastic Networks of Spiking Neurons

    PubMed Central

    Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang

    2011-01-01

    An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717

  4. The Bayesian reader: explaining word recognition as an optimal Bayesian decision process.

    PubMed

    Norris, Dennis

    2006-04-01

    This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers. ((c) 2006 APA, all rights reserved).

  5. Prediction of new bioactive molecules using a Bayesian belief network.

    PubMed

    Abdo, Ammar; Leclère, Valérie; Jacques, Philippe; Salim, Naomie; Pupin, Maude

    2014-01-27

    Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using a Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on another side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity class is assigned to the unknown compound. We applied this new approach to eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for less diverse ones) with short calculation times. Experiments also showed that BBNC is particularly effective for homogeneous data sets but has been found to perform less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach alone. Thus, BBNC is a useful addition to the computational chemist's toolbox.
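
    The classification step reduces to scoring an unknown compound's fragment set against each activity class. The sketch below is a simplified surrogate for the belief-network propagation, using mean Tanimoto similarity over toy fragment sets; all compounds, fragments, and classes are hypothetical:

    ```python
    # Toy fragment sets standing in for real structural fingerprints.
    activity_classes = {
        "kinase_inhibitor": [{"F1", "F2", "F3"}, {"F1", "F3", "F4"}],
        "antibacterial":    [{"F5", "F6"}, {"F5", "F6", "F7"}],
    }
    inactive = [{"F8", "F9"}, {"F2", "F9"}]

    def class_score(compound, members):
        """Mean Tanimoto similarity between an unknown compound's
        fragment set and the known members of an activity class."""
        def tanimoto(a, b):
            return len(a & b) / len(a | b)
        return sum(tanimoto(compound, m) for m in members) / len(members)

    unknown = {"F1", "F3", "F7"}
    scores = {cls: class_score(unknown, members)
              for cls, members in activity_classes.items()}
    scores["inactive"] = class_score(unknown, inactive)

    # Assign the most similar class to the unknown compound.
    best = max(scores, key=scores.get)
    print(scores, "->", best)
    ```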

  6. Cuckoo Search with Lévy Flights for Weighted Bayesian Energy Functional Optimization in Global-Support Curve Data Fitting

    PubMed Central

    Gálvez, Akemi; Iglesias, Andrés; Cabellos, Luis

    2014-01-01

    The problem of data fitting is very important in many theoretical and applied fields. In this paper, we consider the problem of optimizing a weighted Bayesian energy functional for data fitting by using global-support approximating curves. By global-support curves we mean curves expressed as a linear combination of basis functions whose support is the whole domain of the problem, as opposed to other common approaches in CAD/CAM and computer graphics driven by piecewise functions (such as B-splines and NURBS) that provide local control of the shape of the curve. Our method applies a powerful nature-inspired metaheuristic algorithm called cuckoo search, introduced recently to solve optimization problems. A major advantage of this method is its simplicity: cuckoo search requires only two parameters, many fewer than other metaheuristic approaches, so the parameter tuning becomes a very simple task. The paper shows that this new approach can be successfully used to solve our optimization problem. To check the performance of our approach, it has been applied to five illustrative examples of different types, including open and closed 2D and 3D curves that exhibit challenging features, such as cusps and self-intersections. Our results show that the method performs pretty well, being able to solve our minimization problem in an astonishingly straightforward way. PMID:24977175

  7. Cuckoo search with Lévy flights for weighted Bayesian energy functional optimization in global-support curve data fitting.

    PubMed

    Gálvez, Akemi; Iglesias, Andrés; Cabellos, Luis

    2014-01-01

    The problem of data fitting is very important in many theoretical and applied fields. In this paper, we consider the problem of optimizing a weighted Bayesian energy functional for data fitting by using global-support approximating curves. By global-support curves we mean curves expressed as a linear combination of basis functions whose support is the whole domain of the problem, as opposed to other common approaches in CAD/CAM and computer graphics driven by piecewise functions (such as B-splines and NURBS) that provide local control of the shape of the curve. Our method applies a powerful nature-inspired metaheuristic algorithm called cuckoo search, introduced recently to solve optimization problems. A major advantage of this method is its simplicity: cuckoo search requires only two parameters, many fewer than other metaheuristic approaches, so the parameter tuning becomes a very simple task. The paper shows that this new approach can be successfully used to solve our optimization problem. To check the performance of our approach, it has been applied to five illustrative examples of different types, including open and closed 2D and 3D curves that exhibit challenging features, such as cusps and self-intersections. Our results show that the method performs pretty well, being able to solve our minimization problem in an astonishingly straightforward way.
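
    A generic cuckoo search with Mantegna-style Lévy flights is easy to sketch. The toy objective below (the Rosenbrock function) stands in for the weighted Bayesian energy functional of the paper, and all algorithm settings are illustrative assumptions:

    ```python
    import numpy as np
    from math import gamma, pi, sin

    rng = np.random.default_rng(9)

    def levy_step(dim, beta=1.5):
        """Lévy flight step via Mantegna's algorithm (stability index beta)."""
        sigma = (gamma(1 + beta) * sin(pi * beta / 2)
                 / (gamma((1 + beta) / 2) * beta
                    * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        u = rng.normal(0, sigma, dim)
        v = rng.normal(0, 1, dim)
        return u / np.abs(v) ** (1 / beta)

    def cuckoo_search(f, dim=2, n_nests=15, pa=0.25, alpha=0.01, iters=2000):
        """Minimal cuckoo search: Lévy-flight proposals plus abandonment
        of a fraction pa of nests each generation."""
        nests = rng.uniform(-5, 5, (n_nests, dim))
        fit = np.apply_along_axis(f, 1, nests)
        for _ in range(iters):
            best = nests[fit.argmin()]
            # Lévy flight around the current best for one random cuckoo
            i = rng.integers(n_nests)
            trial = nests[i] + alpha * levy_step(dim) * (nests[i] - best)
            if f(trial) < fit[i]:
                nests[i], fit[i] = trial, f(trial)
            # abandon some nests, replacing them with random recombinations
            for j in range(n_nests):
                if rng.random() < pa:
                    a, b = rng.integers(n_nests, size=2)
                    cand = nests[j] + rng.random() * (nests[a] - nests[b])
                    if f(cand) < fit[j]:
                        nests[j], fit[j] = cand, f(cand)
        return nests[fit.argmin()], fit.min()

    rosenbrock = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
    x_best, f_best = cuckoo_search(rosenbrock)
    print(f"best x = {x_best.round(3)}, f = {f_best:.2e}")
    ```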

  8. A species-level phylogeny of all extant and late Quaternary extinct mammals using a novel heuristic-hierarchical Bayesian approach.

    PubMed

    Faurby, Søren; Svenning, Jens-Christian

    2015-03-01

    Across large clades, two problems are generally encountered in the estimation of species-level phylogenies: (a) the number of taxa involved is generally so high that computation-intensive approaches cannot readily be utilized and (b) even for clades that have received intense study (e.g., mammals), attention has been centered on relatively few selected species, and most taxa must therefore be positioned on the basis of very limited genetic data. Here, we describe a new heuristic-hierarchical Bayesian approach and use it to construct a species-level phylogeny for all extant and late Quaternary extinct mammals. In this approach, species with large quantities of genetic data are placed nearly freely in the mammalian phylogeny according to these data, whereas the placement of species with lower quantities of data is performed with steadily stricter restrictions for decreasing data quantities. The advantages of the proposed method include (a) an improved ability to incorporate phylogenetic uncertainty in downstream analyses based on the resulting phylogeny, (b) a reduced potential for long-branch attraction or other types of errors that place low-data taxa far from their true position, while maintaining minimal restrictions for better-studied taxa, and (c) likely improved placement of low-data taxa due to the use of closer outgroups. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. Using Bayesian analysis in repeated preclinical in vivo studies for a more effective use of animals.

    PubMed

    Walley, Rosalind; Sherington, John; Rastrick, Joe; Detrait, Eric; Hanon, Etienne; Watt, Gillian

    2016-05-01

    Whilst innovative Bayesian approaches are increasingly used in clinical studies, in the preclinical area Bayesian methods appear to be rarely used in the reporting of pharmacology data. This is particularly surprising in the context of regularly repeated in vivo studies where there is a considerable amount of data from historical control groups, which has potential value. This paper describes our experience with introducing Bayesian analysis for such studies using a Bayesian meta-analytic predictive approach. This leads naturally either to an informative prior for a control group as part of a full Bayesian analysis of the next study or using a predictive distribution to replace a control group entirely. We use quality control charts to illustrate study-to-study variation to the scientists and describe informative priors in terms of their approximate effective numbers of animals. We describe two case studies of animal models: the lipopolysaccharide-induced cytokine release model used in inflammation and the novel object recognition model used to screen cognitive enhancers, both of which show the advantage of a Bayesian approach over the standard frequentist analysis. We conclude that using Bayesian methods in stable repeated in vivo studies can result in a more effective use of animals, either by reducing the total number of animals used or by increasing the precision of key treatment differences. This will lead to clearer results and supports the "3Rs initiative" to Refine, Reduce and Replace animals in research. Copyright © 2016 John Wiley & Sons, Ltd.
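
    A normal-approximation sketch of the meta-analytic predictive idea: pool hypothetical historical control groups with a DerSimonian-Laird between-study variance, form the predictive distribution for the next study's control mean, and express the prior's worth as an approximate effective number of animals. The paper's approach uses a full Bayesian hierarchy; this is only a moment-based analogue with invented data:

    ```python
    import numpy as np

    # Hypothetical historical control groups: mean response, SD, group size
    hist = np.array([
        # mean,  sd,   n
        [10.2, 2.1,  8],
        [ 9.6, 1.8,  8],
        [11.0, 2.4, 10],
        [10.5, 2.0,  8],
        [ 9.9, 2.2,  9],
    ])
    means, sds, ns = hist.T
    se2 = sds**2 / ns               # within-study variances of the means

    # DerSimonian-Laird estimate of the between-study variance tau^2
    w = 1 / se2
    mu_fixed = np.sum(w * means) / np.sum(w)
    q = np.sum(w * (means - mu_fixed) ** 2)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(means) - 1)) / c)

    # Random-effects pooled mean and the meta-analytic predictive
    # distribution for the control mean of the *next* study
    w_re = 1 / (se2 + tau2)
    mu = np.sum(w_re * means) / np.sum(w_re)
    var_mu = 1 / np.sum(w_re)
    pred_var = var_mu + tau2        # predictive variance for a new study

    # Express the prior's worth as an approximate effective number of
    # animals (heuristic: within-animal variance / predictive variance)
    sigma2_within = np.mean(sds**2)
    n_eff = sigma2_within / pred_var

    print(f"MAP prior: N({mu:.2f}, sd={np.sqrt(pred_var):.2f}), "
          f"tau = {np.sqrt(tau2):.2f}, effective n = {n_eff:.1f} animals")
    ```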

  10. Bayesian estimates of the incidence of rare cancers in Europe.

    PubMed

    Botta, Laura; Capocaccia, Riccardo; Trama, Annalisa; Herrmann, Christian; Salmerón, Diego; De Angelis, Roberta; Mallone, Sandra; Bidoli, Ettore; Marcos-Gragera, Rafael; Dudek-Godeau, Dorota; Gatta, Gemma; Cleries, Ramon

    2018-04-21

    The RARECAREnet project has updated the estimates of the burden of the 198 rare cancers in each European country. Suspecting that scant data could affect the reliability of statistical analysis, we employed a Bayesian approach to estimate the incidence of these cancers. We analyzed about 2,000,000 rare cancers diagnosed in 2000-2007 provided by 83 population-based cancer registries from 27 European countries. We considered European incidence rates (IRs), calculated over all the data available in RARECAREnet, as a valid prior to merge with country-specific observed data. Therefore we provided (1) Bayesian estimates of IRs and the yearly numbers of cases of rare cancers in each country; (2) the expected time (T) in years needed to observe one new case; and (3) practical criteria to decide when to use the Bayesian approach. Bayesian and classical estimates did not differ much; substantial differences (>10%) ranged from 77 rare cancers in Iceland to 14 in England. The smaller the population, the larger the number of rare cancers needing a Bayesian approach. Bayesian estimates were useful for cancers with fewer than 150 observed cases in a country during the study period; this occurred mostly when the population of the country is small. For the first time, Bayesian estimates of IRs and the yearly expected numbers of cases for each rare cancer in each individual European country were calculated. Moreover, the indicator T is useful to convey incidence estimates for exceptionally rare cancers and in small countries, where T can far exceed the professional lifespan of a medical doctor. Copyright © 2018 Elsevier Ltd. All rights reserved.
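
    The prior-merging step can be sketched with a conjugate gamma-Poisson model: the European rate sets the prior mean, country counts update it, and T follows from the posterior rate. The prior strength and all counts below are invented for illustration, not RARECAREnet values:

    ```python
    # Gamma-Poisson sketch of a Bayesian incidence estimate: the European
    # rate serves as the prior mean; country counts update it.
    # All numbers below are illustrative, not RARECAREnet values.

    eu_rate = 0.4 / 100_000   # European IR: 0.4 cases per 100,000 person-years
    prior_strength = 5e6      # prior "person-years" controlling prior weight (assumed)

    a0 = eu_rate * prior_strength   # gamma prior: shape a0, rate b0
    b0 = prior_strength

    x = 3                     # cases observed in the country, 2000-2007 (assumed)
    pop = 320_000             # national population (assumed)
    py = 8 * pop              # person-years over the 8-year study window

    a1, b1 = a0 + x, b0 + py  # conjugate posterior, still gamma
    rate_post = a1 / b1

    t_years = 1 / (rate_post * pop)   # expected time between new cases

    print(f"classical IR: {x / py * 1e5:.3f} per 100,000 py")
    print(f"Bayesian  IR: {rate_post * 1e5:.3f} per 100,000 py")
    print(f"expected time to observe one new case: T = {t_years:.1f} years")
    ```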

  11. A novel Bayesian approach to quantify clinical variables and to determine their spectroscopic counterparts in 1H NMR metabonomic data

    PubMed Central

    Vehtari, Aki; Mäkinen, Ville-Petteri; Soininen, Pasi; Ingman, Petri; Mäkelä, Sanna M; Savolainen, Markku J; Hannuksela, Minna L; Kaski, Kimmo; Ala-Korpela, Mika

    2007-01-01

    Background A key challenge in metabonomics is to uncover quantitative associations between multidimensional spectroscopic data and biochemical measures used for disease risk assessment and diagnostics. Here we focus on clinically relevant estimation of lipoprotein lipids by 1H NMR spectroscopy of serum. Results A Bayesian methodology, with a biochemical motivation, is presented for a real 1H NMR metabonomics data set of 75 serum samples. Lipoprotein lipid concentrations were independently obtained for these samples via ultracentrifugation and specific biochemical assays. The Bayesian models were constructed by Markov chain Monte Carlo (MCMC) and they showed remarkably good quantitative performance, the predictive R-values being 0.985 for the very low density lipoprotein triglycerides (VLDL-TG), 0.787 for the intermediate, 0.943 for the low, and 0.933 for the high density lipoprotein cholesterol (IDL-C, LDL-C and HDL-C, respectively). The modelling produced a kernel-based reformulation of the data, the parameters of which coincided with the well-known biochemical characteristics of the 1H NMR spectra; particularly for VLDL-TG and HDL-C the Bayesian methodology was able to clearly identify the most characteristic resonances within the heavily overlapping information in the spectra. For IDL-C and LDL-C the resulting model kernels were more complex than those for VLDL-TG and HDL-C, probably reflecting the severe overlap of the IDL and LDL resonances in the 1H NMR spectra. Conclusion The systematic use of Bayesian MCMC analysis is computationally demanding. Nevertheless, the combination of high-quality quantification and the biochemical rationale of the resulting models is expected to be useful in the field of metabonomics. PMID:17493257

  12. Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies.

    PubMed

    Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence

    2010-11-09

    Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.

  13. Hierarchical Bayesian Models of Subtask Learning

    ERIC Educational Resources Information Center

    Anglim, Jeromy; Wynton, Sarah K. A.

    2015-01-01

    The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…

  14. A Bayesian approach to estimating variance components within a multivariate generalizability theory framework.

    PubMed

    Jiang, Zhehan; Skorupski, William

    2017-12-12

    In many behavioral research areas, multivariate generalizability theory (mG theory) has been typically used to investigate the reliability of certain multidimensional assessments. However, traditional mG-theory estimation, namely using frequentist approaches, has limits, leading researchers to fail to take full advantage of the information that mG theory can offer regarding the reliability of measurements. Alternatively, Bayesian methods provide more information than frequentist approaches can offer. This article presents instructional guidelines on how to implement mG-theory analyses in a Bayesian framework; in particular, BUGS code is presented to fit commonly seen designs from mG theory, including single-facet designs, two-facet crossed designs, and two-facet nested designs. In addition to concrete examples that are closely related to the selected designs and the corresponding BUGS code, a simulated dataset is provided to demonstrate the utility and advantages of the Bayesian approach. This article is intended to serve as a tutorial reference for applied researchers and methodologists conducting mG-theory studies.

  15. Bayesian-information-gap decision theory with an application to CO2 sequestration

    DOE PAGES

    O'Malley, D.; Vesselinov, V. V.

    2015-09-04

    Decisions related to subsurface engineering problems such as groundwater management, fossil fuel production, and geologic carbon sequestration are frequently challenging because of an overabundance of uncertainties (related to conceptualizations, parameters, observations, etc.). Because of the importance of these problems to agriculture, energy, and the climate (respectively), good decisions that are scientifically defensible must be made despite the uncertainties. We describe a general approach to making decisions for challenging problems such as these in the presence of severe uncertainties that combines probabilistic and non-probabilistic methods. The approach uses Bayesian sampling to assess parametric uncertainty and Information-Gap Decision Theory (IGDT) to address model inadequacy. The combined approach also resolves an issue that frequently arises when applying Bayesian methods to real-world engineering problems related to the enumeration of possible outcomes. In the case of zero non-probabilistic uncertainty, the method reduces to a Bayesian method. Lastly, to illustrate the approach, we apply it to a site-selection decision for geologic CO2 sequestration.

  16. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.

  17. Feasibility of streamlining an interactive Bayesian-based diagnostic support tool designed for clinical practice

    NASA Astrophysics Data System (ADS)

    Chen, Po-Hao; Botzolakis, Emmanuel; Mohan, Suyash; Bryan, R. N.; Cook, Tessa

    2016-03-01

    In radiology, diagnostic errors occur either through the failure of detection or incorrect interpretation. Errors are estimated to occur in 30-35% of all exams and contribute to 40-54% of medical malpractice litigations. In this work, we focus on reducing incorrect interpretation of known imaging features. Existing literature categorizes cognitive bias leading a radiologist to an incorrect diagnosis despite having correctly recognized the abnormal imaging features: anchoring bias, framing effect, availability bias, and premature closure. Computational methods make a unique contribution, as they do not exhibit the same cognitive biases as a human. Bayesian networks formalize the diagnostic process. They modify pre-test diagnostic probabilities using clinical and imaging features, arriving at a post-test probability for each possible diagnosis. To translate Bayesian networks to clinical practice, we implemented an entirely web-based open-source software tool. In this tool, the radiologist first selects a network of choice (e.g. basal ganglia). Then, large, clearly labeled buttons displaying salient imaging features are displayed on the screen serving both as a checklist and for input. As the radiologist inputs the value of an extracted imaging feature, the conditional probabilities of each possible diagnosis are updated. The software presents its level of diagnostic discrimination using a Pareto distribution chart, updated with each additional imaging feature. Active collaboration with the clinical radiologist is a feasible approach to software design and leads to design decisions closely coupling the complex mathematics of conditional probability in Bayesian networks with practice.
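
    The probability update behind such a tool can be sketched with a naive Bayes approximation (a full Bayesian network also models dependencies between features; all diagnoses, features, and numbers here are hypothetical):

    ```python
    # Pre-test probabilities over candidate diagnoses (hypothetical)
    pretest = {"low_grade_glioma": 0.5, "lymphoma": 0.3, "abscess": 0.2}

    # P(imaging feature present | diagnosis), also hypothetical
    p_feat = {
        "restricted_diffusion": {"low_grade_glioma": 0.10,
                                 "lymphoma": 0.80, "abscess": 0.90},
        "ring_enhancement":     {"low_grade_glioma": 0.05,
                                 "lymphoma": 0.30, "abscess": 0.85},
    }

    def update(post, feature, present=True):
        """One checklist click: multiply in the feature likelihood for
        each diagnosis and renormalize (Bayes' rule)."""
        new = {dx: p * (p_feat[feature][dx] if present
                        else 1 - p_feat[feature][dx])
               for dx, p in post.items()}
        z = sum(new.values())
        return {dx: v / z for dx, v in new.items()}

    post = pretest
    for feature in ["restricted_diffusion", "ring_enhancement"]:
        post = update(post, feature, present=True)
        print(feature, {dx: round(p, 3) for dx, p in post.items()})
    ```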

  18. Models and simulation of 3D neuronal dendritic trees using Bayesian networks.

    PubMed

    López-Cruz, Pedro L; Bielza, Concha; Larrañaga, Pedro; Benavides-Piccione, Ruth; DeFelipe, Javier

    2011-12-01

    Neuron morphology is crucial for neuronal connectivity and brain information processing. Computational models are important tools for studying dendritic morphology and its role in brain function. We applied a class of probabilistic graphical models called Bayesian networks to generate virtual dendrites from layer III pyramidal neurons from three different regions of the neocortex of the mouse. A set of 41 morphological variables were measured from the 3D reconstructions of real dendrites and their probability distributions used in a machine learning algorithm to induce the model from the data. A simulation algorithm is also proposed to obtain new dendrites by sampling values from Bayesian networks. The main advantage of this approach is that it takes into account and automatically locates the relationships between variables in the data instead of using predefined dependencies. Therefore, the methodology can be applied to any neuronal class while at the same time exploiting class-specific properties. Also, a Bayesian network was defined for each part of the dendrite, allowing the relationships to change in the different sections and to model heterogeneous developmental factors or spatial influences. Several univariate statistical tests and a novel multivariate test based on Kullback-Leibler divergence estimation confirmed that virtual dendrites were similar to real ones. The analyses of the models showed relationships that conform to current neuroanatomical knowledge and support model correctness. At the same time, studying the relationships in the models can help to identify new interactions between variables related to dendritic morphology.

  19. An economic growth model based on financial credits distribution to the government economy priority sectors of each regency in Indonesia using hierarchical Bayesian method

    NASA Astrophysics Data System (ADS)

    Yasmirullah, Septia Devi Prihastuti; Iriawan, Nur; Sipayung, Feronika Rosalinda

    2017-11-01

    The success of regional economic development can be measured by economic growth. Since Act No. 32 of 2004 was implemented, economic imbalance among regencies in Indonesia has been increasing. This condition runs contrary to the government's goal of building public welfare through economic development in each region. This research aims to examine economic growth through the distribution of bank credits to each of Indonesia's regencies. The data analyzed are hierarchically structured and follow a normal distribution at the first level. Two modeling approaches are employed: a global one-level Bayesian approach and a two-level hierarchical Bayesian approach. The results show that the hierarchical Bayesian model yields better estimates than the global one-level model. This demonstrates that the differing economic growth in each province is significantly influenced by the variation of micro-level characteristics within each province, and that these variations are in turn significantly affected by city and province characteristics at the second level.
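
    A minimal two-level hierarchical sketch of this kind of model, written with the PyMC library, is shown below; the data are simulated and the variable names (credit volume predicting growth, regencies nested in provinces) are stand-ins for the paper's actual covariates.

        import numpy as np
        import pymc as pm

        rng = np.random.default_rng(1)
        n_prov, n_per = 5, 20                        # 5 provinces, 20 regencies each
        prov = np.repeat(np.arange(n_prov), n_per)   # province index per regency
        credit = rng.normal(0, 1, n_prov * n_per)    # standardized credit volume
        growth = (2.0 + rng.normal(0, 0.5, n_prov)[prov]
                  + 0.8 * credit + rng.normal(0, 0.3, n_prov * n_per))

        with pm.Model() as hier:
            mu_a = pm.Normal("mu_a", 0, 10)          # grand-mean intercept
            sigma_a = pm.HalfNormal("sigma_a", 1)    # between-province spread
            a = pm.Normal("a", mu_a, sigma_a, shape=n_prov)  # province intercepts
            b = pm.Normal("b", 0, 10)                # common credit effect
            sigma = pm.HalfNormal("sigma", 1)        # regency-level noise
            pm.Normal("y", mu=a[prov] + b * credit, sigma=sigma, observed=growth)
            idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)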

  20. Bayesian ensemble refinement by replica simulations and reweighting.

    PubMed

    Hummer, Gerhard; Köfinger, Jürgen

    2015-12-28

    We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
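
    The reweighting side of these methods can be sketched compactly: given a reference ensemble and ensemble-averaged observables, one minimizes the sum of a data misfit and a relative-entropy penalty toward the reference weights. The Python sketch below implements this generic maximum-entropy/Bayesian reweighting objective on toy data; the confidence parameter theta and all observables are invented, and the replica-simulation machinery is not represented.

        import numpy as np
        from scipy.optimize import minimize

        # Toy setup: K ensemble members, J ensemble-averaged observables.
        rng = np.random.default_rng(2)
        K, J = 200, 4
        y = rng.normal(size=(K, J))          # per-member observables
        Y = np.array([0.3, -0.1, 0.0, 0.2])  # "experimental" averages
        sigma = 0.1 * np.ones(J)             # experimental errors
        w0 = np.full(K, 1.0 / K)             # reference (prior) weights
        theta = 5.0                          # confidence in the prior ensemble

        def neg_log_posterior(g):
            w = np.exp(g - g.max()); w /= w.sum()   # softmax -> valid weights
            chi2 = np.sum(((w @ y - Y) / sigma) ** 2)
            s_rel = np.sum(w * np.log(w / w0))      # relative entropy to prior
            return 0.5 * chi2 + theta * s_rel

        res = minimize(neg_log_posterior, np.zeros(K), method="L-BFGS-B")
        w = np.exp(res.x - res.x.max()); w /= w.sum()
        print("reweighted averages:", w @ y)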

  2. Delamination detection using methods of computational intelligence

    NASA Astrophysics Data System (ADS)

    Ihesiulor, Obinna K.; Shankar, Krishna; Zhang, Zhifang; Ray, Tapabrata

    2012-11-01

    A reliable delamination prediction scheme is indispensable for preventing potentially catastrophic failures in composite structures. The existence of delaminations changes the vibration characteristics of composite laminates, so such indicators can be used to quantify the health of laminates. This paper presents an approach for online health monitoring of in-service composite laminates that relies on methods of computational intelligence. Typical changes in the observed vibration characteristics (i.e. changes in natural frequencies) are used as inputs to identify the existence, location and magnitude of delaminations. The performance of the proposed approach is demonstrated using numerical models of composite laminates. Since this identification problem essentially involves solving an optimization problem, using finite element (FE) methods as the underlying analysis tool is computationally expensive. A surrogate-assisted optimization approach is hence introduced to keep the computational time within affordable limits. An artificial neural network (ANN) model with Bayesian regularization is used as the underlying approximation scheme, while an improved rate of convergence is achieved using a memetic algorithm. Because building ANN surrogate models usually requires large training datasets, k-means clustering is employed to reduce the size of the datasets. The ANN is also used via inverse modeling to determine the location and size of delaminations from changes in measured natural frequencies. The results clearly highlight the efficiency and robustness of the approach.
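
    The dataset-reduction step can be illustrated with a toy surrogate-modeling pipeline in Python: k-means centers replace the full training set before fitting a small neural network. Note that scikit-learn's MLPRegressor uses a plain L2 penalty (alpha) rather than true Bayesian regularization, and the data here are synthetic stand-ins for FE results.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.neural_network import MLPRegressor

        # Toy stand-in for FE data: inputs = 4 natural-frequency shifts,
        # output = delamination size (all synthetic).
        rng = np.random.default_rng(3)
        X = rng.uniform(0, 1, size=(5000, 4))
        y = X @ np.array([2.0, -1.0, 0.5, 1.5]) + 0.1 * rng.normal(size=5000)

        # K-means reduction: keep cluster centers (and their mean response)
        # as a small, space-filling training set instead of all 5000 runs.
        k = 200
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        X_small = km.cluster_centers_
        y_small = np.array([y[km.labels_ == i].mean() for i in range(k)])

        surrogate = MLPRegressor(hidden_layer_sizes=(16,), alpha=1e-3,
                                 max_iter=5000, random_state=0).fit(X_small, y_small)
        print(surrogate.score(X, y))   # surrogate quality on the full dataset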

  3. Hip fracture in the elderly: a re-analysis of the EPIDOS study with causal Bayesian networks.

    PubMed

    Caillet, Pascal; Klemm, Sarah; Ducher, Michel; Aussem, Alexandre; Schott, Anne-Marie

    2015-01-01

    Hip fractures commonly result in permanent disability, institutionalization or death in the elderly. Existing hip-fracture prediction tools are underused in clinical practice, partly because they lack an intuitive interpretation. By use of a graphical layer, Bayesian network models could increase the attractiveness of fracture prediction tools. Our aim was to study the potential contribution of a causal Bayesian network in this clinical setting. A logistic regression was performed as a standard control approach to check the robustness of the causal Bayesian network approach. EPIDOS is a multicenter study, conducted in an ambulatory care setting in five French cities between 1992 and 1996 and updated in 2010. The study included 7598 women aged 75 years or older, in whom fractures were assessed quarterly during 4 years. A causal Bayesian network and a logistic regression were fitted to the EPIDOS data to describe the major variables involved in hip fracture occurrence. Both models gave similar association estimates and predictive performances. They identified gait speed and bone mineral density as the variables most involved in the fracture process. The causal Bayesian network showed that gait speed and bone mineral density were directly connected to fracture and seemed to mediate the influence of all the other variables included in our model. The logistic regression approach detected multiple interactions involving psychotropic drug use, age and bone mineral density. Both approaches retrieved similar variables as predictors of hip fractures. However, the Bayesian network highlighted the whole web of relations among the variables involved in the analysis, suggesting a possible mechanism leading to hip fracture. According to these results, interventions focusing concomitantly on gait speed and bone mineral density may be necessary for optimal prevention of hip fracture in elderly people.

  4. Accounting for nonsampling error in estimates of HIV epidemic trends from antenatal clinic sentinel surveillance

    PubMed Central

    Eaton, Jeffrey W.; Bao, Le

    2017-01-01

    Objectives The aim of the study was to propose and demonstrate an approach that allows additional nonsampling uncertainty about HIV prevalence measured at antenatal clinic sentinel surveillance (ANC-SS) in model-based inferences about trends in HIV incidence and prevalence. Design Mathematical model fitted to surveillance data with Bayesian inference. Methods We introduce a variance inflation parameter σ²_infl that accounts for the uncertainty of nonsampling errors in ANC-SS prevalence. It is additive to the sampling error variance. Three approaches are tested for estimating σ²_infl using ANC-SS and household survey data from 40 subnational regions in nine countries in sub-Saharan Africa, as defined in the UNAIDS 2016 estimates. Methods were compared using in-sample fit and out-of-sample prediction of ANC-SS data, fit to household survey prevalence data, and the computational implications. Results Introducing the additional variance parameter σ²_infl increased the error variance around ANC-SS prevalence observations by a median factor of 2.7 (interquartile range 1.9-3.8). Using only sampling error in ANC-SS prevalence (σ²_infl = 0), coverage of 95% prediction intervals was 69% in out-of-sample prediction tests; this increased to 90% after introducing the additional variance parameter σ²_infl. The revised probabilistic model improved model fit to household survey prevalence and increased epidemic uncertainty intervals most during the early epidemic period before 2005. Estimating σ²_infl did not increase the computational cost of model fitting. Conclusions We recommend estimating nonsampling error in ANC-SS as an additional parameter in Bayesian inference using the Estimation and Projection Package model. This approach may prove useful for incorporating other data sources, such as routine prevalence from prevention of mother-to-child transmission (PMTCT) testing, into future epidemic estimates. PMID:28296801
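
    The core idea, an additive nonsampling variance term in the likelihood, can be sketched in a few lines of Python. Here the trend is treated as known and only σ²_infl is estimated by maximum likelihood, which is a simplification of the paper's full Bayesian treatment; all numbers are synthetic.

        import numpy as np
        from scipy.optimize import minimize_scalar
        from scipy.stats import norm

        # Toy ANC-SS-style data: observed logit prevalence over years, with
        # known sampling variances v, scattering around a "true" trend mu by
        # more than sampling error alone (all values synthetic).
        rng = np.random.default_rng(4)
        T = 15
        mu = np.linspace(-2.0, -1.2, T)        # assumed known trend (logit scale)
        v = np.full(T, 0.02)                   # sampling variances
        sigma2_infl_true = 0.05
        obs = mu + rng.normal(0, np.sqrt(v + sigma2_infl_true))

        def neg_loglik(s2):
            # total variance = sampling variance + nonsampling inflation
            return -norm.logpdf(obs, loc=mu, scale=np.sqrt(v + s2)).sum()

        fit = minimize_scalar(neg_loglik, bounds=(1e-6, 1.0), method="bounded")
        print("estimated sigma^2_infl:", fit.x)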

  5. Bayesian Meta-Analysis of Coefficient Alpha

    ERIC Educational Resources Information Center

    Brannick, Michael T.; Zhang, Nanhua

    2013-01-01

    The current paper describes and illustrates a Bayesian approach to the meta-analysis of coefficient alpha. Alpha is the most commonly used estimate of the reliability or consistency (freedom from measurement error) for educational and psychological measures. The conventional approach to meta-analysis uses inverse variance weights to combine…

  6. An evaluation of the Bayesian approach to fitting the N-mixture model for use with pseudo-replicated count data

    USGS Publications Warehouse

    Toribo, S.G.; Gray, B.R.; Liang, S.

    2011-01-01

    The N-mixture model proposed by Royle in 2004 may be used to approximate the abundance and detection probability of animal species in a given region. In 2006, Royle and Dorazio discussed the advantages of using a Bayesian approach in modelling animal abundance and occurrence using a hierarchical N-mixture model. N-mixture models assume replication on sampling sites, an assumption that may be violated when the site is not closed to changes in abundance during the survey period or when nominal replicates are defined spatially. In this paper, we studied the robustness of a Bayesian approach to fitting the N-mixture model for pseudo-replicated count data. Our simulation results showed that the Bayesian estimates for abundance and detection probability are slightly biased when the actual detection probability is small and are sensitive to the presence of extra variability within local sites.
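
    For reference, the N-mixture likelihood at a single site marginalizes the latent abundance N out of a Poisson-binomial hierarchy. A minimal Python sketch (with an arbitrary truncation point n_max for the infinite sum) is:

        import numpy as np
        from scipy.stats import binom, poisson

        def nmixture_loglik(lam, p, counts, n_max=200):
            """Log-likelihood of replicated counts at one site under the
            N-mixture model: N ~ Poisson(lam), y_j | N ~ Binomial(N, p),
            with N marginalized out by truncated summation."""
            N = np.arange(n_max + 1)
            log_prior = poisson.logpmf(N, lam)
            # sum over replicate surveys of log P(y_j | N, p), per candidate N
            log_lik = sum(binom.logpmf(y, N, p) for y in counts)
            joint = log_prior + log_lik
            m = joint.max()
            return m + np.log(np.exp(joint - m).sum())   # log-sum-exp over N

        print(nmixture_loglik(lam=10.0, p=0.4, counts=[4, 5, 3]))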

  7. Spatiotemporal Bayesian analysis of Lyme disease in New York state, 1990-2000.

    PubMed

    Chen, Haiyan; Stratton, Howard H; Caraco, Thomas B; White, Dennis J

    2006-07-01

    Mapping ordinarily increases our understanding of nontrivial spatial and temporal heterogeneities in disease rates. However, the large number of parameters required by the corresponding statistical models often complicates detailed analysis. This study investigates the feasibility of a fully Bayesian hierarchical regression approach to the problem and identifies how it outperforms two more popular methods: crude rate estimates (CRE) and empirical Bayes standardization (EBS). In particular, we apply a fully Bayesian approach to the spatiotemporal analysis of Lyme disease incidence in New York state for the period 1990-2000. These results are compared with those obtained by CRE and EBS in Chen et al. (2005). We show that the fully Bayesian regression model not only gives more reliable estimates of disease rates than the other two approaches but also allows for tractable models that can accommodate more numerous sources of variation and unknown parameters.
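
    The EBS baseline can be illustrated with a generic Poisson-gamma smoother, in which crude area rates are shrunk toward the global mean in proportion to population size; this is a common recipe and not necessarily the exact EBS variant used in the cited comparison.

        import numpy as np

        # Generic Poisson-gamma empirical Bayes smoothing of area disease rates:
        # y cases, n person-years; rates lambda_i ~ Gamma(alpha, beta) a priori,
        # hyperparameters set by method of moments from the crude rates.
        rng = np.random.default_rng(5)
        n = rng.integers(1_000, 100_000, size=50).astype(float)   # person-years
        true_rate = rng.gamma(2.0, 1e-4, size=50)
        y = rng.poisson(true_rate * n).astype(float)

        crude = y / n
        mean, var = crude.mean(), crude.var()
        extra = max(var - np.mean(crude / n), 1e-12)   # between-area variance
        beta = mean / extra
        alpha = mean * beta

        # Posterior mean shrinks crude rates toward the global mean,
        # more strongly where the population n_i is small.
        smoothed = (y + alpha) / (n + beta)
        print(np.column_stack([crude[:5], smoothed[:5]]))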

  8. Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

    PubMed

    Shen, Yanna; Cooper, Gregory F

    2012-09-01

    This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments suggesting that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  9. Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum

    2006-01-01

    A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…

  10. Bayesian Data-Model Fit Assessment for Structural Equation Modeling

    ERIC Educational Resources Information Center

    Levy, Roy

    2011-01-01

    Bayesian approaches to modeling are receiving an increasing amount of attention in the areas of model construction and estimation in factor analysis, structural equation modeling (SEM), and related latent variable models. However, model diagnostics and model criticism remain relatively understudied aspects of Bayesian SEM. This article describes…

  11. Bayesian estimation of seasonal course of canopy leaf area index from hyperspectral satellite data

    NASA Astrophysics Data System (ADS)

    Varvia, Petri; Rautiainen, Miina; Seppänen, Aku

    2018-03-01

    In this paper, Bayesian inversion of a physically-based forest reflectance model is investigated to estimate boreal forest canopy leaf area index (LAI) from EO-1 Hyperion hyperspectral data. The data consist of multiple forest stands with different species compositions and structures, imaged in three phases of the growing season. The Bayesian estimates of canopy LAI are compared to reference estimates based on a spectral vegetation index. The forest reflectance model also contains unknown variables other than LAI, for example leaf single scattering albedo and understory reflectance; in the Bayesian approach, these variables are estimated simultaneously with LAI, and the feasibility and seasonal variation of their estimates are also examined. Credible intervals for the estimates are calculated and evaluated. The results show that the Bayesian inversion approach performs significantly better than a comparable spectral vegetation index regression.
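
    Schematically, such an inversion amounts to sampling the posterior over LAI given observed band reflectances. The Python sketch below runs a random-walk Metropolis sampler on a toy saturating reflectance model; the model form, noise level, and prior bounds are invented stand-ins for the physically-based forest reflectance model.

        import numpy as np

        def forward(lai):
            # toy "reflectance model": 3 bands decaying with LAI (invented)
            k = np.array([0.4, 0.6, 0.8])
            return 0.05 + 0.3 * np.exp(-k * lai)

        rng = np.random.default_rng(6)
        lai_true = 3.0
        obs = forward(lai_true) + rng.normal(0, 0.005, 3)  # noisy "satellite" bands
        sigma = 0.005

        def log_post(lai):
            if not 0.0 < lai < 10.0:            # flat prior on (0, 10)
                return -np.inf
            resid = obs - forward(lai)
            return -0.5 * np.sum((resid / sigma) ** 2)

        # random-walk Metropolis over LAI
        samples, lai = [], 1.0
        lp = log_post(lai)
        for _ in range(20_000):
            prop = lai + rng.normal(0, 0.2)
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp:
                lai, lp = prop, lp_prop
            samples.append(lai)
        post = np.array(samples[5_000:])
        print(post.mean(), np.percentile(post, [2.5, 97.5]))  # estimate + CRI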

  12. Predicting Football Matches Results using Bayesian Networks for English Premier League (EPL)

    NASA Astrophysics Data System (ADS)

    Razali, Nazim; Mustapha, Aida; Yatim, Faiz Ahmad; Aziz, Ruhaya Ab

    2017-08-01

    Modeling association football prediction has become increasingly popular in the last few years, and many different prediction models have been proposed with the aim of evaluating the attributes that lead a football team to lose, draw or win a match. Three types of approaches have been considered for predicting football match results: statistical approaches, machine learning approaches and Bayesian approaches. Lately, many studies on football prediction models have used Bayesian approaches. This paper proposes Bayesian networks (BNs) to predict the results of football matches in terms of home win (H), away win (A) and draw (D). The English Premier League (EPL) data for the three seasons 2010-2011, 2011-2012 and 2012-2013 were selected and reviewed. K-fold cross-validation was used to test the accuracy of the prediction model. The required football data were sourced from a legitimate site at http://www.football-data.co.uk. The BNs achieved a predictive accuracy of 75.09% on average across the three seasons. It is hoped that the results can serve as a benchmark for future research in predicting football match results.
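
    The k-fold evaluation protocol can be illustrated as below. For a self-contained sketch, scikit-learn's CategoricalNB (a naive Bayes classifier, i.e. a Bayesian network with a fixed star-shaped structure) stands in for the paper's hand-built network, and the match features are synthetic.

        import numpy as np
        from sklearn.model_selection import KFold, cross_val_score
        from sklearn.naive_bayes import CategoricalNB

        # Synthetic match features (all invented): three categorical inputs
        # coded 0/1/2; target: H/D/A coded 0/1/2. ~3 EPL seasons of matches.
        rng = np.random.default_rng(7)
        X = rng.integers(0, 3, size=(1140, 3))
        y = (X.sum(axis=1) + rng.integers(0, 2, 1140)) % 3

        model = CategoricalNB()
        cv = KFold(n_splits=10, shuffle=True, random_state=0)
        scores = cross_val_score(model, X, y, cv=cv)   # 10-fold CV accuracy
        print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")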

  13. Technical Note: Approximate Bayesian parameterization of a complex tropical forest model

    NASA Astrophysics Data System (ADS)

    Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.

    2013-08-01

    Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
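
    A parametric likelihood approximation of this kind can be sketched as a synthetic-likelihood MCMC: repeated simulator runs at a proposed parameter value yield summary statistics, a Gaussian is fitted to them, and that Gaussian scores the observed summaries inside a Metropolis loop. The toy simulator below stands in for the forest model; all values are invented.

        import numpy as np

        rng = np.random.default_rng(8)

        def simulate_summaries(theta, n_rep=50):
            # stochastic "model": repeated runs, reduced to summary statistics
            runs = rng.normal(theta, 1.0, size=(n_rep, 100))
            return np.column_stack([runs.mean(1), runs.std(1)])

        s_obs = np.array([2.0, 1.0])       # "observed" summaries (virtual data)

        def synthetic_loglik(theta):
            s = simulate_summaries(theta)
            mu, cov = s.mean(0), np.cov(s.T) + 1e-9 * np.eye(2)
            d = s_obs - mu
            return -0.5 * (d @ np.linalg.solve(cov, d) + np.log(np.linalg.det(cov)))

        theta, ll, chain = 0.0, synthetic_loglik(0.0), []
        for _ in range(2_000):
            prop = theta + rng.normal(0, 0.3)
            ll_prop = synthetic_loglik(prop)
            if np.log(rng.uniform()) < ll_prop - ll:
                theta, ll = prop, ll_prop
            chain.append(theta)
        print(np.mean(chain[500:]))        # should recover theta near 2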

  14. A Surrogate-based Adaptive Sampling Approach for History Matching and Uncertainty Quantification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Weixuan; Zhang, Dongxiao; Lin, Guang

    A critical procedure in reservoir simulations is history matching (or data assimilation in a broader sense), which calibrates model parameters such that the simulation results are consistent with field measurements, and hence improves the credibility of the predictions given by the simulations. Often there exist non-unique combinations of parameter values that all yield simulation results matching the measurements. For such ill-posed history matching problems, Bayes' theorem provides a theoretical foundation to represent different solutions and to quantify the uncertainty with the posterior PDF. Lacking an analytical solution in most situations, the posterior PDF may be characterized with a sample of realizations, each representing a possible scenario. A novel sampling algorithm is presented here for the Bayesian solutions to history matching problems. We aim to deal with two commonly encountered issues: 1) as a result of the nonlinear input-output relationship in a reservoir model, the posterior distribution could be in a complex form, such as multimodal, which violates the Gaussian assumption required by most of the commonly used data assimilation approaches; 2) a typical sampling method requires intensive model evaluations and hence may cause unaffordable computational cost. In the developed algorithm, we use a Gaussian mixture model as the proposal distribution in the sampling process, which is simple but also flexible enough to approximate non-Gaussian distributions and is particularly efficient when the posterior is multimodal. Also, a Gaussian process is utilized as a surrogate model to speed up the sampling process. Furthermore, an iterative scheme of adaptive surrogate refinement and re-sampling ensures sampling accuracy while keeping the computational cost at a minimum level. The developed approach is demonstrated with an illustrative example and shows its capability in handling the above-mentioned issues. The multimodal posterior of the history matching problem is captured and used to give a reliable production prediction with uncertainty quantification. The new algorithm shows a great improvement in computational efficiency compared with previously studied approaches for the same problem.
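
    One simplified iteration of the two key ingredients, a cheap surrogate for the expensive posterior and a Gaussian mixture proposal able to capture multiple modes, might look as follows in Python with scikit-learn; the bimodal toy posterior stands in for the reservoir history-matching posterior, and the adaptive refinement loop is omitted.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(9)

        def expensive_log_post(x):
            # toy bimodal posterior standing in for simulator + data misfit
            return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

        # initial design runs of the "expensive" model train the GP surrogate
        X_train = rng.uniform(-5, 5, 40).reshape(-1, 1)
        y_train = expensive_log_post(X_train[:, 0])
        gp = GaussianProcessRegressor(normalize_y=True).fit(X_train, y_train)

        # score candidates with the cheap surrogate, keep by rejection
        # sampling, then fit the Gaussian mixture used as the next proposal
        cand = rng.uniform(-5, 5, size=(5000, 1))
        logw = gp.predict(cand)
        w = np.exp(logw - logw.max())
        keep = cand[rng.uniform(size=5000) < w]
        gmm = GaussianMixture(n_components=2, random_state=0).fit(keep)
        print(np.sort(gmm.means_.ravel()))   # should sit near the modes, -2 and 2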

  15. A Bayesian approach to meta-analysis of plant pathology studies.

    PubMed

    Mila, A L; Ngugi, H K

    2011-01-01

    Bayesian statistical methods are used for meta-analysis in many disciplines, including medicine, molecular biology, and engineering, but have not yet been applied for quantitative synthesis of plant pathology studies. In this paper, we illustrate the key concepts of Bayesian statistics and outline the differences between Bayesian and classical (frequentist) methods in the way parameters describing population attributes are considered. We then describe a Bayesian approach to meta-analysis and present a plant pathological example based on studies evaluating the efficacy of plant protection products that induce systemic acquired resistance for the management of fire blight of apple. In a simple random-effects model assuming a normal distribution of effect sizes and no prior information (i.e., a noninformative prior), the results of the Bayesian meta-analysis are similar to those obtained with classical methods. Implementing the same model with a Student's t distribution and a noninformative prior for the effect sizes, instead of a normal distribution, yields similar results for all but acibenzolar-S-methyl (Actigard), which was evaluated in only seven studies in this example. Whereas both the classical (P = 0.28) and the Bayesian analysis with a noninformative prior (95% credibility interval [CRI] for the log response ratio: -0.63 to 0.08) indicate a nonsignificant effect for Actigard, specifying a t distribution resulted in a significant, albeit variable, effect for this product (CRI: -0.73 to -0.10). These results confirm the sensitivity of the analytical outcome (i.e., the posterior distribution) to the choice of prior in Bayesian meta-analyses involving a limited number of studies. We review some pertinent literature on more advanced topics, including modeling of among-study heterogeneity, publication bias, analyses involving a limited number of studies, and methods for dealing with missing data, and show how these issues can be approached in a Bayesian framework. Bayesian meta-analysis can readily include information not easily incorporated in classical methods, and allows for a full evaluation of competing models. Given the power and flexibility of Bayesian methods, we expect them to become widely adopted for meta-analysis of plant pathology studies.
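
    The simple random-effects model described above can be written in a few lines with the PyMC library; the effect sizes and standard errors below are invented and do not reproduce the fire blight data.

        import numpy as np
        import pymc as pm

        # Random-effects meta-analysis sketch: log response ratios y with
        # known within-study standard errors se (numbers invented).
        y = np.array([-0.45, -0.20, -0.55, 0.05, -0.30, -0.40, -0.15])
        se = np.array([0.15, 0.20, 0.25, 0.30, 0.18, 0.22, 0.28])

        with pm.Model() as meta:
            mu = pm.Normal("mu", 0, 10)        # overall effect, vague prior
            tau = pm.HalfNormal("tau", 1)      # among-study heterogeneity
            theta = pm.Normal("theta", mu, tau, shape=len(y))  # study effects
            pm.Normal("y_obs", mu=theta, sigma=se, observed=y)
            idata = pm.sample(2000, tune=1000, chains=2, random_seed=0)

        # Swapping pm.Normal for pm.StudentT on theta gives the heavier-tailed
        # variant discussed in the abstract.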

  16. Radiation dose reduction in computed tomography perfusion using spatial-temporal Bayesian methods

    NASA Astrophysics Data System (ADS)

    Fang, Ruogu; Raj, Ashish; Chen, Tsuhan; Sanelli, Pina C.

    2012-03-01

    In current computed tomography (CT) examinations, the associated X-ray radiation dose is of significant concern to patients and operators, especially in CT perfusion (CTP) imaging, which has a higher radiation dose due to its cine scanning technique. A simple and cost-effective means of performing the examinations is to lower the milliampere-seconds (mAs) parameter as far as reasonably achievable in data acquisition. However, lowering the mAs parameter unavoidably increases data noise and greatly degrades CT perfusion maps if no adequate noise control is applied during image reconstruction. To capture the essential dynamics of CT perfusion, a simple spatial-temporal Bayesian method that uses a piecewise parametric model of the residual function is applied, and the model parameters are then estimated from a Bayesian formulation of prior smoothness constraints on the perfusion parameters. From the fitted residual function, reliable CTP parameter maps are obtained from low-dose CT data. The merit of this scheme lies in combining an analytical piecewise residual function with a Bayesian framework using a simple prior spatial constraint for the CT perfusion application. On a dataset of 22 patients, this dynamic spatial-temporal Bayesian model yielded an increase in signal-to-noise ratio (SNR) of 78% and a decrease in mean-square error (MSE) of 40% at a low radiation dose of 43 mA.
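
    The role of the smoothness prior can be seen in a stripped-down MAP calculation: under a Gaussian noise model and a Gaussian prior on differences of neighboring parameters, the estimate solves a penalized least-squares problem. The sketch below uses a trivial identity forward model and is only a schematic of the smoothness-constraint idea, not of the paper's piecewise residual-function model.

        import numpy as np

        # MAP under Gaussian noise + Gaussian smoothness prior reduces to
        # minimizing ||A x - b||^2 + lam * ||D x||^2 with D a difference operator.
        rng = np.random.default_rng(10)
        n = 100
        x_true = np.sin(np.linspace(0, 3 * np.pi, n))   # smooth "parameter map"
        A = np.eye(n)                                   # trivial forward model
        b = A @ x_true + rng.normal(0, 0.3, n)          # noisy low-dose data

        D = np.diff(np.eye(n), axis=0)                  # first-difference matrix
        lam = 10.0
        # normal equations: (A^T A + lam D^T D) x = A^T b
        x_map = np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ b)
        print("noisy MSE:", np.mean((b - x_true) ** 2),
              "MAP MSE:", np.mean((x_map - x_true) ** 2))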

  17. Tractable Experiment Design via Mathematical Surrogates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Brian J.

    This presentation summarizes the development and implementation of quantitative design criteria motivated by targeted inference objectives for identifying new, potentially expensive computational or physical experiments. The first application is concerned with estimating features of quantities of interest arising from complex computational models, such as quantiles or failure probabilities. A sequential strategy is proposed for iterative refinement of the importance distributions used to efficiently sample the uncertain inputs to the computational model. In the second application, effective use of mathematical surrogates is investigated to help alleviate the analytical and numerical intractability often associated with Bayesian experiment design. This approach allows for the incorporation of prior information into the design process without the need for gross simplification of the design criterion. Illustrative examples of both design problems will be presented as an argument for the relevance of these research problems.

  18. Fitting mechanistic epidemic models to data: A comparison of simple Markov chain Monte Carlo approaches.

    PubMed

    Li, Michael; Dushoff, Jonathan; Bolker, Benjamin M

    2018-07-01

    Simple mechanistic epidemic models are widely used for forecasting and parameter estimation of infectious diseases based on noisy case reporting data. Despite the widespread application of models to emerging infectious diseases, we know little about the comparative performance of standard computational-statistical frameworks in these contexts. Here we build a simple stochastic, discrete-time, discrete-state epidemic model with both process and observation error and use it to characterize the effectiveness of different flavours of Bayesian Markov chain Monte Carlo (MCMC) techniques. We use fits to simulated data, where parameters (and future behaviour) are known, to explore the limitations of different platforms and quantify parameter estimation accuracy, forecasting accuracy, and computational efficiency across combinations of modeling decisions (e.g. discrete vs. continuous latent states, levels of stochasticity) and computational platforms (JAGS, NIMBLE, Stan).
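
    The kind of model being fitted can be sketched directly: a discrete-time, discrete-state SIR process with binomial process noise and binomial reporting error. All parameter values below are invented for illustration.

        import numpy as np

        # Minimal discrete-time stochastic SIR with observation error.
        rng = np.random.default_rng(11)
        N, I = 10_000, 10
        S = N - I
        beta, gamma, report_p = 0.5, 0.25, 0.6
        obs = []
        for t in range(60):
            p_inf = 1.0 - np.exp(-beta * I / N)   # per-susceptible infection prob
            new_inf = rng.binomial(S, p_inf)      # process error
            new_rec = rng.binomial(I, 1.0 - np.exp(-gamma))
            S, I = S - new_inf, I + new_inf - new_rec
            obs.append(rng.binomial(new_inf, report_p))   # observation error
        print(obs[:10])   # noisy reported case counts, the data being fitted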

  19. Can computational goals inform theories of vision?

    PubMed

    Anderson, Barton L

    2015-04-01

    One of the most lasting contributions of Marr's posthumous book is his articulation of the different "levels of analysis" that are needed to understand vision. Although a variety of work has examined how these different levels are related, there is comparatively little examination of the assumptions on which his proposed levels rest, or the plausibility of the approach Marr articulated given those assumptions. Marr placed particular significance on computational level theory, which specifies the "goal" of a computation, its appropriateness for solving a particular problem, and the logic by which it can be carried out. The structure of computational level theory is inherently teleological: What the brain does is described in terms of its purpose. I argue that computational level theory, and the reverse-engineering approach it inspires, requires understanding the historical trajectory that gave rise to functional capacities that can be meaningfully attributed with some sense of purpose or goal, that is, a reconstruction of the fitness function on which natural selection acted in shaping our visual abilities. I argue that this reconstruction is required to distinguish abilities shaped by natural selection-"natural tasks" -from evolutionary "by-products" (spandrels, co-optations, and exaptations), rather than merely demonstrating that computational goals can be embedded in a Bayesian model that renders a particular behavior or process rational. Copyright © 2015 Cognitive Science Society, Inc.

  20. A Bayesian Approach for Analyzing Longitudinal Structural Equation Models

    ERIC Educational Resources Information Center

    Song, Xin-Yuan; Lu, Zhao-Hua; Hser, Yih-Ing; Lee, Sik-Yum

    2011-01-01

    This article considers a Bayesian approach for analyzing a longitudinal 2-level nonlinear structural equation model with covariates, and mixed continuous and ordered categorical variables. The first-level model is formulated for measures taken at each time point nested within individuals for investigating their characteristics that are dynamically…
