Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation.
Moeyaert, Mariola; Rindskopf, David; Onghena, Patrick; Van den Noortgate, Wim
2017-12-01
The focus of this article is to describe Bayesian estimation, including construction of prior distributions, and to compare parameter recovery under the Bayesian framework (using weakly informative priors) and the maximum likelihood (ML) framework in the context of multilevel modeling of single-case experimental data. Bayesian estimation results were found similar to ML estimation results in terms of the treatment effect estimates, regardless of the functional form and degree of information included in the prior specification in the Bayesian framework. In terms of the variance component estimates, both the ML and Bayesian estimation procedures result in biased and less precise variance estimates when the number of participants is small (i.e., 3). By increasing the number of participants to 5 or 7, the relative bias is close to 5% and more precise estimates are obtained for all approaches, except for the inverse-Wishart prior using the identity matrix. When a more informative prior was added, more precise estimates for the fixed effects and random effects were obtained, even when only 3 participants were included. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Schwartz, Rachel S; Mueller, Rachel L
2010-01-11
Estimates of divergence dates between species improve our understanding of processes ranging from nucleotide substitution to speciation. Such estimates are frequently based on molecular genetic differences between species; therefore, they rely on accurate estimates of the number of such differences (i.e. substitutions per site, measured as branch length on phylogenies). We used simulations to determine the effects of dataset size, branch length heterogeneity, branch depth, and analytical framework on branch length estimation across a range of branch lengths. We then reanalyzed an empirical dataset for plethodontid salamanders to determine how inaccurate branch length estimation can affect estimates of divergence dates. The accuracy of branch length estimation varied with branch length, dataset size (both number of taxa and sites), branch length heterogeneity, branch depth, dataset complexity, and analytical framework. For simple phylogenies analyzed in a Bayesian framework, branches were increasingly underestimated as branch length increased; in a maximum likelihood framework, longer branch lengths were somewhat overestimated. Longer datasets improved estimates in both frameworks; however, when the number of taxa was increased, estimation accuracy for deeper branches was less than for tip branches. Increasing the complexity of the dataset produced more misestimated branches in a Bayesian framework; however, in an ML framework, more branches were estimated more accurately. Using ML branch length estimates to re-estimate plethodontid salamander divergence dates generally resulted in an increase in the estimated age of older nodes and a decrease in the estimated age of younger nodes. Branch lengths are misestimated in both statistical frameworks for simulations of simple datasets. However, for complex datasets, length estimates are quite accurate in ML (even for short datasets), whereas few branches are estimated accurately in a Bayesian framework. Our reanalysis of empirical data demonstrates the magnitude of effects of Bayesian branch length misestimation on divergence date estimates. Because the length of branches for empirical datasets can be estimated most reliably in an ML framework when branches are <1 substitution/site and datasets are > or =1 kb, we suggest that divergence date estimates using datasets, branch lengths, and/or analytical techniques that fall outside of these parameters should be interpreted with caution.
Bayesian Framework for Water Quality Model Uncertainty Estimation and Risk Management
A formal Bayesian methodology is presented for integrated model calibration and risk-based water quality management using Bayesian Monte Carlo simulation and maximum likelihood estimation (BMCML). The primary focus is on lucid integration of model calibration with risk-based wat...
A Bayesian framework to estimate diversification rates and their variation through time and space
2011-01-01
Background Patterns of species diversity are the result of speciation and extinction processes, and molecular phylogenetic data can provide valuable information to derive their variability through time and across clades. Bayesian Markov chain Monte Carlo methods offer a promising framework to incorporate phylogenetic uncertainty when estimating rates of diversification. Results We introduce a new approach to estimate diversification rates in a Bayesian framework over a distribution of trees under various constant and variable rate birth-death and pure-birth models, and test it on simulated phylogenies. Furthermore, speciation and extinction rates and their posterior credibility intervals can be estimated while accounting for non-random taxon sampling. The framework is particularly suitable for hypothesis testing using Bayes factors, as we demonstrate analyzing dated phylogenies of Chondrostoma (Cyprinidae) and Lupinus (Fabaceae). In addition, we develop a model that extends the rate estimation to a meta-analysis framework in which different data sets are combined in a single analysis to detect general temporal and spatial trends in diversification. Conclusions Our approach provides a flexible framework for the estimation of diversification parameters and hypothesis testing while simultaneously accounting for uncertainties in the divergence times and incomplete taxon sampling. PMID:22013891
Uncertainty estimation of Intensity-Duration-Frequency relationships: A regional analysis
NASA Astrophysics Data System (ADS)
Mélèse, Victor; Blanchet, Juliette; Molinié, Gilles
2018-03-01
We propose in this article a regional study of uncertainties in IDF curves derived from point-rainfall maxima. We develop two generalized extreme value models based on the simple scaling assumption, first in the frequentist framework and second in the Bayesian framework. Within the frequentist framework, uncertainties are obtained i) from the Gaussian density stemming from the asymptotic normality theorem of the maximum likelihood and ii) with a bootstrap procedure. Within the Bayesian framework, uncertainties are obtained from the posterior densities. We confront these two frameworks on the same database covering a large region of 100, 000 km2 in southern France with contrasted rainfall regime, in order to be able to draw conclusion that are not specific to the data. The two frameworks are applied to 405 hourly stations with data back to the 1980's, accumulated in the range 3 h-120 h. We show that i) the Bayesian framework is more robust than the frequentist one to the starting point of the estimation procedure, ii) the posterior and the bootstrap densities are able to better adjust uncertainty estimation to the data than the Gaussian density, and iii) the bootstrap density give unreasonable confidence intervals, in particular for return levels associated to large return period. Therefore our recommendation goes towards the use of the Bayesian framework to compute uncertainty.
Varughese, Eunice A.; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer S; Fout, G. Shay; Furlong, Edward T.; Kolpin, Dana W.; Glassmeyer, Susan T.; Keely, Scott P
2017-01-01
incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.
Bayesian Model Averaging for Propensity Score Analysis
ERIC Educational Resources Information Center
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
NASA Astrophysics Data System (ADS)
Eadie, Gwendolyn M.; Springford, Aaron; Harris, William E.
2017-02-01
We present a hierarchical Bayesian method for estimating the total mass and mass profile of the Milky Way Galaxy. The new hierarchical Bayesian approach further improves the framework presented by Eadie et al. and Eadie and Harris and builds upon the preliminary reports by Eadie et al. The method uses a distribution function f({ E },L) to model the Galaxy and kinematic data from satellite objects, such as globular clusters (GCs), to trace the Galaxy’s gravitational potential. A major advantage of the method is that it not only includes complete and incomplete data simultaneously in the analysis, but also incorporates measurement uncertainties in a coherent and meaningful way. We first test the hierarchical Bayesian framework, which includes measurement uncertainties, using the same data and power-law model assumed in Eadie and Harris and find the results are similar but more strongly constrained. Next, we take advantage of the new statistical framework and incorporate all possible GC data, finding a cumulative mass profile with Bayesian credible regions. This profile implies a mass within 125 kpc of 4.8× {10}11{M}⊙ with a 95% Bayesian credible region of (4.0{--}5.8)× {10}11{M}⊙ . Our results also provide estimates of the true specific energies of all the GCs. By comparing these estimated energies to the measured energies of GCs with complete velocity measurements, we observe that (the few) remote tracers with complete measurements may play a large role in determining a total mass estimate of the Galaxy. Thus, our study stresses the need for more remote tracers with complete velocity measurements.
Bayesian parameter estimation for chiral effective field theory
NASA Astrophysics Data System (ADS)
Wesolowski, Sarah; Furnstahl, Richard; Phillips, Daniel; Klco, Natalie
2016-09-01
The low-energy constants (LECs) of a chiral effective field theory (EFT) interaction in the two-body sector are fit to observable data using a Bayesian parameter estimation framework. By using Bayesian prior probability distributions (pdfs), we quantify relevant physical expectations such as LEC naturalness and include them in the parameter estimation procedure. The final result is a posterior pdf for the LECs, which can be used to propagate uncertainty resulting from the fit to data to the final observable predictions. The posterior pdf also allows an empirical test of operator redundancy and other features of the potential. We compare results of our framework with other fitting procedures, interpreting the underlying assumptions in Bayesian probabilistic language. We also compare results from fitting all partial waves of the interaction simultaneously to cross section data compared to fitting to extracted phase shifts, appropriately accounting for correlations in the data. Supported in part by the NSF and DOE.
Probabilistic models in human sensorimotor control
Wolpert, Daniel M.
2009-01-01
Sensory and motor uncertainty form a fundamental constraint on human sensorimotor control. Bayesian decision theory (BDT) has emerged as a unifying framework to understand how the central nervous system performs optimal estimation and control in the face of such uncertainty. BDT has two components: Bayesian statistics and decision theory. Here we review Bayesian statistics and show how it applies to estimating the state of the world and our own body. Recent results suggest that when learning novel tasks we are able to learn the statistical properties of both the world and our own sensory apparatus so as to perform estimation using Bayesian statistics. We review studies which suggest that humans can combine multiple sources of information to form maximum likelihood estimates, can incorporate prior beliefs about possible states of the world so as to generate maximum a posteriori estimates and can use Kalman filter-based processes to estimate time-varying states. Finally, we review Bayesian decision theory in motor control and how the central nervous system processes errors to determine loss functions and optimal actions. We review results that suggest we plan movements based on statistics of our actions that result from signal-dependent noise on our motor outputs. Taken together these studies provide a statistical framework for how the motor system performs in the presence of uncertainty. PMID:17628731
XID+: Next generation XID development
NASA Astrophysics Data System (ADS)
Hurley, Peter
2017-04-01
XID+ is a prior-based source extraction tool which carries out photometry in the Herschel SPIRE (Spectral and Photometric Imaging Receiver) maps at the positions of known sources. It uses a probabilistic Bayesian framework that provides a natural framework in which to include prior information, and uses the Bayesian inference tool Stan to obtain the full posterior probability distribution on flux estimates.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vrugt, Jasper A; Robinson, Bruce A; Ter Braak, Cajo J F
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented usingmore » the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.« less
NASA Astrophysics Data System (ADS)
Sadegh, Mojtaba; Ragno, Elisa; AghaKouchak, Amir
2017-06-01
We present a newly developed Multivariate Copula Analysis Toolbox (MvCAT) which includes a wide range of copula families with different levels of complexity. MvCAT employs a Bayesian framework with a residual-based Gaussian likelihood function for inferring copula parameters and estimating the underlying uncertainties. The contribution of this paper is threefold: (a) providing a Bayesian framework to approximate the predictive uncertainties of fitted copulas, (b) introducing a hybrid-evolution Markov Chain Monte Carlo (MCMC) approach designed for numerical estimation of the posterior distribution of copula parameters, and (c) enabling the community to explore a wide range of copulas and evaluate them relative to the fitting uncertainties. We show that the commonly used local optimization methods for copula parameter estimation often get trapped in local minima. The proposed method, however, addresses this limitation and improves describing the dependence structure. MvCAT also enables evaluation of uncertainties relative to the length of record, which is fundamental to a wide range of applications such as multivariate frequency analysis.
Bayesian estimation of the discrete coefficient of determination.
Chen, Ting; Braga-Neto, Ulisses M
2016-12-01
The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD. We derive analytically the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, exact expressions for its bias, variance, and root-mean-square (RMS) are given. The accuracy of both Bayesian CoD estimators with non-informative and informative priors, under fixed or random parameters, is studied via analytical and numerical approaches. We also demonstrate the application of the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma.
Bayesian hierarchical model for large-scale covariance matrix estimation.
Zhu, Dongxiao; Hero, Alfred O
2007-12-01
Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
A Bayesian estimation of a stochastic predator-prey model of economic fluctuations
NASA Astrophysics Data System (ADS)
Dibeh, Ghassan; Luchinsky, Dmitry G.; Luchinskaya, Daria D.; Smelyanskiy, Vadim N.
2007-06-01
In this paper, we develop a Bayesian framework for the empirical estimation of the parameters of one of the best known nonlinear models of the business cycle: The Marx-inspired model of a growth cycle introduced by R. M. Goodwin. The model predicts a series of closed cycles representing the dynamics of labor's share and the employment rate in the capitalist economy. The Bayesian framework is used to empirically estimate a modified Goodwin model. The original model is extended in two ways. First, we allow for exogenous periodic variations of the otherwise steady growth rates of the labor force and productivity per worker. Second, we allow for stochastic variations of those parameters. The resultant modified Goodwin model is a stochastic predator-prey model with periodic forcing. The model is then estimated using a newly developed Bayesian estimation method on data sets representing growth cycles in France and Italy during the years 1960-2005. Results show that inference of the parameters of the stochastic Goodwin model can be achieved. The comparison of the dynamics of the Goodwin model with the inferred values of parameters demonstrates quantitative agreement with the growth cycle empirical data.
Statistical estimation via convex optimization for trending and performance monitoring
NASA Astrophysics Data System (ADS)
Samar, Sikandar
This thesis presents an optimization-based statistical estimation approach to find unknown trends in noisy data. A Bayesian framework is used to explicitly take into account prior information about the trends via trend models and constraints. The main focus is on convex formulation of the Bayesian estimation problem, which allows efficient computation of (globally) optimal estimates. There are two main parts of this thesis. The first part formulates trend estimation in systems described by known detailed models as a convex optimization problem. Statistically optimal estimates are then obtained by maximizing a concave log-likelihood function subject to convex constraints. We consider the problem of increasing problem dimension as more measurements become available, and introduce a moving horizon framework to enable recursive estimation of the unknown trend by solving a fixed size convex optimization problem at each horizon. We also present a distributed estimation framework, based on the dual decomposition method, for a system formed by a network of complex sensors with local (convex) estimation. Two specific applications of the convex optimization-based Bayesian estimation approach are described in the second part of the thesis. Batch estimation for parametric diagnostics in a flight control simulation of a space launch vehicle is shown to detect incipient fault trends despite the natural masking properties of feedback in the guidance and control loops. Moving horizon approach is used to estimate time varying fault parameters in a detailed nonlinear simulation model of an unmanned aerial vehicle. An excellent performance is demonstrated in the presence of winds and turbulence.
Bayesian analysis of rare events
NASA Astrophysics Data System (ADS)
Straub, Daniel; Papaioannou, Iason; Betz, Wolfgang
2016-06-01
In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into the probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.
Bayesian Image Segmentations by Potts Prior and Loopy Belief Propagation
NASA Astrophysics Data System (ADS)
Tanaka, Kazuyuki; Kataoka, Shun; Yasuda, Muneki; Waizumi, Yuji; Hsu, Chiou-Ting
2014-12-01
This paper presents a Bayesian image segmentation model based on Potts prior and loopy belief propagation. The proposed Bayesian model involves several terms, including the pairwise interactions of Potts models, and the average vectors and covariant matrices of Gauss distributions in color image modeling. These terms are often referred to as hyperparameters in statistical machine learning theory. In order to determine these hyperparameters, we propose a new scheme for hyperparameter estimation based on conditional maximization of entropy in the Potts prior. The algorithm is given based on loopy belief propagation. In addition, we compare our conditional maximum entropy framework with the conventional maximum likelihood framework, and also clarify how the first order phase transitions in loopy belief propagations for Potts models influence our hyperparameter estimation procedures.
NASA Astrophysics Data System (ADS)
Ebrahimian, Hamed; Astroza, Rodrigo; Conte, Joel P.; de Callafon, Raymond A.
2017-02-01
This paper presents a framework for structural health monitoring (SHM) and damage identification of civil structures. This framework integrates advanced mechanics-based nonlinear finite element (FE) modeling and analysis techniques with a batch Bayesian estimation approach to estimate time-invariant model parameters used in the FE model of the structure of interest. The framework uses input excitation and dynamic response of the structure and updates a nonlinear FE model of the structure to minimize the discrepancies between predicted and measured response time histories. The updated FE model can then be interrogated to detect, localize, classify, and quantify the state of damage and predict the remaining useful life of the structure. As opposed to recursive estimation methods, in the batch Bayesian estimation approach, the entire time history of the input excitation and output response of the structure are used as a batch of data to estimate the FE model parameters through a number of iterations. In the case of non-informative prior, the batch Bayesian method leads to an extended maximum likelihood (ML) estimation method to estimate jointly time-invariant model parameters and the measurement noise amplitude. The extended ML estimation problem is solved efficiently using a gradient-based interior-point optimization algorithm. Gradient-based optimization algorithms require the FE response sensitivities with respect to the model parameters to be identified. The FE response sensitivities are computed accurately and efficiently using the direct differentiation method (DDM). The estimation uncertainties are evaluated based on the Cramer-Rao lower bound (CRLB) theorem by computing the exact Fisher Information matrix using the FE response sensitivities with respect to the model parameters. The accuracy of the proposed uncertainty quantification approach is verified using a sampling approach based on the unscented transformation. Two validation studies, based on realistic structural FE models of a bridge pier and a moment resisting steel frame, are performed to validate the performance and accuracy of the presented nonlinear FE model updating approach and demonstrate its application to SHM. These validation studies show the excellent performance of the proposed framework for SHM and damage identification even in the presence of high measurement noise and/or way-out initial estimates of the model parameters. Furthermore, the detrimental effects of the input measurement noise on the performance of the proposed framework are illustrated and quantified through one of the validation studies.
A Bayesian perspective on magnitude estimation.
Petzschner, Frederike H; Glasauer, Stefan; Stephan, Klaas E
2015-05-01
Our representation of the physical world requires judgments of magnitudes, such as loudness, distance, or time. Interestingly, magnitude estimates are often not veridical but subject to characteristic biases. These biases are strikingly similar across different sensory modalities, suggesting common processing mechanisms that are shared by different sensory systems. However, the search for universal neurobiological principles of magnitude judgments requires guidance by formal theories. Here, we discuss a unifying Bayesian framework for understanding biases in magnitude estimation. This Bayesian perspective enables a re-interpretation of a range of established psychophysical findings, reconciles seemingly incompatible classical views on magnitude estimation, and can guide future investigations of magnitude estimation and its neurobiological mechanisms in health and in psychiatric diseases, such as schizophrenia. Copyright © 2015 Elsevier Ltd. All rights reserved.
Jiang, Zhehan; Skorupski, William
2017-12-12
In many behavioral research areas, multivariate generalizability theory (mG theory) has been typically used to investigate the reliability of certain multidimensional assessments. However, traditional mG-theory estimation-namely, using frequentist approaches-has limits, leading researchers to fail to take full advantage of the information that mG theory can offer regarding the reliability of measurements. Alternatively, Bayesian methods provide more information than frequentist approaches can offer. This article presents instructional guidelines on how to implement mG-theory analyses in a Bayesian framework; in particular, BUGS code is presented to fit commonly seen designs from mG theory, including single-facet designs, two-facet crossed designs, and two-facet nested designs. In addition to concrete examples that are closely related to the selected designs and the corresponding BUGS code, a simulated dataset is provided to demonstrate the utility and advantages of the Bayesian approach. This article is intended to serve as a tutorial reference for applied researchers and methodologists conducting mG-theory studies.
Bayesian analysis of rare events
DOE Office of Scientific and Technical Information (OSTI.GOV)
Straub, Daniel, E-mail: straub@tum.de; Papaioannou, Iason; Betz, Wolfgang
2016-06-01
In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into themore » probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.« less
Bayesian estimation inherent in a Mexican-hat-type neural network
NASA Astrophysics Data System (ADS)
Takiyama, Ken
2016-05-01
Brain functions, such as perception, motor control and learning, and decision making, have been explained based on a Bayesian framework, i.e., to decrease the effects of noise inherent in the human nervous system or external environment, our brain integrates sensory and a priori information in a Bayesian optimal manner. However, it remains unclear how Bayesian computations are implemented in the brain. Herein, I address this issue by analyzing a Mexican-hat-type neural network, which was used as a model of the visual cortex, motor cortex, and prefrontal cortex. I analytically demonstrate that the dynamics of an order parameter in the model corresponds exactly to a variational inference of a linear Gaussian state-space model, a Bayesian estimation, when the strength of recurrent synaptic connectivity is appropriately stronger than that of an external stimulus, a plausible condition in the brain. This exact correspondence can reveal the relationship between the parameters in the Bayesian estimation and those in the neural network, providing insight for understanding brain functions.
NASA Astrophysics Data System (ADS)
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert elicitation methodology is developed and applied to the real-world test case in order to provide a road map for the use of fuzzy Bayesian inference in groundwater modeling applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan
In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic amplitude versus angle (AVA) and controlled source electromagnetic (CSEM) data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo (MCMC) sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis (DREAM) and Adaptive Metropolis (AM) samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and CSEM data. The multi-chain MCMC is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration,more » the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic AVA and CSEM joint inversion provides better estimation of reservoir saturations than the seismic AVA-only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated – reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.« less
EEG-fMRI Bayesian framework for neural activity estimation: a simulation study
NASA Astrophysics Data System (ADS)
Croce, Pierpaolo; Basti, Alessio; Marzetti, Laura; Zappasodi, Filippo; Del Gratta, Cosimo
2016-12-01
Objective. Due to the complementary nature of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), and given the possibility of simultaneous acquisition, the joint data analysis can afford a better understanding of the underlying neural activity estimation. In this simulation study we want to show the benefit of the joint EEG-fMRI neural activity estimation in a Bayesian framework. Approach. We built a dynamic Bayesian framework in order to perform joint EEG-fMRI neural activity time course estimation. The neural activity is originated by a given brain area and detected by means of both measurement techniques. We have chosen a resting state neural activity situation to address the worst case in terms of the signal-to-noise ratio. To infer information by EEG and fMRI concurrently we used a tool belonging to the sequential Monte Carlo (SMC) methods: the particle filter (PF). Main results. First, despite a high computational cost, we showed the feasibility of such an approach. Second, we obtained an improvement in neural activity reconstruction when using both EEG and fMRI measurements. Significance. The proposed simulation shows the improvements in neural activity reconstruction with EEG-fMRI simultaneous data. The application of such an approach to real data allows a better comprehension of the neural dynamics.
NASA Astrophysics Data System (ADS)
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Bao, Jie; Swiler, Laura
2017-12-01
In this study we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach is used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated - reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.
EEG-fMRI Bayesian framework for neural activity estimation: a simulation study.
Croce, Pierpaolo; Basti, Alessio; Marzetti, Laura; Zappasodi, Filippo; Gratta, Cosimo Del
2016-12-01
Due to the complementary nature of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), and given the possibility of simultaneous acquisition, the joint data analysis can afford a better understanding of the underlying neural activity estimation. In this simulation study we want to show the benefit of the joint EEG-fMRI neural activity estimation in a Bayesian framework. We built a dynamic Bayesian framework in order to perform joint EEG-fMRI neural activity time course estimation. The neural activity is originated by a given brain area and detected by means of both measurement techniques. We have chosen a resting state neural activity situation to address the worst case in terms of the signal-to-noise ratio. To infer information by EEG and fMRI concurrently we used a tool belonging to the sequential Monte Carlo (SMC) methods: the particle filter (PF). First, despite a high computational cost, we showed the feasibility of such an approach. Second, we obtained an improvement in neural activity reconstruction when using both EEG and fMRI measurements. The proposed simulation shows the improvements in neural activity reconstruction with EEG-fMRI simultaneous data. The application of such an approach to real data allows a better comprehension of the neural dynamics.
Feng, Dai; Baumgartner, Richard; Svetnik, Vladimir
2018-04-05
The concordance correlation coefficient (CCC) is a widely used scaled index in the study of agreement. In this article, we propose estimating the CCC by a unified Bayesian framework that can (1) accommodate symmetric or asymmetric and light- or heavy-tailed data; (2) select model from several candidates; and (3) address other issues frequently encountered in practice such as confounding covariates and missing data. The performance of the proposal was studied and demonstrated using simulated as well as real-life biomarker data from a clinical study of an insomnia drug. The implementation of the proposal is accessible through a package in the Comprehensive R Archive Network.
NASA Astrophysics Data System (ADS)
Kim, Seongryong; Tkalčić, Hrvoje; Mustać, Marija; Rhie, Junkee; Ford, Sean
2016-04-01
A framework is presented within which we provide rigorous estimations for seismic sources and structures in the Northeast Asia. We use Bayesian inversion methods, which enable statistical estimations of models and their uncertainties based on data information. Ambiguities in error statistics and model parameterizations are addressed by hierarchical and trans-dimensional (trans-D) techniques, which can be inherently implemented in the Bayesian inversions. Hence reliable estimation of model parameters and their uncertainties is possible, thus avoiding arbitrary regularizations and parameterizations. Hierarchical and trans-D inversions are performed to develop a three-dimensional velocity model using ambient noise data. To further improve the model, we perform joint inversions with receiver function data using a newly developed Bayesian method. For the source estimation, a novel moment tensor inversion method is presented and applied to regional waveform data of the North Korean nuclear explosion tests. By the combination of new Bayesian techniques and the structural model, coupled with meaningful uncertainties related to each of the processes, more quantitative monitoring and discrimination of seismic events is possible.
Bayesian sparse channel estimation
NASA Astrophysics Data System (ADS)
Chen, Chulong; Zoltowski, Michael D.
2012-05-01
In Orthogonal Frequency Division Multiplexing (OFDM) systems, the technique used to estimate and track the time-varying multipath channel is critical to ensure reliable, high data rate communications. It is recognized that wireless channels often exhibit a sparse structure, especially for wideband and ultra-wideband systems. In order to exploit this sparse structure to reduce the number of pilot tones and increase the channel estimation quality, the application of compressed sensing to channel estimation is proposed. In this article, to make the compressed channel estimation more feasible for practical applications, it is investigated from a perspective of Bayesian learning. Under the Bayesian learning framework, the large-scale compressed sensing problem, as well as large time delay for the estimation of the doubly selective channel over multiple consecutive OFDM symbols, can be avoided. Simulation studies show a significant improvement in channel estimation MSE and less computing time compared to the conventional compressed channel estimation techniques.
Sequential Inverse Problems Bayesian Principles and the Logistic Map Example
NASA Astrophysics Data System (ADS)
Duan, Lian; Farmer, Chris L.; Moroz, Irene M.
2010-09-01
Bayesian statistics provides a general framework for solving inverse problems, but is not without interpretation and implementation problems. This paper discusses difficulties arising from the fact that forward models are always in error to some extent. Using a simple example based on the one-dimensional logistic map, we argue that, when implementation problems are minimal, the Bayesian framework is quite adequate. In this paper the Bayesian Filter is shown to be able to recover excellent state estimates in the perfect model scenario (PMS) and to distinguish the PMS from the imperfect model scenario (IMS). Through a quantitative comparison of the way in which the observations are assimilated in both the PMS and the IMS scenarios, we suggest that one can, sometimes, measure the degree of imperfection.
A Bayesian Approach for Image Segmentation with Shape Priors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Hang; Yang, Qing; Parvin, Bahram
2008-06-20
Color and texture have been widely used in image segmentation; however, their performance is often hindered by scene ambiguities, overlapping objects, or missingparts. In this paper, we propose an interactive image segmentation approach with shape prior models within a Bayesian framework. Interactive features, through mouse strokes, reduce ambiguities, and the incorporation of shape priors enhances quality of the segmentation where color and/or texture are not solely adequate. The novelties of our approach are in (i) formulating the segmentation problem in a well-de?ned Bayesian framework with multiple shape priors, (ii) ef?ciently estimating parameters of the Bayesian model, and (iii) multi-object segmentationmore » through user-speci?ed priors. We demonstrate the effectiveness of our method on a set of natural and synthetic images.« less
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B; Neyer, Franz J; van Aken, Marcel AG
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are introduced using a simplified example. Thereafter, the advantages and pitfalls of the specification of prior knowledge are discussed. To illustrate Bayesian methods explained in this study, in a second example a series of studies that examine the theoretical framework of dynamic interactionism are considered. In the Discussion the advantages and disadvantages of using Bayesian statistics are reviewed, and guidelines on how to report on Bayesian statistics are provided. PMID:24116396
Tree Biomass Estimation of Chinese fir (Cunninghamia lanceolata) Based on Bayesian Method
Zhang, Jianguo
2013-01-01
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass. PMID:24278198
Tree biomass estimation of Chinese fir (Cunninghamia lanceolata) based on Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo
2013-01-01
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production with huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In the study, allometric equation W = a(D2H)b was used to analyze tree biomass of Chinese fir. The common methods for estimating allometric model have taken the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability in Chinese fir biomass model, suggesting that parameters of biomass model are better represented by probability distributions rather than fixed values as classical method. To deal with the problem, Bayesian method was used for estimating Chinese fir biomass model. In the Bayesian framework, two priors were introduced: non-informative priors and informative priors. For informative priors, 32 biomass equations of Chinese fir were collected from published literature in the paper. The parameter distributions from published literature were regarded as prior distributions in Bayesian model for estimating Chinese fir biomass. Therefore, the Bayesian method with informative priors was better than non-informative priors and classical method, which provides a reasonable method for estimating Chinese fir biomass.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hou, Z; Terry, N; Hubbard, S S
2013-02-12
In this study, we evaluate the possibility of monitoring soil moisture variation using tomographic ground penetrating radar travel time data through Bayesian inversion, which is integrated with entropy memory function and pilot point concepts, as well as efficient sampling approaches. It is critical to accurately estimate soil moisture content and variations in vadose zone studies. Many studies have illustrated the promise and value of GPR tomographic data for estimating soil moisture and associated changes, however, challenges still exist in the inversion of GPR tomographic data in a manner that quantifies input and predictive uncertainty, incorporates multiple data types, handles non-uniquenessmore » and nonlinearity, and honors time-lapse tomograms collected in a series. To address these challenges, we develop a minimum relative entropy (MRE)-Bayesian based inverse modeling framework that non-subjectively defines prior probabilities, incorporates information from multiple sources, and quantifies uncertainty. The framework enables us to estimate dielectric permittivity at pilot point locations distributed within the tomogram, as well as the spatial correlation range. In the inversion framework, MRE is first used to derive prior probability distribution functions (pdfs) of dielectric permittivity based on prior information obtained from a straight-ray GPR inversion. The probability distributions are then sampled using a Quasi-Monte Carlo (QMC) approach, and the sample sets provide inputs to a sequential Gaussian simulation (SGSim) algorithm that constructs a highly resolved permittivity/velocity field for evaluation with a curved-ray GPR forward model. The likelihood functions are computed as a function of misfits, and posterior pdfs are constructed using a Gaussian kernel. Inversion of subsequent time-lapse datasets combines the Bayesian estimates from the previous inversion (as a memory function) with new data. The memory function and pilot point design takes advantage of the spatial-temporal correlation of the state variables. We first apply the inversion framework to a static synthetic example and then to a time-lapse GPR tomographic dataset collected during a dynamic experiment conducted at the Hanford Site in Richland, WA. We demonstrate that the MRE-Bayesian inversion enables us to merge various data types, quantify uncertainty, evaluate nonlinear models, and produce more detailed and better resolved estimates than straight-ray based inversion; therefore, it has the potential to improve estimates of inter-wellbore dielectric permittivity and soil moisture content and to monitor their temporal dynamics more accurately.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hou, Zhangshuan; Terry, Neil C.; Hubbard, Susan S.
2013-02-22
In this study, we evaluate the possibility of monitoring soil moisture variation using tomographic ground penetrating radar travel time data through Bayesian inversion, which is integrated with entropy memory function and pilot point concepts, as well as efficient sampling approaches. It is critical to accurately estimate soil moisture content and variations in vadose zone studies. Many studies have illustrated the promise and value of GPR tomographic data for estimating soil moisture and associated changes, however, challenges still exist in the inversion of GPR tomographic data in a manner that quantifies input and predictive uncertainty, incorporates multiple data types, handles non-uniquenessmore » and nonlinearity, and honors time-lapse tomograms collected in a series. To address these challenges, we develop a minimum relative entropy (MRE)-Bayesian based inverse modeling framework that non-subjectively defines prior probabilities, incorporates information from multiple sources, and quantifies uncertainty. The framework enables us to estimate dielectric permittivity at pilot point locations distributed within the tomogram, as well as the spatial correlation range. In the inversion framework, MRE is first used to derive prior probability density functions (pdfs) of dielectric permittivity based on prior information obtained from a straight-ray GPR inversion. The probability distributions are then sampled using a Quasi-Monte Carlo (QMC) approach, and the sample sets provide inputs to a sequential Gaussian simulation (SGSIM) algorithm that constructs a highly resolved permittivity/velocity field for evaluation with a curved-ray GPR forward model. The likelihood functions are computed as a function of misfits, and posterior pdfs are constructed using a Gaussian kernel. Inversion of subsequent time-lapse datasets combines the Bayesian estimates from the previous inversion (as a memory function) with new data. The memory function and pilot point design takes advantage of the spatial-temporal correlation of the state variables. We first apply the inversion framework to a static synthetic example and then to a time-lapse GPR tomographic dataset collected during a dynamic experiment conducted at the Hanford Site in Richland, WA. We demonstrate that the MRE-Bayesian inversion enables us to merge various data types, quantify uncertainty, evaluate nonlinear models, and produce more detailed and better resolved estimates than straight-ray based inversion; therefore, it has the potential to improve estimates of inter-wellbore dielectric permittivity and soil moisture content and to monitor their temporal dynamics more accurately.« less
ERIC Educational Resources Information Center
Abayomi, Kobi; Pizarro, Gonzalo
2013-01-01
We offer a straightforward framework for measurement of progress, across many dimensions, using cross-national social indices, which we classify as linear combinations of multivariate country level data onto a univariate score. We suggest a Bayesian approach which yields probabilistic (confidence type) intervals for the point estimates of country…
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-08-27
A Bayesian statistical framework is presented for Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-square based estimation technique which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussain posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the fluttermore » speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, it will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.« less
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe
2017-09-01
We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in a dictionary form using a non-monoexponential decay model of diffusion, based on continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientations improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan; ...
2017-10-17
In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach ismore » used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated — reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, Huiying; Ray, Jaideep; Hou, Zhangshuan
In this paper we developed an efficient Bayesian inversion framework for interpreting marine seismic Amplitude Versus Angle and Controlled-Source Electromagnetic data for marine reservoir characterization. The framework uses a multi-chain Markov-chain Monte Carlo sampler, which is a hybrid of DiffeRential Evolution Adaptive Metropolis and Adaptive Metropolis samplers. The inversion framework is tested by estimating reservoir-fluid saturations and porosity based on marine seismic and Controlled-Source Electromagnetic data. The multi-chain Markov-chain Monte Carlo is scalable in terms of the number of chains, and is useful for computationally demanding Bayesian model calibration in scientific and engineering problems. As a demonstration, the approach ismore » used to efficiently and accurately estimate the porosity and saturations in a representative layered synthetic reservoir. The results indicate that the seismic Amplitude Versus Angle and Controlled-Source Electromagnetic joint inversion provides better estimation of reservoir saturations than the seismic Amplitude Versus Angle only inversion, especially for the parameters in deep layers. The performance of the inversion approach for various levels of noise in observational data was evaluated — reasonable estimates can be obtained with noise levels up to 25%. Sampling efficiency due to the use of multiple chains was also checked and was found to have almost linear scalability.« less
Uncertainty Management for Diagnostics and Prognostics of Batteries using Bayesian Techniques
NASA Technical Reports Server (NTRS)
Saha, Bhaskar; Goebel, kai
2007-01-01
Uncertainty management has always been the key hurdle faced by diagnostics and prognostics algorithms. A Bayesian treatment of this problem provides an elegant and theoretically sound approach to the modern Condition- Based Maintenance (CBM)/Prognostic Health Management (PHM) paradigm. The application of the Bayesian techniques to regression and classification in the form of Relevance Vector Machine (RVM), and to state estimation as in Particle Filters (PF), provides a powerful tool to integrate the diagnosis and prognosis of battery health. The RVM, which is a Bayesian treatment of the Support Vector Machine (SVM), is used for model identification, while the PF framework uses the learnt model, statistical estimates of noise and anticipated operational conditions to provide estimates of remaining useful life (RUL) in the form of a probability density function (PDF). This type of prognostics generates a significant value addition to the management of any operation involving electrical systems.
Bayesian networks for evaluation of evidence from forensic entomology.
Andersson, M Gunnar; Sundström, Anders; Lindström, Anders
2013-09-01
In the aftermath of a CBRN incident, there is an urgent need to reconstruct events in order to bring the perpetrators to court and to take preventive actions for the future. The challenge is to discriminate, based on available information, between alternative scenarios. Forensic interpretation is used to evaluate to what extent results from the forensic investigation favor the prosecutors' or the defendants' arguments, using the framework of Bayesian hypothesis testing. Recently, several new scientific disciplines have been used in a forensic context. In the AniBioThreat project, the framework was applied to veterinary forensic pathology, tracing of pathogenic microorganisms, and forensic entomology. Forensic entomology is an important tool for estimating the postmortem interval in, for example, homicide investigations as a complement to more traditional methods. In this article we demonstrate the applicability of the Bayesian framework for evaluating entomological evidence in a forensic investigation through the analysis of a hypothetical scenario involving suspect movement of carcasses from a clandestine laboratory. Probabilities of different findings under the alternative hypotheses were estimated using a combination of statistical analysis of data, expert knowledge, and simulation, and entomological findings are used to update the beliefs about the prosecutors' and defendants' hypotheses and to calculate the value of evidence. The Bayesian framework proved useful for evaluating complex hypotheses using findings from several insect species, accounting for uncertainty about development rate, temperature, and precolonization. The applicability of the forensic statistic approach to evaluating forensic results from a CBRN incident is discussed.
Photo-z-SQL: Photometric redshift estimation framework
NASA Astrophysics Data System (ADS)
Beck, Róbert; Dobos, László; Budavári, Tamás; Szalay, Alexander S.; Csabai, István
2017-04-01
Photo-z-SQL is a flexible template-based photometric redshift estimation framework that can be seamlessly integrated into a SQL database (or DB) server and executed on demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and uses the computational capabilities of DB hardware. Photo-z-SQL performs both maximum likelihood and Bayesian estimation and handles inputs of variable photometric filter sets and corresponding broad-band magnitudes.
Heudtlass, Peter; Guha-Sapir, Debarati; Speybroeck, Niko
2018-05-31
The crude death rate (CDR) is one of the defining indicators of humanitarian emergencies. When data from vital registration systems are not available, it is common practice to estimate the CDR from household surveys with cluster-sampling design. However, sample sizes are often too small to compare mortality estimates to emergency thresholds, at least in a frequentist framework. Several authors have proposed Bayesian methods for health surveys in humanitarian crises. Here, we develop an approach specifically for mortality data and cluster-sampling surveys. We describe a Bayesian hierarchical Poisson-Gamma mixture model with generic (weakly informative) priors that could be used as default in absence of any specific prior knowledge, and compare Bayesian and frequentist CDR estimates using five different mortality datasets. We provide an interpretation of the Bayesian estimates in the context of an emergency threshold and demonstrate how to interpret parameters at the cluster level and ways in which informative priors can be introduced. With the same set of weakly informative priors, Bayesian CDR estimates are equivalent to frequentist estimates, for all practical purposes. The probability that the CDR surpasses the emergency threshold can be derived directly from the posterior of the mean of the mixing distribution. All observation in the datasets contribute to the estimation of cluster-level estimates, through the hierarchical structure of the model. In a context of sparse data, Bayesian mortality assessments have advantages over frequentist ones already when using only weakly informative priors. More informative priors offer a formal and transparent way of combining new data with existing data and expert knowledge and can help to improve decision-making in humanitarian crises by complementing frequentist estimates.
Estimation and Application of Ecological Memory Functions in Time and Space
NASA Astrophysics Data System (ADS)
Itter, M.; Finley, A. O.; Dawson, A.
2017-12-01
A common goal in quantitative ecology is the estimation or prediction of ecological processes as a function of explanatory variables (or covariates). Frequently, the ecological process of interest and associated covariates vary in time, space, or both. Theory indicates many ecological processes exhibit memory to local, past conditions. Despite such theoretical understanding, few methods exist to integrate observations from the recent past or within a local neighborhood as drivers of these processes. We build upon recent methodological advances in ecology and spatial statistics to develop a Bayesian hierarchical framework to estimate so-called ecological memory functions; that is, weight-generating functions that specify the relative importance of local, past covariate observations to ecological processes. Memory functions are estimated using a set of basis functions in time and/or space, allowing for flexible ecological memory based on a reduced set of parameters. Ecological memory functions are entirely data driven under the Bayesian hierarchical framework—no a priori assumptions are made regarding functional forms. Memory function uncertainty follows directly from posterior distributions for model parameters allowing for tractable propagation of error to predictions of ecological processes. We apply the model framework to simulated spatio-temporal datasets generated using memory functions of varying complexity. The framework is also applied to estimate the ecological memory of annual boreal forest growth to local, past water availability. Consistent with ecological understanding of boreal forest growth dynamics, memory to past water availability peaks in the year previous to growth and slowly decays to zero in five to eight years. The Bayesian hierarchical framework has applicability to a broad range of ecosystems and processes allowing for increased understanding of ecosystem responses to local and past conditions and improved prediction of ecological processes.
Bayesian population receptive field modelling.
Zeidman, Peter; Silson, Edward Harry; Schwarzkopf, Dietrich Samuel; Baker, Chris Ian; Penny, Will
2017-09-08
We introduce a probabilistic (Bayesian) framework and associated software toolbox for mapping population receptive fields (pRFs) based on fMRI data. This generic approach is intended to work with stimuli of any dimension and is demonstrated and validated in the context of 2D retinotopic mapping. The framework enables the experimenter to specify generative (encoding) models of fMRI timeseries, in which experimental stimuli enter a pRF model of neural activity, which in turns drives a nonlinear model of neurovascular coupling and Blood Oxygenation Level Dependent (BOLD) response. The neuronal and haemodynamic parameters are estimated together on a voxel-by-voxel or region-of-interest basis using a Bayesian estimation algorithm (variational Laplace). This offers several novel contributions to receptive field modelling. The variance/covariance of parameters are estimated, enabling receptive fields to be plotted while properly representing uncertainty about pRF size and location. Variability in the haemodynamic response across the brain is accounted for. Furthermore, the framework introduces formal hypothesis testing to pRF analysis, enabling competing models to be evaluated based on their log model evidence (approximated by the variational free energy), which represents the optimal tradeoff between accuracy and complexity. Using simulations and empirical data, we found that parameters typically used to represent pRF size and neuronal scaling are strongly correlated, which is taken into account by the Bayesian methods we describe when making inferences. We used the framework to compare the evidence for six variants of pRF model using 7 T functional MRI data and we found a circular Difference of Gaussians (DoG) model to be the best explanation for our data overall. We hope this framework will prove useful for mapping stimulus spaces with any number of dimensions onto the anatomy of the brain. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Bayesian estimation of dynamic matching function for U-V analysis in Japan
NASA Astrophysics Data System (ADS)
Kyo, Koki; Noda, Hideo; Kitagawa, Genshiro
2012-05-01
In this paper we propose a Bayesian method for analyzing unemployment dynamics. We derive a Beveridge curve for unemployment and vacancy (U-V) analysis from a Bayesian model based on a labor market matching function. In our framework, the efficiency of matching and the elasticities of new hiring with respect to unemployment and vacancy are regarded as time varying parameters. To construct a flexible model and obtain reasonable estimates in an underdetermined estimation problem, we treat the time varying parameters as random variables and introduce smoothness priors. The model is then described in a state space representation, enabling the parameter estimation to be carried out using Kalman filter and fixed interval smoothing. In such a representation, dynamic features of the cyclic unemployment rate and the structural-frictional unemployment rate can be accurately captured.
Boehm, Udo; Steingroever, Helen; Wagenmakers, Eric-Jan
2018-06-01
An important tool in the advancement of cognitive science are quantitative models that represent different cognitive variables in terms of model parameters. To evaluate such models, their parameters are typically tested for relationships with behavioral and physiological variables that are thought to reflect specific cognitive processes. However, many models do not come equipped with the statistical framework needed to relate model parameters to covariates. Instead, researchers often revert to classifying participants into groups depending on their values on the covariates, and subsequently comparing the estimated model parameters between these groups. Here we develop a comprehensive solution to the covariate problem in the form of a Bayesian regression framework. Our framework can be easily added to existing cognitive models and allows researchers to quantify the evidential support for relationships between covariates and model parameters using Bayes factors. Moreover, we present a simulation study that demonstrates the superiority of the Bayesian regression framework to the conventional classification-based approach.
Adkison, Milo D.; Peterman, R.M.
1996-01-01
Bayesian methods have been proposed to estimate optimal escapement goals, using both knowledge about physical determinants of salmon productivity and stock-recruitment data. The Bayesian approach has several advantages over many traditional methods for estimating stock productivity: it allows integration of information from diverse sources and provides a framework for decision-making that takes into account uncertainty reflected in the data. However, results can be critically dependent on details of implementation of this approach. For instance, unintended and unwarranted confidence about stock-recruitment relationships can arise if the range of relationships examined is too narrow, if too few discrete alternatives are considered, or if data are contradictory. This unfounded confidence can result in a suboptimal choice of a spawning escapement goal.
NASA Astrophysics Data System (ADS)
Mustac, M.; Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.; Ford, S. R.; Sebastian, N.
2015-12-01
Conventional approaches to inverse problems suffer from non-linearity and non-uniqueness in estimations of seismic structures and source properties. Estimated results and associated uncertainties are often biased by applied regularizations and additional constraints, which are commonly introduced to solve such problems. Bayesian methods, however, provide statistically meaningful estimations of models and their uncertainties constrained by data information. In addition, hierarchical and trans-dimensional (trans-D) techniques are inherently implemented in the Bayesian framework to account for involved error statistics and model parameterizations, and, in turn, allow more rigorous estimations of the same. Here, we apply Bayesian methods throughout the entire inference process to estimate seismic structures and source properties in Northeast Asia including east China, the Korean peninsula, and the Japanese islands. Ambient noise analysis is first performed to obtain a base three-dimensional (3-D) heterogeneity model using continuous broadband waveforms from more than 300 stations. As for the tomography of surface wave group and phase velocities in the 5-70 s band, we adopt a hierarchical and trans-D Bayesian inversion method using Voronoi partition. The 3-D heterogeneity model is further improved by joint inversions of teleseismic receiver functions and dispersion data using a newly developed high-efficiency Bayesian technique. The obtained model is subsequently used to prepare 3-D structural Green's functions for the source characterization. A hierarchical Bayesian method for point source inversion using regional complete waveform data is applied to selected events from the region. The seismic structure and source characteristics with rigorously estimated uncertainties from the novel Bayesian methods provide enhanced monitoring and discrimination of seismic events in northeast Asia.
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.
Houseman, E Andres; Virji, M Abbas
2017-08-01
Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates were significant in some frequentist models, but in the Bayesian model their credible intervals contained zero; such discrepancies were observed in multiple datasets. Variance components from the Bayesian model reflected substantial autocorrelation, consistent with the frequentist models, except for the auto-regressive moving average model. Plots of means from the Bayesian model showed good fit to the observed data. The proposed Bayesian model provides an approach for modeling non-stationary autocorrelation in a hierarchical modeling framework to estimate task means, standard deviations, quantiles, and parameter estimates for covariates that are less biased and have better performance characteristics than some of the contemporary methods. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
BAYESIAN ESTIMATION OF THERMONUCLEAR REACTION RATES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iliadis, C.; Anderson, K. S.; Coc, A.
The problem of estimating non-resonant astrophysical S -factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extrasolar planets, gravitational waves, and Type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We presentmore » astrophysical S -factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the reactions d(p, γ ){sup 3}He, {sup 3}He({sup 3}He,2p){sup 4}He, and {sup 3}He( α , γ ){sup 7}Be, important for deuterium burning, solar neutrinos, and Big Bang nucleosynthesis.« less
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose?response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose?response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparveriuss) exposed to various sublethal dietary concentrations of methylmercury.
Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models
NASA Astrophysics Data System (ADS)
Xia, Wei; Dai, Xiao-Xia; Feng, Yuan
2015-12-01
When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated via directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistence with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).
Bayesian characterization of uncertainty in species interaction strengths.
Wolf, Christopher; Novak, Mark; Gitelman, Alix I
2017-06-01
Considerable effort has been devoted to the estimation of species interaction strengths. This effort has focused primarily on statistical significance testing and obtaining point estimates of parameters that contribute to interaction strength magnitudes, leaving the characterization of uncertainty associated with those estimates unconsidered. We consider a means of characterizing the uncertainty of a generalist predator's interaction strengths by formulating an observational method for estimating a predator's prey-specific per capita attack rates as a Bayesian statistical model. This formulation permits the explicit incorporation of multiple sources of uncertainty. A key insight is the informative nature of several so-called non-informative priors that have been used in modeling the sparse data typical of predator feeding surveys. We introduce to ecology a new neutral prior and provide evidence for its superior performance. We use a case study to consider the attack rates in a New Zealand intertidal whelk predator, and we illustrate not only that Bayesian point estimates can be made to correspond with those obtained by frequentist approaches, but also that estimation uncertainty as described by 95% intervals is more useful and biologically realistic using the Bayesian method. In particular, unlike in bootstrap confidence intervals, the lower bounds of the Bayesian posterior intervals for attack rates do not include zero when a predator-prey interaction is in fact observed. We conclude that the Bayesian framework provides a straightforward, probabilistic characterization of interaction strength uncertainty, enabling future considerations of both the deterministic and stochastic drivers of interaction strength and their impact on food webs.
NASA Astrophysics Data System (ADS)
Lowman, L.; Barros, A. P.
2014-12-01
Computational modeling of surface erosion processes is inherently difficult because of the four-dimensional nature of the problem and the multiple temporal and spatial scales that govern individual mechanisms. Landscapes are modified via surface and fluvial erosion and exhumation, each of which takes place over a range of time scales. Traditional field measurements of erosion/exhumation rates are scale dependent, often valid for a single point-wise location or averaging over large aerial extents and periods with intense and mild erosion. We present a method of remotely estimating erosion rates using a Bayesian hierarchical model based upon the stream power erosion law (SPEL). A Bayesian approach allows for estimating erosion rates using the deterministic relationship given by the SPEL and data on channel slopes and precipitation at the basin and sub-basin scale. The spatial scale associated with this framework is the elevation class, where each class is characterized by distinct morphologic behavior observed through different modes in the distribution of basin outlet elevations. Interestingly, the distributions of first-order outlets are similar in shape and extent to the distribution of precipitation events (i.e. individual storms) over a 14-year period between 1998-2011. We demonstrate an application of the Bayesian hierarchical modeling framework for five basins and one intermontane basin located in the central Andes between 5S and 20S. Using remotely sensed data of current annual precipitation rates from the Tropical Rainfall Measuring Mission (TRMM) and topography from a high resolution (3 arc-seconds) digital elevation map (DEM), our erosion rate estimates are consistent with decadal-scale estimates based on landslide mapping and sediment flux observations and 1-2 orders of magnitude larger than most millennial and million year timescale estimates from thermochronology and cosmogenic nuclides.
Nariai, N; Kim, S; Imoto, S; Miyano, S
2004-01-01
We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.
Depaoli, Sarah
2013-06-01
Growth mixture modeling (GMM) represents a technique that is designed to capture change over time for unobserved subgroups (or latent classes) that exhibit qualitatively different patterns of growth. The aim of the current article was to explore the impact of latent class separation (i.e., how similar growth trajectories are across latent classes) on GMM performance. Several estimation conditions were compared: maximum likelihood via the expectation maximization (EM) algorithm and the Bayesian framework implementing diffuse priors, "accurate" informative priors, weakly informative priors, data-driven informative priors, priors reflecting partial-knowledge of parameters, and "inaccurate" (but informative) priors. The main goal was to provide insight about the optimal estimation condition under different degrees of latent class separation for GMM. Results indicated that optimal parameter recovery was obtained though the Bayesian approach using "accurate" informative priors, and partial-knowledge priors showed promise for the recovery of the growth trajectory parameters. Maximum likelihood and the remaining Bayesian estimation conditions yielded poor parameter recovery for the latent class proportions and the growth trajectories. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Gucciardi, Daniel F; Zhang, Chun-Qing; Ponnusamy, Vellapandian; Si, Gangyan; Stenling, Andreas
2016-04-01
The aims of this study were to assess the cross-cultural invariance of athletes' self-reports of mental toughness and to introduce and illustrate the application of approximate measurement invariance using Bayesian estimation for sport and exercise psychology scholars. Athletes from Australia (n = 353, Mage = 19.13, SD = 3.27, men = 161), China (n = 254, Mage = 17.82, SD = 2.28, men = 138), and Malaysia (n = 341, Mage = 19.13, SD = 3.27, men = 200) provided a cross-sectional snapshot of their mental toughness. The cross-cultural invariance of the mental toughness inventory in terms of (a) the factor structure (configural invariance), (b) factor loadings (metric invariance), and (c) item intercepts (scalar invariance) was tested using an approximate measurement framework with Bayesian estimation. Results indicated that approximate metric and scalar invariance was established. From a methodological standpoint, this study demonstrated the usefulness and flexibility of Bayesian estimation for single-sample and multigroup analyses of measurement instruments. Substantively, the current findings suggest that the measurement of mental toughness requires cultural adjustments to better capture the contextually salient (emic) aspects of this concept.
ERIC Educational Resources Information Center
Lee, Soo; Suh, Youngsuk
2018-01-01
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
A Bayesian framework for infrasound location
NASA Astrophysics Data System (ADS)
Modrak, Ryan T.; Arrowsmith, Stephen J.; Anderson, Dale N.
2010-04-01
We develop a framework for location of infrasound events using backazimuth and infrasonic arrival times from multiple arrays. Bayesian infrasonic source location (BISL) developed here estimates event location and associated credibility regions. BISL accounts for unknown source-to-array path or phase by formulating infrasonic group velocity as random. Differences between observed and predicted source-to-array traveltimes are partitioned into two additive Gaussian sources, measurement error and model error, the second of which accounts for the unknown influence of wind and temperature on path. By applying the technique to both synthetic tests and ground-truth events, we highlight the complementary nature of back azimuths and arrival times for estimating well-constrained event locations. BISL is an extension to methods developed earlier by Arrowsmith et al. that provided simple bounds on location using a grid-search technique.
Bayesian Methods for Effective Field Theories
NASA Astrophysics Data System (ADS)
Wesolowski, Sarah
Microscopic predictions of the properties of atomic nuclei have reached a high level of precision in the past decade. This progress mandates improved uncertainty quantification (UQ) for a robust comparison of experiment with theory. With the uncertainty from many-body methods under control, calculations are now sensitive to the input inter-nucleon interactions. These interactions include parameters that must be fit to experiment, inducing both uncertainty from the fit and from missing physics in the operator structure of the Hamiltonian. Furthermore, the implementation of the inter-nucleon interactions is not unique, which presents the additional problem of assessing results using different interactions. Effective field theories (EFTs) take advantage of a separation of high- and low-energy scales in the problem to form a power-counting scheme that allows the organization of terms in the Hamiltonian based on their expected contribution to observable predictions. This scheme gives a natural framework for quantification of uncertainty due to missing physics. The free parameters of the EFT, called the low-energy constants (LECs), must be fit to data, but in a properly constructed EFT these constants will be natural-sized, i.e., of order unity. The constraints provided by the EFT, namely the size of the systematic uncertainty from truncation of the theory and the natural size of the LECs, are assumed information even before a calculation is performed or a fit is done. Bayesian statistical methods provide a framework for treating uncertainties that naturally incorporates prior information as well as putting stochastic and systematic uncertainties on an equal footing. For EFT UQ Bayesian methods allow the relevant EFT properties to be incorporated quantitatively as prior probability distribution functions (pdfs). Following the logic of probability theory, observable quantities and underlying physical parameters such as the EFT breakdown scale may be expressed as pdfs that incorporate the prior pdfs. Problems of model selection, such as distinguishing between competing EFT implementations, are also natural in a Bayesian framework. In this thesis we focus on two complementary topics for EFT UQ using Bayesian methods--quantifying EFT truncation uncertainty and parameter estimation for LECs. Using the order-by-order calculations and underlying EFT constraints as prior information, we show how to estimate EFT truncation uncertainties. We then apply the result to calculating truncation uncertainties on predictions of nucleon-nucleon scattering in chiral effective field theory. We apply model-checking diagnostics to our calculations to ensure that the statistical model of truncation uncertainty produces consistent results. A framework for EFT parameter estimation based on EFT convergence properties and naturalness is developed which includes a series of diagnostics to ensure the extraction of the maximum amount of available information from data to estimate LECs with minimal bias. We develop this framework using model EFTs and apply it to the problem of extrapolating lattice quantum chromodynamics results for the nucleon mass. We then apply aspects of the parameter estimation framework to perform case studies in chiral EFT parameter estimation, investigating a possible operator redundancy at fourth order in the chiral expansion and the appropriate inclusion of truncation uncertainty in estimating LECs.
Vrancken, Bram; Lemey, Philippe; Rambaut, Andrew; Bedford, Trevor; Longdon, Ben; Günthard, Huldrych F.; Suchard, Marc A.
2014-01-01
Phylogenetic signal quantifies the degree to which resemblance in continuously-valued traits reflects phylogenetic relatedness. Measures of phylogenetic signal are widely used in ecological and evolutionary research, and are recently gaining traction in viral evolutionary studies. Standard estimators of phylogenetic signal frequently condition on data summary statistics of the repeated trait observations and fixed phylogenetics trees, resulting in information loss and potential bias. To incorporate the observation process and phylogenetic uncertainty in a model-based approach, we develop a novel Bayesian inference method to simultaneously estimate the evolutionary history and phylogenetic signal from molecular sequence data and repeated multivariate traits. Our approach builds upon a phylogenetic diffusion framework that model continuous trait evolution as a Brownian motion process and incorporates Pagel’s λ transformation parameter to estimate dependence among traits. We provide a computationally efficient inference implementation in the BEAST software package. We evaluate the synthetic performance of the Bayesian estimator of phylogenetic signal against standard estimators, and demonstrate the use of our coherent framework to address several virus-host evolutionary questions, including virulence heritability for HIV, antigenic evolution in influenza and HIV, and Drosophila sensitivity to sigma virus infection. Finally, we discuss model extensions that will make useful contributions to our flexible framework for simultaneously studying sequence and trait evolution. PMID:25780554
Comparability and Reliability Considerations of Adequate Yearly Progress
ERIC Educational Resources Information Center
Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young
2012-01-01
The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…
A fuzzy Bayesian approach to flood frequency estimation with imprecise historical information
Kiss, Andrea; Viglione, Alberto; Viertl, Reinhard; Blöschl, Günter
2016-01-01
Abstract This paper presents a novel framework that links imprecision (through a fuzzy approach) and stochastic uncertainty (through a Bayesian approach) in estimating flood probabilities from historical flood information and systematic flood discharge data. The method exploits the linguistic characteristics of historical source material to construct membership functions, which may be wider or narrower, depending on the vagueness of the statements. The membership functions are either included in the prior distribution or the likelihood function to obtain a fuzzy version of the flood frequency curve. The viability of the approach is demonstrated by three case studies that differ in terms of their hydromorphological conditions (from an Alpine river with bedrock profile to a flat lowland river with extensive flood plains) and historical source material (including narratives, town and county meeting protocols, flood marks and damage accounts). The case studies are presented in order of increasing fuzziness (the Rhine at Basel, Switzerland; the Werra at Meiningen, Germany; and the Tisza at Szeged, Hungary). Incorporating imprecise historical information is found to reduce the range between the 5% and 95% Bayesian credibility bounds of the 100 year floods by 45% and 61% for the Rhine and Werra case studies, respectively. The strengths and limitations of the framework are discussed relative to alternative (non‐fuzzy) methods. The fuzzy Bayesian inference framework provides a flexible methodology that fits the imprecise nature of linguistic information on historical floods as available in historical written documentation. PMID:27840456
A fuzzy Bayesian approach to flood frequency estimation with imprecise historical information
NASA Astrophysics Data System (ADS)
Salinas, José Luis; Kiss, Andrea; Viglione, Alberto; Viertl, Reinhard; Blöschl, Günter
2016-09-01
This paper presents a novel framework that links imprecision (through a fuzzy approach) and stochastic uncertainty (through a Bayesian approach) in estimating flood probabilities from historical flood information and systematic flood discharge data. The method exploits the linguistic characteristics of historical source material to construct membership functions, which may be wider or narrower, depending on the vagueness of the statements. The membership functions are either included in the prior distribution or the likelihood function to obtain a fuzzy version of the flood frequency curve. The viability of the approach is demonstrated by three case studies that differ in terms of their hydromorphological conditions (from an Alpine river with bedrock profile to a flat lowland river with extensive flood plains) and historical source material (including narratives, town and county meeting protocols, flood marks and damage accounts). The case studies are presented in order of increasing fuzziness (the Rhine at Basel, Switzerland; the Werra at Meiningen, Germany; and the Tisza at Szeged, Hungary). Incorporating imprecise historical information is found to reduce the range between the 5% and 95% Bayesian credibility bounds of the 100 year floods by 45% and 61% for the Rhine and Werra case studies, respectively. The strengths and limitations of the framework are discussed relative to alternative (non-fuzzy) methods. The fuzzy Bayesian inference framework provides a flexible methodology that fits the imprecise nature of linguistic information on historical floods as available in historical written documentation.
A fuzzy Bayesian approach to flood frequency estimation with imprecise historical information.
Salinas, José Luis; Kiss, Andrea; Viglione, Alberto; Viertl, Reinhard; Blöschl, Günter
2016-09-01
This paper presents a novel framework that links imprecision (through a fuzzy approach) and stochastic uncertainty (through a Bayesian approach) in estimating flood probabilities from historical flood information and systematic flood discharge data. The method exploits the linguistic characteristics of historical source material to construct membership functions, which may be wider or narrower, depending on the vagueness of the statements. The membership functions are either included in the prior distribution or the likelihood function to obtain a fuzzy version of the flood frequency curve. The viability of the approach is demonstrated by three case studies that differ in terms of their hydromorphological conditions (from an Alpine river with bedrock profile to a flat lowland river with extensive flood plains) and historical source material (including narratives, town and county meeting protocols, flood marks and damage accounts). The case studies are presented in order of increasing fuzziness (the Rhine at Basel, Switzerland; the Werra at Meiningen, Germany; and the Tisza at Szeged, Hungary). Incorporating imprecise historical information is found to reduce the range between the 5% and 95% Bayesian credibility bounds of the 100 year floods by 45% and 61% for the Rhine and Werra case studies, respectively. The strengths and limitations of the framework are discussed relative to alternative (non-fuzzy) methods. The fuzzy Bayesian inference framework provides a flexible methodology that fits the imprecise nature of linguistic information on historical floods as available in historical written documentation.
Bayesian Hierarchical Grouping: perceptual grouping as mixture estimation
Froyen, Vicky; Feldman, Jacob; Singh, Manish
2015-01-01
We propose a novel framework for perceptual grouping based on the idea of mixture models, called Bayesian Hierarchical Grouping (BHG). In BHG we assume that the configuration of image elements is generated by a mixture of distinct objects, each of which generates image elements according to some generative assumptions. Grouping, in this framework, means estimating the number and the parameters of the mixture components that generated the image, including estimating which image elements are “owned” by which objects. We present a tractable implementation of the framework, based on the hierarchical clustering approach of Heller and Ghahramani (2005). We illustrate it with examples drawn from a number of classical perceptual grouping problems, including dot clustering, contour integration, and part decomposition. Our approach yields an intuitive hierarchical representation of image elements, giving an explicit decomposition of the image into mixture components, along with estimates of the probability of various candidate decompositions. We show that BHG accounts well for a diverse range of empirical data drawn from the literature. Because BHG provides a principled quantification of the plausibility of grouping interpretations over a wide range of grouping problems, we argue that it provides an appealing unifying account of the elusive Gestalt notion of Prägnanz. PMID:26322548
A Bayesian Machine Learning Model for Estimating Building Occupancy from Open Source Data
Stewart, Robert N.; Urban, Marie L.; Duchscherer, Samantha E.; ...
2016-01-01
Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the artmore » by introducing the Population Data Tables (PDT), a Bayesian based informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000ft2 for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the Population Data Tables (PDT), a Bayesian model and informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000 ft 2 for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.« less
Robust Tracking of Small Displacements with a Bayesian Estimator
Dumont, Douglas M.; Byram, Brett C.
2016-01-01
Radiation-force-based elasticity imaging describes a group of techniques that use acoustic radiation force (ARF) to displace tissue in order to obtain qualitative or quantitative measurements of tissue properties. Because ARF-induced displacements are on the order of micrometers, tracking these displacements in vivo can be challenging. Previously, it has been shown that Bayesian-based estimation can overcome some of the limitations of a traditional displacement estimator like normalized cross-correlation (NCC). In this work, we describe a Bayesian framework that combines a generalized Gaussian-Markov random field (GGMRF) prior with an automated method for selecting the prior’s width. We then evaluate its performance in the context of tracking the micrometer-order displacements encountered in an ARF-based method like acoustic radiation force impulse (ARFI) imaging. The results show that bias, variance, and mean-square error performance vary with prior shape and width, and that an almost one order-of-magnitude reduction in mean-square error can be achieved by the estimator at the automatically-selected prior width. Lesion simulations show that the proposed estimator has a higher contrast-to-noise ratio but lower contrast than NCC, median-filtered NCC, and the previous Bayesian estimator, with a non-Gaussian prior shape having better lesion-edge resolution than a Gaussian prior. In vivo results from a cardiac, radiofrequency ablation ARFI imaging dataset show quantitative improvements in lesion contrast-to-noise ratio over NCC as well as the previous Bayesian estimator. PMID:26529761
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
An introduction to using Bayesian linear regression with clinical data.
Baldwin, Scott A; Larson, Michael J
2017-11-01
Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Fienen, Michael N.; D'Oria, Marco; Doherty, John E.; Hunt, Randall J.
2013-01-01
The application bgaPEST is a highly parameterized inversion software package implementing the Bayesian Geostatistical Approach in a framework compatible with the parameter estimation suite PEST. Highly parameterized inversion refers to cases in which parameters are distributed in space or time and are correlated with one another. The Bayesian aspect of bgaPEST is related to Bayesian probability theory in which prior information about parameters is formally revised on the basis of the calibration dataset used for the inversion. Conceptually, this approach formalizes the conditionality of estimated parameters on the specific data and model available. The geostatistical component of the method refers to the way in which prior information about the parameters is used. A geostatistical autocorrelation function is used to enforce structure on the parameters to avoid overfitting and unrealistic results. Bayesian Geostatistical Approach is designed to provide the smoothest solution that is consistent with the data. Optionally, users can specify a level of fit or estimate a balance between fit and model complexity informed by the data. Groundwater and surface-water applications are used as examples in this text, but the possible uses of bgaPEST extend to any distributed parameter applications.
NASA Astrophysics Data System (ADS)
Bowman, C.; Gibson, K. J.; La Haye, R. J.; Groebner, R. J.; Taylor, N. Z.; Grierson, B. A.
2014-10-01
A Bayesian inference framework has been developed for the DIII-D charge-exchange recombination (CER) system, capable of computing probability distribution functions (PDFs) for desired parameters. CER is a key diagnostic system at DIII-D, measuring important physics parameters such as plasma rotation and impurity ion temperature. This work is motivated by a case in which the CER system was used to probe the plasma rotation radial profile around an m/n = 2/1 tearing mode island rotating at ~ 1 kHz. Due to limited resolution in the tearing mode phase and short integration time, it has proven challenging to observe the structure of the rotation profile across the island. We seek to solve this problem by using the Bayesian framework to improve the estimation accuracy of the plasma rotation, helping to reveal details of how it is perturbed in the magnetic island vicinity. Examples of the PDFs obtained through the Bayesian framework will be presented, and compared with results from a conventional least-squares analysis of the CER data. Work supported by the US DOE under DE-FC02-04ER54698 and DE-AC02-09CH11466.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments
NASA Technical Reports Server (NTRS)
Evans, John W.; Gallo, Luis; Kaminsky, Mark
2012-01-01
Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a two stage sequential Bayesian framework for reliability estimation of spacecraft deployment was developed for this purpose. This process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history, "heritage information", were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems uSing Monte Carlo Markov Chain sampling. Select distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences on successive ground test deployments of scale model test articles of JWST hardware, to update the NASA heritage data. This allowed for a realistic prediction for the reliability of the complex Sunshield deployment, with credibility limits, within this two stage Bayesian framework.
Bayesian parameter estimation for nonlinear modelling of biological pathways.
Ghasemi, Omid; Lindsey, Merry L; Yang, Tianyi; Nguyen, Nguyen; Huang, Yufei; Jin, Yu-Fang
2011-01-01
The availability of temporal measurements on biological experiments has significantly promoted research areas in systems biology. To gain insight into the interaction and regulation of biological systems, mathematical frameworks such as ordinary differential equations have been widely applied to model biological pathways and interpret the temporal data. Hill equations are the preferred formats to represent the reaction rate in differential equation frameworks, due to their simple structures and their capabilities for easy fitting to saturated experimental measurements. However, Hill equations are highly nonlinearly parameterized functions, and parameters in these functions cannot be measured easily. Additionally, because of its high nonlinearity, adaptive parameter estimation algorithms developed for linear parameterized differential equations cannot be applied. Therefore, parameter estimation in nonlinearly parameterized differential equation models for biological pathways is both challenging and rewarding. In this study, we propose a Bayesian parameter estimation algorithm to estimate parameters in nonlinear mathematical models for biological pathways using time series data. We used the Runge-Kutta method to transform differential equations to difference equations assuming a known structure of the differential equations. This transformation allowed us to generate predictions dependent on previous states and to apply a Bayesian approach, namely, the Markov chain Monte Carlo (MCMC) method. We applied this approach to the biological pathways involved in the left ventricle (LV) response to myocardial infarction (MI) and verified our algorithm by estimating two parameters in a Hill equation embedded in the nonlinear model. We further evaluated our estimation performance with different parameter settings and signal to noise ratios. Our results demonstrated the effectiveness of the algorithm for both linearly and nonlinearly parameterized dynamic systems. Our proposed Bayesian algorithm successfully estimated parameters in nonlinear mathematical models for biological pathways. This method can be further extended to high order systems and thus provides a useful tool to analyze biological dynamics and extract information using temporal data.
NASA Astrophysics Data System (ADS)
Krishnanathan, Kirubhakaran; Anderson, Sean R.; Billings, Stephen A.; Kadirkamanathan, Visakan
2016-11-01
In this paper, we derive a system identification framework for continuous-time nonlinear systems, for the first time using a simulation-focused computational Bayesian approach. Simulation approaches to nonlinear system identification have been shown to outperform regression methods under certain conditions, such as non-persistently exciting inputs and fast-sampling. We use the approximate Bayesian computation (ABC) algorithm to perform simulation-based inference of model parameters. The framework has the following main advantages: (1) parameter distributions are intrinsically generated, giving the user a clear description of uncertainty, (2) the simulation approach avoids the difficult problem of estimating signal derivatives as is common with other continuous-time methods, and (3) as noted above, the simulation approach improves identification under conditions of non-persistently exciting inputs and fast-sampling. Term selection is performed by judging parameter significance using parameter distributions that are intrinsically generated as part of the ABC procedure. The results from a numerical example demonstrate that the method performs well in noisy scenarios, especially in comparison to competing techniques that rely on signal derivative estimation.
Perdikaris, Paris; Karniadakis, George Em
2016-05-01
We present a computational framework for model inversion based on multi-fidelity information fusion and Bayesian optimization. The proposed methodology targets the accurate construction of response surfaces in parameter space, and the efficient pursuit to identify global optima while keeping the number of expensive function evaluations at a minimum. We train families of correlated surrogates on available data using Gaussian processes and auto-regressive stochastic schemes, and exploit the resulting predictive posterior distributions within a Bayesian optimization setting. This enables a smart adaptive sampling procedure that uses the predictive posterior variance to balance the exploration versus exploitation trade-off, and is a key enabler for practical computations under limited budgets. The effectiveness of the proposed framework is tested on three parameter estimation problems. The first two involve the calibration of outflow boundary conditions of blood flow simulations in arterial bifurcations using multi-fidelity realizations of one- and three-dimensional models, whereas the last one aims to identify the forcing term that generated a particular solution to an elliptic partial differential equation. © 2016 The Author(s).
Perdikaris, Paris; Karniadakis, George Em
2016-01-01
We present a computational framework for model inversion based on multi-fidelity information fusion and Bayesian optimization. The proposed methodology targets the accurate construction of response surfaces in parameter space, and the efficient pursuit to identify global optima while keeping the number of expensive function evaluations at a minimum. We train families of correlated surrogates on available data using Gaussian processes and auto-regressive stochastic schemes, and exploit the resulting predictive posterior distributions within a Bayesian optimization setting. This enables a smart adaptive sampling procedure that uses the predictive posterior variance to balance the exploration versus exploitation trade-off, and is a key enabler for practical computations under limited budgets. The effectiveness of the proposed framework is tested on three parameter estimation problems. The first two involve the calibration of outflow boundary conditions of blood flow simulations in arterial bifurcations using multi-fidelity realizations of one- and three-dimensional models, whereas the last one aims to identify the forcing term that generated a particular solution to an elliptic partial differential equation. PMID:27194481
NASA Astrophysics Data System (ADS)
Umehara, Hiroaki; Okada, Masato; Naruse, Yasushi
2018-03-01
The estimation of angular time series data is a widespread issue relating to various situations involving rotational motion and moving objects. There are two kinds of problem settings: the estimation of wrapped angles, which are principal values in a circular coordinate system (e.g., the direction of an object), and the estimation of unwrapped angles in an unbounded coordinate system such as for the positioning and tracking of moving objects measured by the signal-wave phase. Wrapped angles have been estimated in previous studies by sequential Bayesian filtering; however, the hyperparameters that are to be solved and that control the properties of the estimation model were given a priori. The present study establishes a procedure of hyperparameter estimation from the observation data of angles only, using the framework of Bayesian inference completely as the maximum likelihood estimation. Moreover, the filter model is modified to estimate the unwrapped angles. It is proved that without noise our model reduces to the existing algorithm of Itoh's unwrapping transform. It is numerically confirmed that our model is an extension of unwrapping estimation from Itoh's unwrapping transform to the case with noise.
An Uncertainty Quantification Framework for Prognostics and Condition-Based Monitoring
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Goebel, Kai
2014-01-01
This paper presents a computational framework for uncertainty quantification in prognostics in the context of condition-based monitoring of aerospace systems. The different sources of uncertainty and the various uncertainty quantification activities in condition-based prognostics are outlined in detail, and it is demonstrated that the Bayesian subjective approach is suitable for interpreting uncertainty in online monitoring. A state-space model-based framework for prognostics, that can rigorously account for the various sources of uncertainty, is presented. Prognostics consists of two important steps. First, the state of the system is estimated using Bayesian tracking, and then, the future states of the system are predicted until failure, thereby computing the remaining useful life of the system. The proposed framework is illustrated using the power system of a planetary rover test-bed, which is being developed and studied at NASA Ames Research Center.
An Adaptive Model of Student Performance Using Inverse Bayes
ERIC Educational Resources Information Center
Lang, Charles
2014-01-01
This article proposes a coherent framework for the use of Inverse Bayesian estimation to summarize and make predictions about student behaviour in adaptive educational settings. The Inverse Bayes Filter utilizes Bayes theorem to estimate the relative impact of contextual factors and internal student factors on student performance using time series…
Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murakami, Haruko; Chen, X.; Hahn, Melanie S.
2010-10-21
This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are itsmore » ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using the Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using the actual data that was not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.« less
Applications of Bayesian spectrum representation in acoustics
NASA Astrophysics Data System (ADS)
Botts, Jonathan M.
This dissertation utilizes a Bayesian inference framework to enhance the solution of inverse problems where the forward model maps to acoustic spectra. A Bayesian solution to filter design inverts a acoustic spectra to pole-zero locations of a discrete-time filter model. Spatial sound field analysis with a spherical microphone array is a data analysis problem that requires inversion of spatio-temporal spectra to directions of arrival. As with many inverse problems, a probabilistic analysis results in richer solutions than can be achieved with ad-hoc methods. In the filter design problem, the Bayesian inversion results in globally optimal coefficient estimates as well as an estimate the most concise filter capable of representing the given spectrum, within a single framework. This approach is demonstrated on synthetic spectra, head-related transfer function spectra, and measured acoustic reflection spectra. The Bayesian model-based analysis of spatial room impulse responses is presented as an analogous problem with equally rich solution. The model selection mechanism provides an estimate of the number of arrivals, which is necessary to properly infer the directions of simultaneous arrivals. Although, spectrum inversion problems are fairly ubiquitous, the scope of this dissertation has been limited to these two and derivative problems. The Bayesian approach to filter design is demonstrated on an artificial spectrum to illustrate the model comparison mechanism and then on measured head-related transfer functions to show the potential range of application. Coupled with sampling methods, the Bayesian approach is shown to outperform least-squares filter design methods commonly used in commercial software, confirming the need for a global search of the parameter space. The resulting designs are shown to be comparable to those that result from global optimization methods, but the Bayesian approach has the added advantage of a filter length estimate within the same unified framework. The application to reflection data is useful for representing frequency-dependent impedance boundaries in finite difference acoustic simulations. Furthermore, since the filter transfer function is a parametric model, it can be modified to incorporate arbitrary frequency weighting and account for the band-limited nature of measured reflection spectra. Finally, the model is modified to compensate for dispersive error in the finite difference simulation, from the filter design process. Stemming from the filter boundary problem, the implementation of pressure sources in finite difference simulation is addressed in order to assure that schemes properly converge. A class of parameterized source functions is proposed and shown to offer straightforward control of residual error in the simulation. Guided by the notion that the solution to be approximated affects the approximation error, sources are designed which reduce residual dispersive error to the size of round-off errors. The early part of a room impulse response can be characterized by a series of isolated plane waves. Measured with an array of microphones, plane waves map to a directional response of the array or spatial intensity map. Probabilistic inversion of this response results in estimates of the number and directions of image source arrivals. The model-based inversion is shown to avoid ambiguities associated with peak-finding or inspection of the spatial intensity map. For this problem, determining the number of arrivals in a given frame is critical for properly inferring the state of the sound field. This analysis is effectively compression of the spatial room response, which is useful for analysis or encoding of the spatial sound field. Parametric, model-based formulations of these problems enhance the solution in all cases, and a Bayesian interpretation provides a principled approach to model comparison and parameter estimation. v
Improving chemical species tomography of turbulent flows using covariance estimation.
Grauer, Samuel J; Hadwin, Paul J; Daun, Kyle J
2017-05-01
Chemical species tomography (CST) experiments can be divided into limited-data and full-rank cases. Both require solving ill-posed inverse problems, and thus the measurement data must be supplemented with prior information to carry out reconstructions. The Bayesian framework formalizes the role of additive information, expressed as the mean and covariance of a joint-normal prior probability density function. We present techniques for estimating the spatial covariance of a flow under limited-data and full-rank conditions. Our results show that incorporating a covariance estimate into CST reconstruction via a Bayesian prior increases the accuracy of instantaneous estimates. Improvements are especially dramatic in real-time limited-data CST, which is directly applicable to many industrially relevant experiments.
A Web-Based System for Bayesian Benchmark Dose Estimation.
Shao, Kan; Shapiro, Andrew J
2018-01-11
Benchmark dose (BMD) modeling is an important step in human health risk assessment and is used as the default approach to identify the point of departure for risk assessment. A probabilistic framework for dose-response assessment has been proposed and advocated by various institutions and organizations; therefore, a reliable tool is needed to provide distributional estimates for BMD and other important quantities in dose-response assessment. We developed an online system for Bayesian BMD (BBMD) estimation and compared results from this software with U.S. Environmental Protection Agency's (EPA's) Benchmark Dose Software (BMDS). The system is built on a Bayesian framework featuring the application of Markov chain Monte Carlo (MCMC) sampling for model parameter estimation and BMD calculation, which makes the BBMD system fundamentally different from the currently prevailing BMD software packages. In addition to estimating the traditional BMDs for dichotomous and continuous data, the developed system is also capable of computing model-averaged BMD estimates. A total of 518 dichotomous and 108 continuous data sets extracted from the U.S. EPA's Integrated Risk Information System (IRIS) database (and similar databases) were used as testing data to compare the estimates from the BBMD and BMDS programs. The results suggest that the BBMD system may outperform the BMDS program in a number of aspects, including fewer failed BMD and BMDL calculations and estimates. The BBMD system is a useful alternative tool for estimating BMD with additional functionalities for BMD analysis based on most recent research. Most importantly, the BBMD has the potential to incorporate prior information to make dose-response modeling more reliable and can provide distributional estimates for important quantities in dose-response assessment, which greatly facilitates the current trend for probabilistic risk assessment. https://doi.org/10.1289/EHP1289.
Law, Jane
2016-01-01
Intrinsic conditional autoregressive modeling in a Bayeisan hierarchical framework has been increasingly applied in small-area ecological studies. This study explores the specifications of spatial structure in this Bayesian framework in two aspects: adjacency, i.e., the set of neighbor(s) for each area; and (spatial) weight for each pair of neighbors. Our analysis was based on a small-area study of falling injuries among people age 65 and older in Ontario, Canada, that was aimed to estimate risks and identify risk factors of such falls. In the case study, we observed incorrect adjacencies information caused by deficiencies in the digital map itself. Further, when equal weights was replaced by weights based on a variable of expected count, the range of estimated risks increased, the number of areas with probability of estimated risk greater than one at different probability thresholds increased, and model fit improved. More importantly, significance of a risk factor diminished. Further research to thoroughly investigate different methods of variable weights; quantify the influence of specifications of spatial weights; and develop strategies for better defining spatial structure of a map in small-area analysis in Bayesian hierarchical spatial modeling is recommended. PMID:29546147
ERIC Educational Resources Information Center
Wang, Shiyu; Yang, Yan; Culpepper, Steven Andrew; Douglas, Jeffrey A.
2018-01-01
A family of learning models that integrates a cognitive diagnostic model and a higher-order, hidden Markov model in one framework is proposed. This new framework includes covariates to model skill transition in the learning environment. A Bayesian formulation is adopted to estimate parameters from a learning model. The developed methods are…
Incorporating approximation error in surrogate based Bayesian inversion
NASA Astrophysics Data System (ADS)
Zhang, J.; Zeng, L.; Li, W.; Wu, L.
2015-12-01
There are increasing interests in applying surrogates for inverse Bayesian modeling to reduce repetitive evaluations of original model. In this way, the computational cost is expected to be saved. However, the approximation error of surrogate model is usually overlooked. This is partly because that it is difficult to evaluate the approximation error for many surrogates. Previous studies have shown that, the direct combination of surrogates and Bayesian methods (e.g., Markov Chain Monte Carlo, MCMC) may lead to biased estimations when the surrogate cannot emulate the highly nonlinear original system. This problem can be alleviated by implementing MCMC in a two-stage manner. However, the computational cost is still high since a relatively large number of original model simulations are required. In this study, we illustrate the importance of incorporating approximation error in inverse Bayesian modeling. Gaussian process (GP) is chosen to construct the surrogate for its convenience in approximation error evaluation. Numerical cases of Bayesian experimental design and parameter estimation for contaminant source identification are used to illustrate this idea. It is shown that, once the surrogate approximation error is well incorporated into Bayesian framework, promising results can be obtained even when the surrogate is directly used, and no further original model simulations are required.
Is probabilistic bias analysis approximately Bayesian?
MacLehose, Richard F.; Gustafson, Paul
2011-01-01
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well. PMID:22157311
Probability-Based Recognition Framework for Underwater Landmarks Using Sonar Images †.
Lee, Yeongjun; Choi, Jinwoo; Ko, Nak Yong; Choi, Hyun-Taek
2017-08-24
This paper proposes a probability-based framework for recognizing underwater landmarks using sonar images. Current recognition methods use a single image, which does not provide reliable results because of weaknesses of the sonar image such as unstable acoustic source, many speckle noises, low resolution images, single channel image, and so on. However, using consecutive sonar images, if the status-i.e., the existence and identity (or name)-of an object is continuously evaluated by a stochastic method, the result of the recognition method is available for calculating the uncertainty, and it is more suitable for various applications. Our proposed framework consists of three steps: (1) candidate selection, (2) continuity evaluation, and (3) Bayesian feature estimation. Two probability methods-particle filtering and Bayesian feature estimation-are used to repeatedly estimate the continuity and feature of objects in consecutive images. Thus, the status of the object is repeatedly predicted and updated by a stochastic method. Furthermore, we develop an artificial landmark to increase detectability by an imaging sonar, which we apply to the characteristics of acoustic waves, such as instability and reflection depending on the roughness of the reflector surface. The proposed method is verified by conducting basin experiments, and the results are presented.
NASA Astrophysics Data System (ADS)
Sheng, Zheng
2013-02-01
The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to the global optimization algorithm, the Bayesian—MCMC can obtain not only the approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of solutions. The Bayesian—MCMC algorithm is implemented on the simulation radar sea-clutter data and the real radar sea-clutter data. Reference data are assumed to be simulation data and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles from the assumed simulation and the helicopter sounding data; (ii) the one-dimensional (1D) and two-dimensional (2D) posterior probability distribution of solutions.
Sironi, Emanuele; Pinchi, Vilma; Pradella, Francesco; Focardi, Martina; Bozza, Silvia; Taroni, Franco
2018-04-01
Not only does the Bayesian approach offer a rational and logical environment for evidence evaluation in a forensic framework, but it also allows scientists to coherently deal with uncertainty related to a collection of multiple items of evidence, due to its flexible nature. Such flexibility might come at the expense of elevated computational complexity, which can be handled by using specific probabilistic graphical tools, namely Bayesian networks. In the current work, such probabilistic tools are used for evaluating dental evidence related to the development of third molars. A set of relevant properties characterizing the graphical models are discussed and Bayesian networks are implemented to deal with the inferential process laying beyond the estimation procedure, as well as to provide age estimates. Such properties include operationality, flexibility, coherence, transparence and sensitivity. A data sample composed of Italian subjects was employed for the analysis; results were in agreement with previous studies in terms of point estimate and age classification. The influence of the prior probability elicitation in terms of Bayesian estimate and classifies was also analyzed. Findings also supported the opportunity to take into consideration multiple teeth in the evaluative procedure, since it can be shown this results in an increased robustness towards the prior probability elicitation process, as well as in more favorable outcomes from a forensic perspective. Copyright © 2018 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-ecient approach for damage determination that quanti es uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modi cations are made to the traditional Bayesian inference approach to provide signi cant computational speedup. First, an ecient surrogate model is constructed using sparse grid interpolation to replace a costly nite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modi ed using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more eciently. Numerical examples demonstrate that the proposed framework e ectively provides damage estimates with uncertainty quanti cation and can yield orders of magnitude speedup over standard Bayesian approaches.
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
NASA Astrophysics Data System (ADS)
McEwen, J. D.; Feeney, S. M.; Peiris, H. V.; Wiaux, Y.; Ringeval, C.; Bouchet, F. R.
2017-12-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high-energy scales. We develop a framework for cosmic string inference from observations of the CMB made over the celestial sphere, performing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension Gμ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations, we demonstrate the application of our framework and evaluate its performance. The method is sensitive to Gμ ∼ 5 × 10-7 for Nambu-Goto string simulations that include an integrated Sachs-Wolfe contribution only and do not include any recombination effects, before any parameters of the analysis are optimized. The sensitivity of the method compares favourably with other techniques applied to the same simulations.
NASA Astrophysics Data System (ADS)
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
Spielman, Stephanie J; Wilke, Claus O
2016-11-01
The mutation-selection model of coding sequence evolution has received renewed attention for its use in estimating site-specific amino acid propensities and selection coefficient distributions. Two computationally tractable mutation-selection inference frameworks have been introduced: One framework employs a fixed-effects, highly parameterized maximum likelihood approach, whereas the other employs a random-effects Bayesian Dirichlet Process approach. While both implementations follow the same model, they appear to make distinct predictions about the distribution of selection coefficients. The fixed-effects framework estimates a large proportion of highly deleterious substitutions, whereas the random-effects framework estimates that all substitutions are either nearly neutral or weakly deleterious. It remains unknown, however, how accurately each method infers evolutionary constraints at individual sites. Indeed, selection coefficient distributions pool all site-specific inferences, thereby obscuring a precise assessment of site-specific estimates. Therefore, in this study, we use a simulation-based strategy to determine how accurately each approach recapitulates the selective constraint at individual sites. We find that the fixed-effects approach, despite its extensive parameterization, consistently and accurately estimates site-specific evolutionary constraint. By contrast, the random-effects Bayesian approach systematically underestimates the strength of natural selection, particularly for slowly evolving sites. We also find that, despite the strong differences between their inferred selection coefficient distributions, the fixed- and random-effects approaches yield surprisingly similar inferences of site-specific selective constraint. We conclude that the fixed-effects mutation-selection framework provides the more reliable software platform for model application and future development. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bayesian calibration for electrochemical thermal model of lithium-ion cells
NASA Astrophysics Data System (ADS)
Tagade, Piyush; Hariharan, Krishnan S.; Basu, Suman; Verma, Mohan Kumar Singh; Kolake, Subramanya Mayya; Song, Taewon; Oh, Dukjin; Yeo, Taejung; Doo, Seokgwang
2016-07-01
Pseudo-two dimensional electrochemical thermal (P2D-ECT) model contains many parameters that are difficult to evaluate experimentally. Estimation of these model parameters is challenging due to computational cost and the transient model. Due to lack of complete physical understanding, this issue gets aggravated at extreme conditions like low temperature (LT) operations. This paper presents a Bayesian calibration framework for estimation of the P2D-ECT model parameters. The framework uses a matrix variate Gaussian process representation to obtain a computationally tractable formulation for calibration of the transient model. Performance of the framework is investigated for calibration of the P2D-ECT model across a range of temperatures (333 Ksbnd 263 K) and operating protocols. In the absence of complete physical understanding, the framework also quantifies structural uncertainty in the calibrated model. This information is used by the framework to test validity of the new physical phenomena before incorporation in the model. This capability is demonstrated by introducing temperature dependence on Bruggeman's coefficient and lithium plating formation at LT. With the incorporation of new physics, the calibrated P2D-ECT model accurately predicts the cell voltage with high confidence. The accurate predictions are used to obtain new insights into the low temperature lithium ion cell behavior.
A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data
Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P.; Engel, Lawrence S.; Kwok, Richard K.; Blair, Aaron; Stewart, Patricia A.
2016-01-01
Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method’s performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates and also provides the distributions of all of the parameters, which could be useful for making decisions in some applications. PMID:26209598
ERIC Educational Resources Information Center
de la Torre, Jimmy; Patz, Richard J.
2005-01-01
This article proposes a practical method that capitalizes on the availability of information from multiple tests measuring correlated abilities given in a single test administration. By simultaneously estimating different abilities with the use of a hierarchical Bayesian framework, more precise estimates for each ability dimension are obtained.…
Moving beyond qualitative evaluations of Bayesian models of cognition.
Hemmer, Pernille; Tauber, Sean; Steyvers, Mark
2015-06-01
Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.
Grieve, Richard; Nixon, Richard; Thompson, Simon G
2010-01-01
Cost-effectiveness analyses (CEA) may be undertaken alongside cluster randomized trials (CRTs) where randomization is at the level of the cluster (for example, the hospital or primary care provider) rather than the individual. Costs (and outcomes) within clusters may be correlated so that the assumption made by standard bivariate regression models, that observations are independent, is incorrect. This study develops a flexible modeling framework to acknowledge the clustering in CEA that use CRTs. The authors extend previous Bayesian bivariate models for CEA of multicenter trials to recognize the specific form of clustering in CRTs. They develop new Bayesian hierarchical models (BHMs) that allow mean costs and outcomes, and also variances, to differ across clusters. They illustrate how each model can be applied using data from a large (1732 cases, 70 primary care providers) CRT evaluating alternative interventions for reducing postnatal depression. The analyses compare cost-effectiveness estimates from BHMs with standard bivariate regression models that ignore the data hierarchy. The BHMs show high levels of cost heterogeneity across clusters (intracluster correlation coefficient, 0.17). Compared with standard regression models, the BHMs yield substantially increased uncertainty surrounding the cost-effectiveness estimates, and altered point estimates. The authors conclude that ignoring clustering can lead to incorrect inferences. The BHMs that they present offer a flexible modeling framework that can be applied more generally to CEA that use CRTs.
Bayesian estimation of multicomponent relaxation parameters in magnetic resonance fingerprinting.
McGivney, Debra; Deshmane, Anagha; Jiang, Yun; Ma, Dan; Badve, Chaitra; Sloan, Andrew; Gulani, Vikas; Griswold, Mark
2018-07-01
To estimate multiple components within a single voxel in magnetic resonance fingerprinting when the number and types of tissues comprising the voxel are not known a priori. Multiple tissue components within a single voxel are potentially separable with magnetic resonance fingerprinting as a result of differences in signal evolutions of each component. The Bayesian framework for inverse problems provides a natural and flexible setting for solving this problem when the tissue composition per voxel is unknown. Assuming that only a few entries from the dictionary contribute to a mixed signal, sparsity-promoting priors can be placed upon the solution. An iterative algorithm is applied to compute the maximum a posteriori estimator of the posterior probability density to determine the magnetic resonance fingerprinting dictionary entries that contribute most significantly to mixed or pure voxels. Simulation results show that the algorithm is robust in finding the component tissues of mixed voxels. Preliminary in vivo data confirm this result, and show good agreement in voxels containing pure tissue. The Bayesian framework and algorithm shown provide accurate solutions for the partial-volume problem in magnetic resonance fingerprinting. The flexibility of the method will allow further study into different priors and hyperpriors that can be applied in the model. Magn Reson Med 80:159-170, 2018. © 2017 International Society for Magnetic Resonance in Medicine. © 2017 International Society for Magnetic Resonance in Medicine.
Bayesian Modeling of Exposure and Airflow Using Two-Zone Models
Zhang, Yufen; Banerjee, Sudipto; Yang, Rui; Lungu, Claudiu; Ramachandran, Gurumurthy
2009-01-01
Mathematical modeling is being increasingly used as a means for assessing occupational exposures. However, predicting exposure in real settings is constrained by lack of quantitative knowledge of exposure determinants. Validation of models in occupational settings is, therefore, a challenge. Not only do the model parameters need to be known, the models also need to predict the output with some degree of accuracy. In this paper, a Bayesian statistical framework is used for estimating model parameters and exposure concentrations for a two-zone model. The model predicts concentrations in a zone near the source and far away from the source as functions of the toluene generation rate, air ventilation rate through the chamber, and the airflow between near and far fields. The framework combines prior or expert information on the physical model along with the observed data. The framework is applied to simulated data as well as data obtained from the experiments conducted in a chamber. Toluene vapors are generated from a source under different conditions of airflow direction, the presence of a mannequin, and simulated body heat of the mannequin. The Bayesian framework accounts for uncertainty in measurement as well as in the unknown rate of airflow between the near and far fields. The results show that estimates of the interzonal airflow are always close to the estimated equilibrium solutions, which implies that the method works efficiently. The predictions of near-field concentration for both the simulated and real data show nice concordance with the true values, indicating that the two-zone model assumptions agree with the reality to a large extent and the model is suitable for predicting the contaminant concentration. Comparison of the estimated model and its margin of error with the experimental data thus enables validation of the physical model assumptions. The approach illustrates how exposure models and information on model parameters together with the knowledge of uncertainty and variability in these quantities can be used to not only provide better estimates of model outputs but also model parameters. PMID:19403840
Bayesian B-spline mapping for dynamic quantitative traits.
Xing, Jun; Li, Jiahan; Yang, Runqing; Zhou, Xiaojing; Xu, Shizhong
2012-04-01
Owing to their ability and flexibility to describe individual gene expression at different time points, random regression (RR) analyses have become a popular procedure for the genetic analysis of dynamic traits whose phenotypes are collected over time. Specifically, when modelling the dynamic patterns of gene expressions in the RR framework, B-splines have been proved successful as an alternative to orthogonal polynomials. In the so-called Bayesian B-spline quantitative trait locus (QTL) mapping, B-splines are used to characterize the patterns of QTL effects and individual-specific time-dependent environmental errors over time, and the Bayesian shrinkage estimation method is employed to estimate model parameters. Extensive simulations demonstrate that (1) in terms of statistical power, Bayesian B-spline mapping outperforms the interval mapping based on the maximum likelihood; (2) for the simulated dataset with complicated growth curve simulated by B-splines, Legendre polynomial-based Bayesian mapping is not capable of identifying the designed QTLs accurately, even when higher-order Legendre polynomials are considered and (3) for the simulated dataset using Legendre polynomials, the Bayesian B-spline mapping can find the same QTLs as those identified by Legendre polynomial analysis. All simulation results support the necessity and flexibility of B-spline in Bayesian mapping of dynamic traits. The proposed method is also applied to a real dataset, where QTLs controlling the growth trajectory of stem diameters in Populus are located.
Bayesian spatio-temporal modeling of particulate matter concentrations in Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Manga, Edna; Awang, Norhashidah
2016-06-01
This article presents an application of a Bayesian spatio-temporal Gaussian process (GP) model on particulate matter concentrations from Peninsular Malaysia. We analyze daily PM10 concentration levels from 35 monitoring sites in June and July 2011. The spatiotemporal model set in a Bayesian hierarchical framework allows for inclusion of informative covariates, meteorological variables and spatiotemporal interactions. Posterior density estimates of the model parameters are obtained by Markov chain Monte Carlo methods. Preliminary data analysis indicate information on PM10 levels at sites classified as industrial locations could explain part of the space time variations. We include the site-type indicator in our modeling efforts. Results of the parameter estimates for the fitted GP model show significant spatio-temporal structure and positive effect of the location-type explanatory variable. We also compute some validation criteria for the out of sample sites that show the adequacy of the model for predicting PM10 at unmonitored sites.
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints
Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi
2017-01-01
The Common topological features of related species gene regulatory networks suggest reconstruction of the network of one species by using the further information from gene expressions profile of related species. We present an algorithm to reconstruct the gene regulatory network named; F-MAP, which applies the knowledge about gene interactions from related species. Our algorithm sets a Bayesian framework to estimate the precision matrix of one species microarray gene expressions dataset to infer the Gaussian Graphical model of the network. The conjugate Wishart prior is used and the information from related species is applied to estimate the hyperparameters of the prior distribution by using the factor analysis. Applying the proposed algorithm on six related species of drosophila shows that the precision of reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
The influence of ignoring secondary structure on divergence time estimates from ribosomal RNA genes.
Dohrmann, Martin
2014-02-01
Genes coding for ribosomal RNA molecules (rDNA) are among the most popular markers in molecular phylogenetics and evolution. However, coevolution of sites that code for pairing regions (stems) in the RNA secondary structure can make it challenging to obtain accurate results from such loci. While the influence of ignoring secondary structure on multiple sequence alignment and tree topology has been investigated in numerous studies, its effect on molecular divergence time estimates is still poorly known. Here, I investigate this issue in Bayesian Markov Chain Monte Carlo (BMCMC) and penalized likelihood (PL) frameworks, using empirical datasets from dragonflies (Odonata: Anisoptera) and glass sponges (Porifera: Hexactinellida). My results indicate that highly biased inferences under substitution models that ignore secondary structure only occur if maximum-likelihood estimates of branch lengths are used as input to PL dating, whereas in a BMCMC framework and in PL dating based on Bayesian consensus branch lengths, the effect is far less severe. I conclude that accounting for coevolution of paired sites in molecular dating studies is not as important as previously suggested, as long as the estimates are based on Bayesian consensus branch lengths instead of ML point estimates. This finding is especially relevant for studies where computational limitations do not allow the use of secondary-structure specific substitution models, or where accurate consensus structures cannot be predicted. I also found that the magnitude and direction (over- vs. underestimating node ages) of bias in age estimates when secondary structure is ignored was not distributed randomly across the nodes of the phylogenies, a phenomenon that requires further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.
Bayesian models for comparative analysis integrating phylogenetic uncertainty.
de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P
2012-06-28
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602
Bayesian switching factor analysis for estimating time-varying functional connectivity in fMRI.
Taghia, Jalil; Ryali, Srikanth; Chen, Tianwen; Supekar, Kaustubh; Cai, Weidong; Menon, Vinod
2017-07-15
There is growing interest in understanding the dynamical properties of functional interactions between distributed brain regions. However, robust estimation of temporal dynamics from functional magnetic resonance imaging (fMRI) data remains challenging due to limitations in extant multivariate methods for modeling time-varying functional interactions between multiple brain areas. Here, we develop a Bayesian generative model for fMRI time-series within the framework of hidden Markov models (HMMs). The model is a dynamic variant of the static factor analysis model (Ghahramani and Beal, 2000). We refer to this model as Bayesian switching factor analysis (BSFA) as it integrates factor analysis into a generative HMM in a unified Bayesian framework. In BSFA, brain dynamic functional networks are represented by latent states which are learnt from the data. Crucially, BSFA is a generative model which estimates the temporal evolution of brain states and transition probabilities between states as a function of time. An attractive feature of BSFA is the automatic determination of the number of latent states via Bayesian model selection arising from penalization of excessively complex models. Key features of BSFA are validated using extensive simulations on carefully designed synthetic data. We further validate BSFA using fingerprint analysis of multisession resting-state fMRI data from the Human Connectome Project (HCP). Our results show that modeling temporal dependencies in the generative model of BSFA results in improved fingerprinting of individual participants. Finally, we apply BSFA to elucidate the dynamic functional organization of the salience, central-executive, and default mode networks-three core neurocognitive systems with central role in cognitive and affective information processing (Menon, 2011). Across two HCP sessions, we demonstrate a high level of dynamic interactions between these networks and determine that the salience network has the highest temporal flexibility among the three networks. Our proposed methods provide a novel and powerful generative model for investigating dynamic brain connectivity. Copyright © 2017 Elsevier Inc. All rights reserved.
Combining statistical inference and decisions in ecology
Williams, Perry J.; Hooten, Mevin B.
2016-01-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.
Fully Bayesian Estimation of Data from Single Case Designs
ERIC Educational Resources Information Center
Rindskopf, David
2013-01-01
Single case designs (SCDs) generally consist of a small number of short time series in two or more phases. The analysis of SCDs statistically fits in the framework of a multilevel model, or hierarchical model. The usual analysis does not take into account the uncertainty in the estimation of the random effects. This not only has an effect on the…
Filtering in Hybrid Dynamic Bayesian Networks
NASA Technical Reports Server (NTRS)
Andersen, Morten Nonboe; Andersen, Rasmus Orum; Wheeler, Kevin
2000-01-01
We implement a 2-time slice dynamic Bayesian network (2T-DBN) framework and make a 1-D state estimation simulation, an extension of the experiment in (v.d. Merwe et al., 2000) and compare different filtering techniques. Furthermore, we demonstrate experimentally that inference in a complex hybrid DBN is possible by simulating fault detection in a watertank system, an extension of the experiment in (Koller & Lerner, 2000) using a hybrid 2T-DBN. In both experiments, we perform approximate inference using standard filtering techniques, Monte Carlo methods and combinations of these. In the watertank simulation, we also demonstrate the use of 'non-strict' Rao-Blackwellisation. We show that the unscented Kalman filter (UKF) and UKF in a particle filtering framework outperform the generic particle filter, the extended Kalman filter (EKF) and EKF in a particle filtering framework with respect to accuracy in terms of estimation RMSE and sensitivity with respect to choice of network structure. Especially we demonstrate the superiority of UKF in a PF framework when our beliefs of how data was generated are wrong. Furthermore, we investigate the influence of data noise in the watertank simulation using UKF and PFUKD and show that the algorithms are more sensitive to changes in the measurement noise level that the process noise level. Theory and implementation is based on (v.d. Merwe et al., 2000).
NASA Astrophysics Data System (ADS)
Kundu, Pradeep; Nath, Tameshwer; Palani, I. A.; Lad, Bhupesh K.
2018-06-01
The present paper tackles an important but unmapped problem of the reliability estimations of smart materials. First, an experimental setup is developed for accelerated life testing of the shape memory alloy (SMA) springs. Generalized log-linear Weibull (GLL-Weibull) distribution-based novel approach is then developed for SMA spring life estimation. Applied stimulus (voltage), elongation and cycles of operation are used as inputs for the life prediction model. The values of the parameter coefficients of the model provide better interpretability compared to artificial intelligence based life prediction approaches. In addition, the model also considers the effect of operating conditions, making it generic for a range of the operating conditions. Moreover, a Bayesian framework is used to continuously update the prediction with the actual degradation value of the springs, thereby reducing the uncertainty in the data and improving the prediction accuracy. In addition, the deterioration of material with number of cycles is also investigated using thermogravimetric analysis and scanning electron microscopy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, Robert N.; Urban, Marie L.; Duchscherer, Samantha E.
Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the artmore » by introducing the Population Data Tables (PDT), a Bayesian based informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000ft2 for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.Understanding building occupancy is critical to a wide array of applications including natural hazards loss analysis, green building technologies, and population distribution modeling. Due to the expense of directly monitoring buildings, scientists rely in addition on a wide and disparate array of ancillary and open source information including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods which refer to a loose collection of formal and informal techniques for fusing data together to create viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the Population Data Tables (PDT), a Bayesian model and informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that retains in the final estimation uncertainty emerging from data, expert judgment, and model parameterization. PDT probabilistically estimates ambient occupancy in units of people/1000 ft 2 for over 50 building types at the national and sub-national level with the goal of providing global coverage. The challenge of global coverage led to the development of an interdisciplinary geospatial informatics system tool that provides the framework for capturing, storing, and managing open source data, handling subject matter expertise, carrying out Bayesian analytics as well as visualizing and exporting occupancy estimation results. We present the PDT project, situate the work within the larger community, and report on the progress of this multi-year project.« less
Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L
2012-09-01
Geostatistical methods are widely used in estimating long-term exposures for epidemiological studies on air pollution, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and the uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian maximum entropy (BME) method and applied this framework to estimate fine particulate matter (PM(2.5)) yearly average concentrations over the contiguous US. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingness in the air-monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM(2.5) data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM(2.5). Moreover, the MWBME method further reduces the MSE by 8.4-43.7%, with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM(2.5) across large geographical domains with expected spatial non-stationarity.
Alderman, Phillip D.; Stanfill, Bryan
2016-10-06
Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relativemore » contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.« less
Inverse and forward modeling under uncertainty using MRE-based Bayesian approach
NASA Astrophysics Data System (ADS)
Hou, Z.; Rubin, Y.
2004-12-01
A stochastic inverse approach for subsurface characterization is proposed and applied to shallow vadose zone at a winery field site in north California and to a gas reservoir at the Ormen Lange field site in the North Sea. The approach is formulated in a Bayesian-stochastic framework, whereby the unknown parameters are identified in terms of their statistical moments or their probabilities. Instead of the traditional single-valued estimation /prediction provided by deterministic methods, the approach gives a probability distribution for an unknown parameter. This allows calculating the mean, the mode, and the confidence interval, which is useful for a rational treatment of uncertainty and its consequences. The approach also allows incorporating data of various types and different error levels, including measurements of state variables as well as information such as bounds on or statistical moments of the unknown parameters, which may represent prior information. To obtain minimally subjective prior probabilities required for the Bayesian approach, the principle of Minimum Relative Entropy (MRE) is employed. The approach is tested in field sites for flow parameters identification and soil moisture estimation in the vadose zone and for gas saturation estimation at great depth below the ocean floor. Results indicate the potential of coupling various types of field data within a MRE-based Bayesian formalism for improving the estimation of the parameters of interest.
A Comparison of the β-Substitution Method and a Bayesian Method for Analyzing Left-Censored Data.
Huynh, Tran; Quick, Harrison; Ramachandran, Gurumurthy; Banerjee, Sudipto; Stenzel, Mark; Sandler, Dale P; Engel, Lawrence S; Kwok, Richard K; Blair, Aaron; Stewart, Patricia A
2016-01-01
Classical statistical methods for analyzing exposure data with values below the detection limits are well described in the occupational hygiene literature, but an evaluation of a Bayesian approach for handling such data is currently lacking. Here, we first describe a Bayesian framework for analyzing censored data. We then present the results of a simulation study conducted to compare the β-substitution method with a Bayesian method for exposure datasets drawn from lognormal distributions and mixed lognormal distributions with varying sample sizes, geometric standard deviations (GSDs), and censoring for single and multiple limits of detection. For each set of factors, estimates for the arithmetic mean (AM), geometric mean, GSD, and the 95th percentile (X0.95) of the exposure distribution were obtained. We evaluated the performance of each method using relative bias, the root mean squared error (rMSE), and coverage (the proportion of the computed 95% uncertainty intervals containing the true value). The Bayesian method using non-informative priors and the β-substitution method were generally comparable in bias and rMSE when estimating the AM and GM. For the GSD and the 95th percentile, the Bayesian method with non-informative priors was more biased and had a higher rMSE than the β-substitution method, but use of more informative priors generally improved the Bayesian method's performance, making both the bias and the rMSE more comparable to the β-substitution method. An advantage of the Bayesian method is that it provided estimates of uncertainty for these parameters of interest and good coverage, whereas the β-substitution method only provided estimates of uncertainty for the AM, and coverage was not as consistent. Selection of one or the other method depends on the needs of the practitioner, the availability of prior information, and the distribution characteristics of the measurement data. We suggest the use of Bayesian methods if the practitioner has the computational resources and prior information, as the method would generally provide accurate estimates and also provides the distributions of all of the parameters, which could be useful for making decisions in some applications. © The Author 2015. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
Estimating mountain basin-mean precipitation from streamflow using Bayesian inference
NASA Astrophysics Data System (ADS)
Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Lundquist, Jessica D.
2015-10-01
Estimating basin-mean precipitation in complex terrain is difficult due to uncertainty in the topographical representativeness of precipitation gauges relative to the basin. To address this issue, we use Bayesian methodology coupled with a multimodel framework to infer basin-mean precipitation from streamflow observations, and we apply this approach to snow-dominated basins in the Sierra Nevada of California. Using streamflow observations, forcing data from lower-elevation stations, the Bayesian Total Error Analysis (BATEA) methodology and the Framework for Understanding Structural Errors (FUSE), we infer basin-mean precipitation, and compare it to basin-mean precipitation estimated using topographically informed interpolation from gauges (PRISM, the Parameter-elevation Regression on Independent Slopes Model). The BATEA-inferred spatial patterns of precipitation show agreement with PRISM in terms of the rank of basins from wet to dry but differ in absolute values. In some of the basins, these differences may reflect biases in PRISM, because some implied PRISM runoff ratios may be inconsistent with the regional climate. We also infer annual time series of basin precipitation using a two-step calibration approach. Assessment of the precision and robustness of the BATEA approach suggests that uncertainty in the BATEA-inferred precipitation is primarily related to uncertainties in hydrologic model structure. Despite these limitations, time series of inferred annual precipitation under different model and parameter assumptions are strongly correlated with one another, suggesting that this approach is capable of resolving year-to-year variability in basin-mean precipitation.
Bayesian flood forecasting methods: A review
NASA Astrophysics Data System (ADS)
Han, Shasha; Coulibaly, Paulin
2017-08-01
Over the past few decades, floods have been seen as one of the most common and largely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides suitable theoretical structure, empirically validated models and reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review on Bayesian forecasting approaches applied in flood forecasting from 1999 till now. The review starts with an overview of fundamentals of BFS and recent advances in BFS, followed with BFS application in river stage forecasting and real-time flood forecasting, then move to a critical analysis by evaluating advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses the future research direction in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation, it considers all sources of uncertainties and produces a predictive distribution of the river stage, river discharge or runoff, thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome limitations of single model or fixed model weight and effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been developed and widely applied, but there is still room for improvements. Future research in the context of Bayesian flood forecasting should be on assimilation of various sources of newly available information and improvement of predictive performance assessment methods.
NASA Astrophysics Data System (ADS)
Figueira, P.; Faria, J. P.; Adibekyan, V. Zh.; Oshagh, M.; Santos, N. C.
2016-11-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, ρ, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using python programming language and the pyMC module in a very short (˜ 130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the log(R^' }_{ {HK}}) indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.
On a full Bayesian inference for force reconstruction problems
NASA Astrophysics Data System (ADS)
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
A patient-specific segmentation framework for longitudinal MR images of traumatic brain injury
NASA Astrophysics Data System (ADS)
Wang, Bo; Prastawa, Marcel; Irimia, Andrei; Chambers, Micah C.; Vespa, Paul M.; Van Horn, John D.; Gerig, Guido
2012-02-01
Traumatic brain injury (TBI) is a major cause of death and disability worldwide. Robust, reproducible segmentations of MR images with TBI are crucial for quantitative analysis of recovery and treatment efficacy. However, this is a significant challenge due to severe anatomy changes caused by edema (swelling), bleeding, tissue deformation, skull fracture, and other effects related to head injury. In this paper, we introduce a multi-modal image segmentation framework for longitudinal TBI images. The framework is initialized through manual input of primary lesion sites at each time point, which are then refined by a joint approach composed of Bayesian segmentation and construction of a personalized atlas. The personalized atlas construction estimates the average of the posteriors of the Bayesian segmentation at each time point and warps the average back to each time point to provide the updated priors for Bayesian segmentation. The difference between our approach and segmenting longitudinal images independently is that we use the information from all time points to improve the segmentations. Given a manual initialization, our framework automatically segments healthy structures (white matter, grey matter, cerebrospinal fluid) as well as different lesions such as hemorrhagic lesions and edema. Our framework can handle different sets of modalities at each time point, which provides flexibility in analyzing clinical scans. We show results on three subjects with acute baseline scans and chronic follow-up scans. The results demonstrate that joint analysis of all the points yields improved segmentation compared to independent analysis of the two time points.
A Framework for Inferring Taxonomic Class of Asteroids.
NASA Technical Reports Server (NTRS)
Dotson, J. L.; Mathias, D. L.
2017-01-01
Introduction: Taxonomic classification of asteroids based on their visible / near-infrared spectra or multi band photometry has proven to be a useful tool to infer other properties about asteroids. Meteorite analogs have been identified for several taxonomic classes, permitting detailed inference about asteroid composition. Trends have been identified between taxonomy and measured asteroid density. Thanks to NEOWise (Near-Earth-Object Wide-field Infrared Survey Explorer) and Spitzer (Spitzer Space Telescope), approximately twice as many asteroids have measured albedos than the number with taxonomic classifications. (If one only considers spectroscopically determined classifications, the ratio is greater than 40.) We present a Bayesian framework that provides probabilistic estimates of the taxonomic class of an asteroid based on its albedo. Although probabilistic estimates of taxonomic classes are not a replacement for spectroscopic or photometric determinations, they can be a useful tool for identifying objects for further study or for asteroid threat assessment models. Inputs and Framework: The framework relies upon two inputs: the expected fraction of each taxonomic class in the population and the albedo distribution of each class. Luckily, numerous authors have addressed both of these questions. For example, the taxonomic distribution by number, surface area and mass of the main belt has been estimated and a diameter limited estimate of fractional abundances of the near earth asteroid population was made. Similarly, the albedo distributions for taxonomic classes have been estimated for the combined main belt and NEA (Near Earth Asteroid) populations in different taxonomic systems and for the NEA population specifically. The framework utilizes a Bayesian inference appropriate for categorical data. The population fractions provide the prior while the albedo distributions allow calculation of the likelihood an albedo measurement is consistent with a given taxonomic class. These inputs allows calculation of the probability an asteroid with a specified albedo belongs to any given taxonomic class.
Peressutti, Devis; Penney, Graeme P; Housden, R James; Kolbitsch, Christoph; Gomez, Alberto; Rijkhorst, Erik-Jan; Barratt, Dean C; Rhode, Kawal S; King, Andrew P
2013-05-01
In image-guided cardiac interventions, respiratory motion causes misalignments between the pre-procedure roadmap of the heart used for guidance and the intra-procedure position of the heart, reducing the accuracy of the guidance information and leading to potentially dangerous consequences. We propose a novel technique for motion-correcting the pre-procedural information that combines a probabilistic MRI-derived affine motion model with intra-procedure real-time 3D echocardiography (echo) images in a Bayesian framework. The probabilistic model incorporates a measure of confidence in its motion estimates which enables resolution of the potentially conflicting information supplied by the model and the echo data. Unlike models proposed so far, our method allows the final motion estimate to deviate from the model-produced estimate according to the information provided by the echo images, so adapting to the complex variability of respiratory motion. The proposed method is evaluated using gold-standard MRI-derived motion fields and simulated 3D echo data for nine volunteers and real 3D live echo images for four volunteers. The Bayesian method is compared to 5 other motion estimation techniques and results show mean/max improvements in estimation accuracy of 10.6%/18.9% for simulated echo images and 20.8%/41.5% for real 3D live echo data, over the best comparative estimation method. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques
2015-12-01
In the eventuality of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by the means of an advanced iterative Monte-Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to those coming from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.
Combining statistical inference and decisions in ecology.
Williams, Perry J; Hooten, Mevin B
2016-09-01
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods, including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem. © 2016 by the Ecological Society of America.
A Bayesian Framework for Human Body Pose Tracking from Depth Image Sequences
Zhu, Youding; Fujimura, Kikuo
2010-01-01
This paper addresses the problem of accurate and robust tracking of 3D human body pose from depth image sequences. Recovering the large number of degrees of freedom in human body movements from a depth image sequence is challenging due to the need to resolve the depth ambiguity caused by self-occlusions and the difficulty to recover from tracking failure. Human body poses could be estimated through model fitting using dense correspondences between depth data and an articulated human model (local optimization method). Although it usually achieves a high accuracy due to dense correspondences, it may fail to recover from tracking failure. Alternately, human pose may be reconstructed by detecting and tracking human body anatomical landmarks (key-points) based on low-level depth image analysis. While this method (key-point based method) is robust and recovers from tracking failure, its pose estimation accuracy depends solely on image-based localization accuracy of key-points. To address these limitations, we present a flexible Bayesian framework for integrating pose estimation results obtained by methods based on key-points and local optimization. Experimental results are shown and performance comparison is presented to demonstrate the effectiveness of the proposed approach. PMID:22399933
NASA Astrophysics Data System (ADS)
Fijani, E.; Chitsazan, N.; Nadiri, A.; Tsai, F. T.; Asghari Moghaddam, A.
2012-12-01
Artificial Neural Networks (ANNs) have been widely used to estimate concentration of chemicals in groundwater systems. However, estimation uncertainty is rarely discussed in the literature. Uncertainty in ANN output stems from three sources: ANN inputs, ANN parameters (weights and biases), and ANN structures. Uncertainty in ANN inputs may come from input data selection and/or input data error. ANN parameters are naturally uncertain because they are maximum-likelihood estimated. ANN structure is also uncertain because there is no unique ANN model given a specific case. Therefore, multiple plausible AI models are generally resulted for a study. One might ask why good models have to be ignored in favor of the best model in traditional estimation. What is the ANN estimation variance? How do the variances from different ANN models accumulate to the total estimation variance? To answer these questions we propose a Hierarchical Bayesian Model Averaging (HBMA) framework. Instead of choosing one ANN model (the best ANN model) for estimation, HBMA averages outputs of all plausible ANN models. The model weights are based on the evidence of data. Therefore, the HBMA avoids overconfidence on the single best ANN model. In addition, HBMA is able to analyze uncertainty propagation through aggregation of ANN models in a hierarchy framework. This method is applied for estimation of fluoride concentration in the Poldasht plain and the Bazargan plain in Iran. Unusually high fluoride concentration in the Poldasht and Bazargan plains has caused negative effects on the public health. Management of this anomaly requires estimation of fluoride concentration distribution in the area. The results show that the HBMA provides a knowledge-decision-based framework that facilitates analyzing and quantifying ANN estimation uncertainties from different sources. In addition HBMA allows comparative evaluation of the realizations for each source of uncertainty by segregating the uncertainty sources in a hierarchical framework. Fluoride concentration estimation using the HBMA method shows better agreement to the observation data in the test step because they are not based on a single model with a non-dominate weights.
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-01-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005–2007. PMID:21776223
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-06-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe
2018-02-15
We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
True versus Apparent Malaria Infection Prevalence: The Contribution of a Bayesian Approach
Claes, Filip; Van Hong, Nguyen; Torres, Kathy; Mao, Sokny; Van den Eede, Peter; Thi Thinh, Ta; Gamboa, Dioni; Sochantha, Tho; Thang, Ngo Duc; Coosemans, Marc; Büscher, Philippe; D'Alessandro, Umberto; Berkvens, Dirk; Erhart, Annette
2011-01-01
Aims To present a new approach for estimating the “true prevalence” of malaria and apply it to datasets from Peru, Vietnam, and Cambodia. Methods Bayesian models were developed for estimating both the malaria prevalence using different diagnostic tests (microscopy, PCR & ELISA), without the need of a gold standard, and the tests' characteristics. Several sources of information, i.e. data, expert opinions and other sources of knowledge can be integrated into the model. This approach resulting in an optimal and harmonized estimate of malaria infection prevalence, with no conflict between the different sources of information, was tested on data from Peru, Vietnam and Cambodia. Results Malaria sero-prevalence was relatively low in all sites, with ELISA showing the highest estimates. The sensitivity of microscopy and ELISA were statistically lower in Vietnam than in the other sites. Similarly, the specificities of microscopy, ELISA and PCR were significantly lower in Vietnam than in the other sites. In Vietnam and Peru, microscopy was closer to the “true” estimate than the other 2 tests while as expected ELISA, with its lower specificity, usually overestimated the prevalence. Conclusions Bayesian methods are useful for analyzing prevalence results when no gold standard diagnostic test is available. Though some results are expected, e.g. PCR more sensitive than microscopy, a standardized and context-independent quantification of the diagnostic tests' characteristics (sensitivity and specificity) and the underlying malaria prevalence may be useful for comparing different sites. Indeed, the use of a single diagnostic technique could strongly bias the prevalence estimation. This limitation can be circumvented by using a Bayesian framework taking into account the imperfect characteristics of the currently available diagnostic tests. As discussed in the paper, this approach may further support global malaria burden estimation initiatives. PMID:21364745
NASA Astrophysics Data System (ADS)
Maiti, Saumen; Tiwari, Ram Krishna
2010-10-01
A new probabilistic approach based on the concept of Bayesian neural network (BNN) learning theory is proposed for decoding litho-facies boundaries from well-log data. We show that how a multi-layer-perceptron neural network model can be employed in Bayesian framework to classify changes in litho-log successions. The method is then applied to the German Continental Deep Drilling Program (KTB) well-log data for classification and uncertainty estimation in the litho-facies boundaries. In this framework, a posteriori distribution of network parameter is estimated via the principle of Bayesian probabilistic theory, and an objective function is minimized following the scaled conjugate gradient optimization scheme. For the model development, we inflict a suitable criterion, which provides probabilistic information by emulating different combinations of synthetic data. Uncertainty in the relationship between the data and the model space is appropriately taken care by assuming a Gaussian a priori distribution of networks parameters (e.g., synaptic weights and biases). Prior to applying the new method to the real KTB data, we tested the proposed method on synthetic examples to examine the sensitivity of neural network hyperparameters in prediction. Within this framework, we examine stability and efficiency of this new probabilistic approach using different kinds of synthetic data assorted with different level of correlated noise. Our data analysis suggests that the designed network topology based on the Bayesian paradigm is steady up to nearly 40% correlated noise; however, adding more noise (˜50% or more) degrades the results. We perform uncertainty analyses on training, validation, and test data sets with and devoid of intrinsic noise by making the Gaussian approximation of the a posteriori distribution about the peak model. We present a standard deviation error-map at the network output corresponding to the three types of the litho-facies present over the entire litho-section of the KTB. The comparisons of maximum a posteriori geological sections constructed here, based on the maximum a posteriori probability distribution, with the available geological information and the existing geophysical findings suggest that the BNN results reveal some additional finer details in the KTB borehole data at certain depths, which appears to be of some geological significance. We also demonstrate that the proposed BNN approach is superior to the conventional artificial neural network in terms of both avoiding "over-fitting" and aiding uncertainty estimation, which are vital for meaningful interpretation of geophysical records. Our analyses demonstrate that the BNN-based approach renders a robust means for the classification of complex changes in the litho-facies successions and thus could provide a useful guide for understanding the crustal inhomogeneity and the structural discontinuity in many other tectonically complex regions.
Empirical Bayes estimation of proportions with application to cowbird parasitism rates
Link, W.A.; Hahn, D.C.
1996-01-01
Bayesian models provide a structure for studying collections of parameters such as are considered in the investigation of communities, ecosystems, and landscapes. This structure allows for improved estimation of individual parameters, by considering them in the context of a group of related parameters. Individual estimates are differentially adjusted toward an overall mean, with the magnitude of their adjustment based on their precision. Consequently, Bayesian estimation allows for a more credible identification of extreme values in a collection of estimates. Bayesian models regard individual parameters as values sampled from a specified probability distribution, called a prior. The requirement that the prior be known is often regarded as an unattractive feature of Bayesian analysis and may be the reason why Bayesian analyses are not frequently applied in ecological studies. Empirical Bayes methods provide an alternative approach that incorporates the structural advantages of Bayesian models while requiring a less stringent specification of prior knowledge. Rather than requiring that the prior distribution be known, empirical Bayes methods require only that it be in a certain family of distributions, indexed by hyperparameters that can be estimated from the available data. This structure is of interest per se, in addition to its value in allowing for improved estimation of individual parameters; for example, hypotheses regarding the existence of distinct subgroups in a collection of parameters can be considered under the empirical Bayes framework by allowing the hyperparameters to vary among subgroups. Though empirical Bayes methods have been applied in a variety of contexts, they have received little attention in the ecological literature. We describe the empirical Bayes approach in application to estimation of proportions, using data obtained in a community-wide study of cowbird parasitism rates for illustration. Since observed proportions based on small sample sizes are heavily adjusted toward the mean, extreme values among empirical Bayes estimates identify those species for which there is the greatest evidence of extreme parasitism rates. Applying a subgroup analysis to our data on cowbird parasitism rates, we conclude that parasitism rates for Neotropical Migrants as a group are no greater than those of Resident/Short-distance Migrant species in this forest community. Our data and analyses demonstrate that the parasitism rates for certain Neotropical Migrant species are remarkably low (Wood Thrush and Rose-breasted Grosbeak) while those for others are remarkably high (Ovenbird and Red-eyed Vireo).
Inferring on the Intentions of Others by Hierarchical Bayesian Learning
Diaconescu, Andreea O.; Mathys, Christoph; Weber, Lilian A. E.; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I.; Fehr, Ernst; Stephan, Klaas E.
2014-01-01
Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to “player” or “adviser” roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition. PMID:25187943
Rasmussen, Peter M.; Smith, Amy F.; Sakadžić, Sava; Boas, David A.; Pries, Axel R.; Secomb, Timothy W.; Østergaard, Leif
2017-01-01
Objective In vivo imaging of the microcirculation and network-oriented modeling have emerged as powerful means of studying microvascular function and understanding its physiological significance. Network-oriented modeling may provide the means of summarizing vast amounts of data produced by high-throughput imaging techniques in terms of key, physiological indices. To estimate such indices with sufficient certainty, however, network-oriented analysis must be robust to the inevitable presence of uncertainty due to measurement errors as well as model errors. Methods We propose the Bayesian probabilistic data analysis framework as a means of integrating experimental measurements and network model simulations into a combined and statistically coherent analysis. The framework naturally handles noisy measurements and provides posterior distributions of model parameters as well as physiological indices associated with uncertainty. Results We applied the analysis framework to experimental data from three rat mesentery networks and one mouse brain cortex network. We inferred distributions for more than five hundred unknown pressure and hematocrit boundary conditions. Model predictions were consistent with previous analyses, and remained robust when measurements were omitted from model calibration. Conclusion Our Bayesian probabilistic approach may be suitable for optimizing data acquisition and for analyzing and reporting large datasets acquired as part of microvascular imaging studies. PMID:27987383
Hu, Shiang; Yao, Dezhong; Valdes-Sosa, Pedro A
2018-01-01
The choice of reference for the electroencephalogram (EEG) is a long-lasting unsolved issue resulting in inconsistent usages and endless debates. Currently, both the average reference (AR) and the reference electrode standardization technique (REST) are two primary, apparently irreconcilable contenders. We propose a theoretical framework to resolve this reference issue by formulating both (a) estimation of potentials at infinity, and (b) determination of the reference, as a unified Bayesian linear inverse problem, which can be solved by maximum a posterior estimation. We find that AR and REST are very particular cases of this unified framework: AR results from biophysically non-informative prior; while REST utilizes the prior based on the EEG generative model. To allow for simultaneous denoising and reference estimation, we develop the regularized versions of AR and REST, named rAR and rREST, respectively. Both depend on a regularization parameter that is the noise to signal variance ratio. Traditional and new estimators are evaluated with this framework, by both simulations and analysis of real resting EEGs. Toward this end, we leverage the MRI and EEG data from 89 subjects which participated in the Cuban Human Brain Mapping Project. Generated artificial EEGs-with a known ground truth, show that relative error in estimating the EEG potentials at infinity is lowest for rREST. It also reveals that realistic volume conductor models improve the performances of REST and rREST. Importantly, for practical applications, it is shown that an average lead field gives the results comparable to the individual lead field. Finally, it is shown that the selection of the regularization parameter with Generalized Cross-Validation (GCV) is close to the "oracle" choice based on the ground truth. When evaluated with the real 89 resting state EEGs, rREST consistently yields the lowest GCV. This study provides a novel perspective to the EEG reference problem by means of a unified inverse solution framework. It may allow additional principled theoretical formulations and numerical evaluation of performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ciuca, Razvan; Hernández, Oscar F., E-mail: razvan.ciuca@mail.mcgill.ca, E-mail: oscarh@physics.mcgill.ca
There exists various proposals to detect cosmic strings from Cosmic Microwave Background (CMB) or 21 cm temperature maps. Current proposals do not aim to find the location of strings on sky maps, all of these approaches can be thought of as a statistic on a sky map. We propose a Bayesian interpretation of cosmic string detection and within that framework, we derive a connection between estimates of cosmic string locations and cosmic string tension G μ. We use this Bayesian framework to develop a machine learning framework for detecting strings from sky maps and outline how to implement this frameworkmore » with neural networks. The neural network we trained was able to detect and locate cosmic strings on noiseless CMB temperature map down to a string tension of G μ=5 ×10{sup −9} and when analyzing a CMB temperature map that does not contain strings, the neural network gives a 0.95 probability that G μ≤2.3×10{sup −9}.« less
Bayesian data fusion for spatial prediction of categorical variables in environmental sciences
NASA Astrophysics Data System (ADS)
Gengler, Sarah; Bogaert, Patrick
2014-12-01
First developed to predict continuous variables, Bayesian Maximum Entropy (BME) has become a complete framework in the context of space-time prediction since it has been extended to predict categorical variables and mixed random fields. This method proposes solutions to combine several sources of data whatever the nature of the information. However, the various attempts that were made for adapting the BME methodology to categorical variables and mixed random fields faced some limitations, as a high computational burden. The main objective of this paper is to overcome this limitation by generalizing the Bayesian Data Fusion (BDF) theoretical framework to categorical variables, which is somehow a simplification of the BME method through the convenient conditional independence hypothesis. The BDF methodology for categorical variables is first described and then applied to a practical case study: the estimation of soil drainage classes using a soil map and point observations in the sandy area of Flanders around the city of Mechelen (Belgium). The BDF approach is compared to BME along with more classical approaches, as Indicator CoKringing (ICK) and logistic regression. Estimators are compared using various indicators, namely the Percentage of Correctly Classified locations (PCC) and the Average Highest Probability (AHP). Although BDF methodology for categorical variables is somehow a simplification of BME approach, both methods lead to similar results and have strong advantages compared to ICK and logistic regression.
Akita, Yasuyuki; Chen, Jiu-Chiuan; Serre, Marc L.
2013-01-01
Geostatistical methods are widely used in estimating long-term exposures for air pollution epidemiological studies, despite their limited capabilities to handle spatial non-stationarity over large geographic domains and uncertainty associated with missing monitoring data. We developed a moving-window (MW) Bayesian Maximum Entropy (BME) method and applied this framework to estimate fine particulate matter (PM2.5) yearly average concentrations over the contiguous U.S. The MW approach accounts for the spatial non-stationarity, while the BME method rigorously processes the uncertainty associated with data missingnees in the air monitoring system. In the cross-validation analyses conducted on a set of randomly selected complete PM2.5 data in 2003 and on simulated data with different degrees of missing data, we demonstrate that the MW approach alone leads to at least 17.8% reduction in mean square error (MSE) in estimating the yearly PM2.5. Moreover, the MWBME method further reduces the MSE by 8.4% to 43.7% with the proportion of incomplete data increased from 18.3% to 82.0%. The MWBME approach leads to significant reductions in estimation error and thus is recommended for epidemiological studies investigating the effect of long-term exposure to PM2.5 across large geographical domains with expected spatial non-stationarity. PMID:22739679
Covariance specification and estimation to improve top-down Green House Gas emission estimates
NASA Astrophysics Data System (ADS)
Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Whetstone, J. R.
2015-12-01
The National Institute of Standards and Technology (NIST) operates the North-East Corridor (NEC) project and the Indianapolis Flux Experiment (INFLUX) in order to develop measurement methods to quantify sources of Greenhouse Gas (GHG) emissions as well as their uncertainties in urban domains using a top down inversion method. Top down inversion updates prior knowledge using observations in a Bayesian way. One primary consideration in a Bayesian inversion framework is the covariance structure of (1) the emission prior residuals and (2) the observation residuals (i.e. the difference between observations and model predicted observations). These covariance matrices are respectively referred to as the prior covariance matrix and the model-data mismatch covariance matrix. It is known that the choice of these covariances can have large effect on estimates. The main objective of this work is to determine the impact of different covariance models on inversion estimates and their associated uncertainties in urban domains. We use a pseudo-data Bayesian inversion framework using footprints (i.e. sensitivities of tower measurements of GHGs to surface emissions) and emission priors (based on Hestia project to quantify fossil-fuel emissions) to estimate posterior emissions using different covariance schemes. The posterior emission estimates and uncertainties are compared to the hypothetical truth. We find that, if we correctly specify spatial variability and spatio-temporal variability in prior and model-data mismatch covariances respectively, then we can compute more accurate posterior estimates. We discuss few covariance models to introduce space-time interacting mismatches along with estimation of the involved parameters. We then compare several candidate prior spatial covariance models from the Matern covariance class and estimate their parameters with specified mismatches. We find that best-fitted prior covariances are not always best in recovering the truth. To achieve accuracy, we perform a sensitivity study to further tune covariance parameters. Finally, we introduce a shrinkage based sample covariance estimation technique for both prior and mismatch covariances. This technique allows us to achieve similar accuracy nonparametrically in a more efficient and automated way.
NASA Astrophysics Data System (ADS)
Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Karion, A.; Mueller, K.; Gourdji, S.; Martin, C.; Whetstone, J. R.
2017-12-01
The National Institute of Standards and Technology (NIST) supports the North-East Corridor Baltimore Washington (NEC-B/W) project and Indianapolis Flux Experiment (INFLUX) aiming to quantify sources of Greenhouse Gas (GHG) emissions as well as their uncertainties. These projects employ different flux estimation methods including top-down inversion approaches. The traditional Bayesian inversion method estimates emission distributions by updating prior information using atmospheric observations of Green House Gases (GHG) coupled to an atmospheric and dispersion model. The magnitude of the update is dependent upon the observed enhancement along with the assumed errors such as those associated with prior information and the atmospheric transport and dispersion model. These errors are specified within the inversion covariance matrices. The assumed structure and magnitude of the specified errors can have large impact on the emission estimates from the inversion. The main objective of this work is to build a data-adaptive model for these covariances matrices. We construct a synthetic data experiment using a Kalman Filter inversion framework (Lopez et al., 2017) employing different configurations of transport and dispersion model and an assumed prior. Unlike previous traditional Bayesian approaches, we estimate posterior emissions using regularized sample covariance matrices associated with prior errors to investigate whether the structure of the matrices help to better recover our hypothetical true emissions. To incorporate transport model error, we use ensemble of transport models combined with space-time analytical covariance to construct a covariance that accounts for errors in space and time. A Kalman Filter is then run using these covariances along with Maximum Likelihood Estimates (MLE) of the involved parameters. Preliminary results indicate that specifying sptio-temporally varying errors in the error covariances can improve the flux estimates and uncertainties. We also demonstrate that differences between the modeled and observed meteorology can be used to predict uncertainties associated with atmospheric transport and dispersion modeling which can help improve the skill of an inversion at urban scales.
Bayesian exponential random graph modelling of interhospital patient referral networks.
Caimo, Alberto; Pallotti, Francesca; Lomi, Alessandro
2017-08-15
Using original data that we have collected on referral relations between 110 hospitals serving a large regional community, we show how recently derived Bayesian exponential random graph models may be adopted to illuminate core empirical issues in research on relational coordination among healthcare organisations. We show how a rigorous Bayesian computation approach supports a fully probabilistic analytical framework that alleviates well-known problems in the estimation of model parameters of exponential random graph models. We also show how the main structural features of interhospital patient referral networks that prior studies have described can be reproduced with accuracy by specifying the system of local dependencies that produce - but at the same time are induced by - decentralised collaborative arrangements between hospitals. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Improving the Accuracy of Planet Occurrence Rates from Kepler Using Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Hsu, Danley C.; Ford, Eric B.; Ragozzine, Darin; Morehead, Robert C.
2018-05-01
We present a new framework to characterize the occurrence rates of planet candidates identified by Kepler based on hierarchical Bayesian modeling, approximate Bayesian computing (ABC), and sequential importance sampling. For this study, we adopt a simple 2D grid in planet radius and orbital period as our model and apply our algorithm to estimate occurrence rates for Q1–Q16 planet candidates orbiting solar-type stars. We arrive at significantly increased planet occurrence rates for small planet candidates (R p < 1.25 R ⊕) at larger orbital periods (P > 80 day) compared to the rates estimated by the more common inverse detection efficiency method (IDEM). Our improved methodology estimates that the occurrence rate density of small planet candidates in the habitable zone of solar-type stars is {1.6}-0.5+1.2 per factor of 2 in planet radius and orbital period. Additionally, we observe a local minimum in the occurrence rate for strong planet candidates marginalized over orbital period between 1.5 and 2 R ⊕ that is consistent with previous studies. For future improvements, the forward modeling approach of ABC is ideally suited to incorporating multiple populations, such as planets, astrophysical false positives, and pipeline false alarms, to provide accurate planet occurrence rates and uncertainties. Furthermore, ABC provides a practical statistical framework for answering complex questions (e.g., frequency of different planetary architectures) and providing sound uncertainties, even in the face of complex selection effects, observational biases, and follow-up strategies. In summary, ABC offers a powerful tool for accurately characterizing a wide variety of astrophysical populations.
Comments: Causal Interpretations of Mediation Effects
ERIC Educational Resources Information Center
Jo, Booil; Stuart, Elizabeth A.
2012-01-01
The authors thank Dr. Lindsay Page for providing a nice illustration of the use of the principal stratification framework to define causal effects, and a Bayesian model for effect estimation. They hope that her well-written article will help expose education researchers to these concepts and methods, and move the field of mediation analysis in…
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2013-01-01
The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
Hierarchical models of animal abundance and occurrence
Royle, J. Andrew; Dorazio, R.M.
2006-01-01
Much of animal ecology is devoted to studies of abundance and occurrence of species, based on surveys of spatially referenced sample units. These surveys frequently yield sparse counts that are contaminated by imperfect detection, making direct inference about abundance or occurrence based on observational data infeasible. This article describes a flexible hierarchical modeling framework for estimation and inference about animal abundance and occurrence from survey data that are subject to imperfect detection. Within this framework, we specify models of abundance and detectability of animals at the level of the local populations defined by the sample units. Information at the level of the local population is aggregated by specifying models that describe variation in abundance and detection among sites. We describe likelihood-based and Bayesian methods for estimation and inference under the resulting hierarchical model. We provide two examples of the application of hierarchical models to animal survey data, the first based on removal counts of stream fish and the second based on avian quadrat counts. For both examples, we provide a Bayesian analysis of the models using the software WinBUGS.
A hierarchical Bayesian GEV model for improving local and regional flood quantile estimates
NASA Astrophysics Data System (ADS)
Lima, Carlos H. R.; Lall, Upmanu; Troy, Tara; Devineni, Naresh
2016-10-01
We estimate local and regional Generalized Extreme Value (GEV) distribution parameters for flood frequency analysis in a multilevel, hierarchical Bayesian framework, to explicitly model and reduce uncertainties. As prior information for the model, we assume that the GEV location and scale parameters for each site come from independent log-normal distributions, whose mean parameter scales with the drainage area. From empirical and theoretical arguments, the shape parameter for each site is shrunk towards a common mean. Non-informative prior distributions are assumed for the hyperparameters and the MCMC method is used to sample from the joint posterior distribution. The model is tested using annual maximum series from 20 streamflow gauges located in an 83,000 km2 flood prone basin in Southeast Brazil. The results show a significant reduction of uncertainty estimates of flood quantile estimates over the traditional GEV model, particularly for sites with shorter records. For return periods within the range of the data (around 50 years), the Bayesian credible intervals for the flood quantiles tend to be narrower than the classical confidence limits based on the delta method. As the return period increases beyond the range of the data, the confidence limits from the delta method become unreliable and the Bayesian credible intervals provide a way to estimate satisfactory confidence bands for the flood quantiles considering parameter uncertainties and regional information. In order to evaluate the applicability of the proposed hierarchical Bayesian model for regional flood frequency analysis, we estimate flood quantiles for three randomly chosen out-of-sample sites and compare with classical estimates using the index flood method. The posterior distributions of the scaling law coefficients are used to define the predictive distributions of the GEV location and scale parameters for the out-of-sample sites given only their drainage areas and the posterior distribution of the average shape parameter is taken as the regional predictive distribution for this parameter. While the index flood method does not provide a straightforward way to consider the uncertainties in the index flood and in the regional parameters, the results obtained here show that the proposed Bayesian method is able to produce adequate credible intervals for flood quantiles that are in accordance with empirical estimates.
Bayesian Decision Theoretical Framework for Clustering
ERIC Educational Resources Information Center
Chen, Mo
2011-01-01
In this thesis, we establish a novel probabilistic framework for the data clustering problem from the perspective of Bayesian decision theory. The Bayesian decision theory view justifies the important questions: what is a cluster and what a clustering algorithm should optimize. We prove that the spectral clustering (to be specific, the…
2012-09-01
make end of life ( EOL ) and remaining useful life (RUL) estimations. Model-based prognostics approaches perform these tasks with the help of first...in parameters Degradation Modeling Parameter estimation Prediction Thermal / Electrical Stress Experimental Data State Space model RUL EOL ...distribution at given single time point kP , and use this for multi-step predictions to EOL . There are several methods which exits for selecting the sigma
Improved Bayesian Infrasonic Source Localization for regional infrasound
Blom, Philip S.; Marcillo, Omar; Arrowsmith, Stephen J.
2015-10-20
The Bayesian Infrasonic Source Localization (BISL) methodology is examined and simplified providing a generalized method of estimating the source location and time for an infrasonic event and the mathematical framework is used therein. The likelihood function describing an infrasonic detection used in BISL has been redefined to include the von Mises distribution developed in directional statistics and propagation-based, physically derived celerity-range and azimuth deviation models. Frameworks for constructing propagation-based celerity-range and azimuth deviation statistics are presented to demonstrate how stochastic propagation modelling methods can be used to improve the precision and accuracy of the posterior probability density function describing themore » source localization. Infrasonic signals recorded at a number of arrays in the western United States produced by rocket motor detonations at the Utah Test and Training Range are used to demonstrate the application of the new mathematical framework and to quantify the improvement obtained by using the stochastic propagation modelling methods. Moreover, using propagation-based priors, the spatial and temporal confidence bounds of the source decreased by more than 40 per cent in all cases and by as much as 80 per cent in one case. Further, the accuracy of the estimates remained high, keeping the ground truth within the 99 per cent confidence bounds for all cases.« less
Probability-Based Recognition Framework for Underwater Landmarks Using Sonar Images †
Choi, Jinwoo; Choi, Hyun-Taek
2017-01-01
This paper proposes a probability-based framework for recognizing underwater landmarks using sonar images. Current recognition methods use a single image, which does not provide reliable results because of weaknesses of the sonar image such as unstable acoustic source, many speckle noises, low resolution images, single channel image, and so on. However, using consecutive sonar images, if the status—i.e., the existence and identity (or name)—of an object is continuously evaluated by a stochastic method, the result of the recognition method is available for calculating the uncertainty, and it is more suitable for various applications. Our proposed framework consists of three steps: (1) candidate selection, (2) continuity evaluation, and (3) Bayesian feature estimation. Two probability methods—particle filtering and Bayesian feature estimation—are used to repeatedly estimate the continuity and feature of objects in consecutive images. Thus, the status of the object is repeatedly predicted and updated by a stochastic method. Furthermore, we develop an artificial landmark to increase detectability by an imaging sonar, which we apply to the characteristics of acoustic waves, such as instability and reflection depending on the roughness of the reflector surface. The proposed method is verified by conducting basin experiments, and the results are presented. PMID:28837068
A novel Bayesian framework for discriminative feature extraction in Brain-Computer Interfaces.
Suk, Heung-Il; Lee, Seong-Whan
2013-02-01
As there has been a paradigm shift in the learning load from a human subject to a computer, machine learning has been considered as a useful tool for Brain-Computer Interfaces (BCIs). In this paper, we propose a novel Bayesian framework for discriminative feature extraction for motor imagery classification in an EEG-based BCI in which the class-discriminative frequency bands and the corresponding spatial filters are optimized by means of the probabilistic and information-theoretic approaches. In our framework, the problem of simultaneous spatiospectral filter optimization is formulated as the estimation of an unknown posterior probability density function (pdf) that represents the probability that a single-trial EEG of predefined mental tasks can be discriminated in a state. In order to estimate the posterior pdf, we propose a particle-based approximation method by extending a factored-sampling technique with a diffusion process. An information-theoretic observation model is also devised to measure discriminative power of features between classes. From the viewpoint of classifier design, the proposed method naturally allows us to construct a spectrally weighted label decision rule by linearly combining the outputs from multiple classifiers. We demonstrate the feasibility and effectiveness of the proposed method by analyzing the results and its success on three public databases.
Joint Bayesian inference for near-surface explosion yield
NASA Astrophysics Data System (ADS)
Bulaevskaya, V.; Ford, S. R.; Ramirez, A. L.; Rodgers, A. J.
2016-12-01
A near-surface explosion generates seismo-acoustic motion that is related to its yield. However, the recorded motion is affected by near-source effects such as depth-of-burial, and propagation-path effects such as variable geology. We incorporate these effects in a forward model relating yield to seismo-acoustic motion, and use Bayesian inference to estimate yield given recordings of the seismo-acoustic wavefield. The Bayesian approach to this inverse problem allows us to obtain the probability distribution of plausible yield values and thus quantify the uncertainty in the yield estimate. Moreover, the sensitivity of the acoustic signal falls as a function of the depth-of-burial, while the opposite relationship holds for the seismic signal. Therefore, using both the acoustic and seismic wavefield data allows us to avoid the trade-offs associated with using only one of these signals alone. In addition, our inference framework allows for correlated features of the same data type (seismic or acoustic) to be incorporated in the estimation of yield in order to make use of as much information from the same waveform as possible. We demonstrate our approach with a historical dataset and a contemporary field experiment.
Capturing changes in flood risk with Bayesian approaches for flood damage assessment
NASA Astrophysics Data System (ADS)
Vogel, Kristin; Schröter, Kai; Kreibich, Heidi; Thieken, Annegret; Müller, Meike; Sieg, Tobias; Laudan, Jonas; Kienzler, Sarah; Weise, Laura; Merz, Bruno; Scherbaum, Frank
2016-04-01
Flood risk is a function of hazard as well as of exposure and vulnerability. All three components are under change over space and time and have to be considered for reliable damage estimations and risk analyses, since this is the basis for an efficient, adaptable risk management. Hitherto, models for estimating flood damage are comparatively simple and cannot sufficiently account for changing conditions. The Bayesian network approach allows for a multivariate modeling of complex systems without relying on expert knowledge about physical constraints. In a Bayesian network each model component is considered to be a random variable. The way of interactions between those variables can be learned from observations or be defined by expert knowledge. Even a combination of both is possible. Moreover, the probabilistic framework captures uncertainties related to the prediction and provides a probability distribution for the damage instead of a point estimate. The graphical representation of Bayesian networks helps to study the change of probabilities for changing circumstances and may thus simplify the communication between scientists and public authorities. In the framework of the DFG-Research Training Group "NatRiskChange" we aim to develop Bayesian networks for flood damage and vulnerability assessments of residential buildings and companies under changing conditions. A Bayesian network learned from data, collected over the last 15 years in flooded regions in the Elbe and Danube catchments (Germany), reveals the impact of many variables like building characteristics, precaution and warning situation on flood damage to residential buildings. While the handling of incomplete and hybrid (discrete mixed with continuous) data are the most challenging issues in the study on residential buildings, a similar study, that focuses on the vulnerability of small to medium sized companies, bears new challenges. Relying on a much smaller data set for the determination of the model parameters, overly complex models should be avoided. A so called Markov Blanket approach aims at the identification of the most relevant factors and constructs a Bayesian network based on those findings. With our approach we want to exploit a major advantage of Bayesian networks which is their ability to consider dependencies not only pairwise, but to capture the joint effects and interactions of driving forces. Hence, the flood damage network does not only show the impact of precaution on the building damage separately, but also reveals the mutual effects of precaution and the quality of warning for a variety of flood settings. Thus, it allows for a consideration of changing conditions and different courses of action and forms a novel and valuable tool for decision support. This study is funded by the Deutsche Forschungsgemeinschaft (DFG) within the research training program GRK 2043/1 "NatRiskChange - Natural hazards and risks in a changing world" at the University of Potsdam.
Logarithmic Laplacian Prior Based Bayesian Inverse Synthetic Aperture Radar Imaging.
Zhang, Shuanghui; Liu, Yongxiang; Li, Xiang; Bi, Guoan
2016-04-28
This paper presents a novel Inverse Synthetic Aperture Radar Imaging (ISAR) algorithm based on a new sparse prior, known as the logarithmic Laplacian prior. The newly proposed logarithmic Laplacian prior has a narrower main lobe with higher tail values than the Laplacian prior, which helps to achieve performance improvement on sparse representation. The logarithmic Laplacian prior is used for ISAR imaging within the Bayesian framework to achieve better focused radar image. In the proposed method of ISAR imaging, the phase errors are jointly estimated based on the minimum entropy criterion to accomplish autofocusing. The maximum a posterior (MAP) estimation and the maximum likelihood estimation (MLE) are utilized to estimate the model parameters to avoid manually tuning process. Additionally, the fast Fourier Transform (FFT) and Hadamard product are used to minimize the required computational efficiency. Experimental results based on both simulated and measured data validate that the proposed algorithm outperforms the traditional sparse ISAR imaging algorithms in terms of resolution improvement and noise suppression.
A Bayesian Framework of Uncertainties Integration in 3D Geological Model
NASA Astrophysics Data System (ADS)
Liang, D.; Liu, X.
2017-12-01
3D geological model can describe complicated geological phenomena in an intuitive way while its application may be limited by uncertain factors. Great progress has been made over the years, lots of studies decompose the uncertainties of geological model to analyze separately, while ignored the comprehensive impacts of multi-source uncertainties. Great progress has been made over the years, while lots of studies ignored the comprehensive impacts of multi-source uncertainties when analyzed them item by item from each source. To evaluate the synthetical uncertainty, we choose probability distribution to quantify uncertainty, and propose a bayesian framework of uncertainties integration. With this framework, we integrated data errors, spatial randomness, and cognitive information into posterior distribution to evaluate synthetical uncertainty of geological model. Uncertainties propagate and cumulate in modeling process, the gradual integration of multi-source uncertainty is a kind of simulation of the uncertainty propagation. Bayesian inference accomplishes uncertainty updating in modeling process. Maximum entropy principle makes a good effect on estimating prior probability distribution, which ensures the prior probability distribution subjecting to constraints supplied by the given information with minimum prejudice. In the end, we obtained a posterior distribution to evaluate synthetical uncertainty of geological model. This posterior distribution represents the synthetical impact of all the uncertain factors on the spatial structure of geological model. The framework provides a solution to evaluate synthetical impact on geological model of multi-source uncertainties and a thought to study uncertainty propagation mechanism in geological modeling.
A program for the Bayesian Neural Network in the ROOT framework
NASA Astrophysics Data System (ADS)
Zhong, Jiahang; Huang, Run-Sheng; Lee, Shih-Chang
2011-12-01
We present a Bayesian Neural Network algorithm implemented in the TMVA package (Hoecker et al., 2007 [1]), within the ROOT framework (Brun and Rademakers, 1997 [2]). Comparing to the conventional utilization of Neural Network as discriminator, this new implementation has more advantages as a non-parametric regression tool, particularly for fitting probabilities. It provides functionalities including cost function selection, complexity control and uncertainty estimation. An example of such application in High Energy Physics is shown. The algorithm is available with ROOT release later than 5.29. Program summaryProgram title: TMVA-BNN Catalogue identifier: AEJX_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEJX_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: BSD license No. of lines in distributed program, including test data, etc.: 5094 No. of bytes in distributed program, including test data, etc.: 1,320,987 Distribution format: tar.gz Programming language: C++ Computer: Any computer system or cluster with C++ compiler and UNIX-like operating system Operating system: Most UNIX/Linux systems. The application programs were thoroughly tested under Fedora and Scientific Linux CERN. Classification: 11.9 External routines: ROOT package version 5.29 or higher ( http://root.cern.ch) Nature of problem: Non-parametric fitting of multivariate distributions Solution method: An implementation of Neural Network following the Bayesian statistical interpretation. Uses Laplace approximation for the Bayesian marginalizations. Provides the functionalities of automatic complexity control and uncertainty estimation. Running time: Time consumption for the training depends substantially on the size of input sample, the NN topology, the number of training iterations, etc. For the example in this manuscript, about 7 min was used on a PC/Linux with 2.0 GHz processors.
Sharmin, Sifat; Glass, Kathryn; Viennet, Elvina; Harley, David
2018-04-01
Determining the relation between climate and dengue incidence is challenging due to under-reporting of disease and consequent biased incidence estimates. Non-linear associations between climate and incidence compound this. Here, we introduce a modelling framework to estimate dengue incidence from passive surveillance data while incorporating non-linear climate effects. We estimated the true number of cases per month using a Bayesian generalised linear model, developed in stages to adjust for under-reporting. A semi-parametric thin-plate spline approach was used to quantify non-linear climate effects. The approach was applied to data collected from the national dengue surveillance system of Bangladesh. The model estimated that only 2.8% (95% credible interval 2.7-2.8) of all cases in the capital Dhaka were reported through passive case reporting. The optimal mean monthly temperature for dengue transmission is 29℃ and average monthly rainfall above 15 mm decreases transmission. Our approach provides an estimate of true incidence and an understanding of the effects of temperature and rainfall on dengue transmission in Dhaka, Bangladesh.
Inference of reactive transport model parameters using a Bayesian multivariate approach
NASA Astrophysics Data System (ADS)
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
Mishra, Arabinda; Anderson, Adam W; Wu, Xi; Gore, John C; Ding, Zhaohua
2010-08-01
The purpose of this work is to design a neuronal fiber tracking algorithm, which will be more suitable for reconstruction of fibers associated with functionally important regions in the human brain. The functional activations in the brain normally occur in the gray matter regions. Hence the fibers bordering these regions are weakly myelinated, resulting in poor performance of conventional tractography methods to trace the fiber links between them. A lower fractional anisotropy in this region makes it even difficult to track the fibers in the presence of noise. In this work, the authors focused on a stochastic approach to reconstruct these fiber pathways based on a Bayesian regularization framework. To estimate the true fiber direction (propagation vector), the a priori and conditional probability density functions are calculated in advance and are modeled as multivariate normal. The variance of the estimated tensor element vector is associated with the uncertainty due to noise and partial volume averaging (PVA). An adaptive and multiple sampling of the estimated tensor element vector, which is a function of the pre-estimated variance, overcomes the effect of noise and PVA in this work. The algorithm has been rigorously tested using a variety of synthetic data sets. The quantitative comparison of the results to standard algorithms motivated the authors to implement it for in vivo DTI data analysis. The algorithm has been implemented to delineate fibers in two major language pathways (Broca's to SMA and Broca's to Wernicke's) across 12 healthy subjects. Though the mean of standard deviation was marginally bigger than conventional (Euler's) approach [P. J. Basser et al., "In vivo fiber tractography using DT-MRI data," Magn. Reson. Med. 44(4), 625-632 (2000)], the number of extracted fibers in this approach was significantly higher. The authors also compared the performance of the proposed method to Lu's method [Y. Lu et al., "Improved fiber tractography with Bayesian tensor regularization," Neuroimage 31(3), 1061-1074 (2006)] and Friman's stochastic approach [O. Friman et al., "A Bayesian approach for stochastic white matter tractography," IEEE Trans. Med. Imaging 25(8), 965-978 (2006)]. Overall performance of the approach is found to be superior to above two methods, particularly when the signal-to-noise ratio was low. The authors observed that an adaptive sampling of the tensor element vectors, estimated as a function of the variance in a Bayesian framework, can effectively delineate neuronal fibers to analyze the structure-function relationship in human brain. The simulated and in vivo results are in good agreement with the theoretical aspects of the algorithm.
Tseng, Shu-Ping; Li, Shou-Hsien; Hsieh, Chia-Hung; Wang, Hurng-Yi; Lin, Si-Min
2014-10-01
Dating the time of divergence and understanding speciation processes are central to the study of the evolutionary history of organisms but are notoriously difficult. The difficulty is largely rooted in variations in the ancestral population size or in the genealogy variation across loci. To depict the speciation processes and divergence histories of three monophyletic Takydromus species endemic to Taiwan, we sequenced 20 nuclear loci and combined with one mitochondrial locus published in GenBank. They were analysed by a multispecies coalescent approach within a Bayesian framework. Divergence dating based on the gene tree approach showed high variation among loci, and the divergence was estimated at an earlier date than when derived by the species-tree approach. To test whether variations in the ancestral population size accounted for the majority of this variation, we conducted computer inferences using isolation-with-migration (IM) and approximate Bayesian computation (ABC) frameworks. The results revealed that gene flow during the early stage of speciation was strongly favoured over the isolation model, and the initiation of the speciation process was far earlier than the dates estimated by gene- and species-based divergence dating. Due to their limited dispersal ability, it is suggested that geographical isolation may have played a major role in the divergence of these Takydromus species. Nevertheless, this study reveals a more complex situation and demonstrates that gene flow during the speciation process cannot be overlooked and may have a great impact on divergence dating. By using multilocus data and incorporating Bayesian coalescence approaches, we provide a more biologically realistic framework for delineating the divergence history of Takydromus. © 2014 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Ndu, Obibobi Kamtochukwu
To ensure that estimates of risk and reliability inform design and resource allocation decisions in the development of complex engineering systems, early engagement in the design life cycle is necessary. An unfortunate constraint on the accuracy of such estimates at this stage of concept development is the limited amount of high fidelity design and failure information available on the actual system under development. Applying the human ability to learn from experience and augment our state of knowledge to evolve better solutions mitigates this limitation. However, the challenge lies in formalizing a methodology that takes this highly abstract, but fundamentally human cognitive, ability and extending it to the field of risk analysis while maintaining the tenets of generalization, Bayesian inference, and probabilistic risk analysis. We introduce an integrated framework for inferring the reliability, or other probabilistic measures of interest, of a new system or a conceptual variant of an existing system. Abstractly, our framework is based on learning from the performance of precedent designs and then applying the acquired knowledge, appropriately adjusted based on degree of relevance, to the inference process. This dissertation presents a method for inferring properties of the conceptual variant using a pseudo-spatial model that describes the spatial configuration of the family of systems to which the concept belongs. Through non-metric multidimensional scaling, we formulate the pseudo-spatial model based on rank-ordered subjective expert perception of design similarity between systems that elucidate the psychological space of the family. By a novel extension of Kriging methods for analysis of geospatial data to our "pseudo-space of comparable engineered systems", we develop a Bayesian inference model that allows prediction of the probabilistic measure of interest.
Wang, Xulong; Philip, Vivek M.; Ananda, Guruprasad; White, Charles C.; Malhotra, Ankit; Michalski, Paul J.; Karuturi, Krishna R. Murthy; Chintalapudi, Sumana R.; Acklin, Casey; Sasner, Michael; Bennett, David A.; De Jager, Philip L.; Howell, Gareth R.; Carter, Gregory W.
2018-01-01
Recent technical and methodological advances have greatly enhanced genome-wide association studies (GWAS). The advent of low-cost, whole-genome sequencing facilitates high-resolution variant identification, and the development of linear mixed models (LMM) allows improved identification of putatively causal variants. While essential for correcting false positive associations due to sample relatedness and population stratification, LMMs have commonly been restricted to quantitative variables. However, phenotypic traits in association studies are often categorical, coded as binary case-control or ordered variables describing disease stages. To address these issues, we have devised a method for genomic association studies that implements a generalized LMM (GLMM) in a Bayesian framework, called Bayes-GLMM. Bayes-GLMM has four major features: (1) support of categorical, binary, and quantitative variables; (2) cohesive integration of previous GWAS results for related traits; (3) correction for sample relatedness by mixed modeling; and (4) model estimation by both Markov chain Monte Carlo sampling and maximal likelihood estimation. We applied Bayes-GLMM to the whole-genome sequencing cohort of the Alzheimer’s Disease Sequencing Project. This study contains 570 individuals from 111 families, each with Alzheimer’s disease diagnosed at one of four confidence levels. Using Bayes-GLMM we identified four variants in three loci significantly associated with Alzheimer’s disease. Two variants, rs140233081 and rs149372995, lie between PRKAR1B and PDGFA. The coded proteins are localized to the glial-vascular unit, and PDGFA transcript levels are associated with Alzheimer’s disease-related neuropathology. In summary, this work provides implementation of a flexible, generalized mixed-model approach in a Bayesian framework for association studies. PMID:29507048
Bucci, Melanie E.; Callahan, Peggy; Koprowski, John L.; Polfus, Jean L.; Krausman, Paul R.
2015-01-01
Stable isotope analysis of diet has become a common tool in conservation research. However, the multiple sources of uncertainty inherent in this analysis framework involve consequences that have not been thoroughly addressed. Uncertainty arises from the choice of trophic discrimination factors, and for Bayesian stable isotope mixing models (SIMMs), the specification of prior information; the combined effect of these aspects has not been explicitly tested. We used a captive feeding study of gray wolves (Canis lupus) to determine the first experimentally-derived trophic discrimination factors of C and N for this large carnivore of broad conservation interest. Using the estimated diet in our controlled system and data from a published study on wild wolves and their prey in Montana, USA, we then investigated the simultaneous effect of discrimination factors and prior information on diet reconstruction with Bayesian SIMMs. Discrimination factors for gray wolves and their prey were 1.97‰ for δ13C and 3.04‰ for δ15N. Specifying wolf discrimination factors, as opposed to the commonly used red fox (Vulpes vulpes) factors, made little practical difference to estimates of wolf diet, but prior information had a strong effect on bias, precision, and accuracy of posterior estimates. Without specifying prior information in our Bayesian SIMM, it was not possible to produce SIMM posteriors statistically similar to the estimated diet in our controlled study or the diet of wild wolves. Our study demonstrates the critical effect of prior information on estimates of animal diets using Bayesian SIMMs, and suggests species-specific trophic discrimination factors are of secondary importance. When using stable isotope analysis to inform conservation decisions researchers should understand the limits of their data. It may be difficult to obtain useful information from SIMMs if informative priors are omitted and species-specific discrimination factors are unavailable. PMID:25803664
Derbridge, Jonathan J; Merkle, Jerod A; Bucci, Melanie E; Callahan, Peggy; Koprowski, John L; Polfus, Jean L; Krausman, Paul R
2015-01-01
Stable isotope analysis of diet has become a common tool in conservation research. However, the multiple sources of uncertainty inherent in this analysis framework involve consequences that have not been thoroughly addressed. Uncertainty arises from the choice of trophic discrimination factors, and for Bayesian stable isotope mixing models (SIMMs), the specification of prior information; the combined effect of these aspects has not been explicitly tested. We used a captive feeding study of gray wolves (Canis lupus) to determine the first experimentally-derived trophic discrimination factors of C and N for this large carnivore of broad conservation interest. Using the estimated diet in our controlled system and data from a published study on wild wolves and their prey in Montana, USA, we then investigated the simultaneous effect of discrimination factors and prior information on diet reconstruction with Bayesian SIMMs. Discrimination factors for gray wolves and their prey were 1.97‰ for δ13C and 3.04‰ for δ15N. Specifying wolf discrimination factors, as opposed to the commonly used red fox (Vulpes vulpes) factors, made little practical difference to estimates of wolf diet, but prior information had a strong effect on bias, precision, and accuracy of posterior estimates. Without specifying prior information in our Bayesian SIMM, it was not possible to produce SIMM posteriors statistically similar to the estimated diet in our controlled study or the diet of wild wolves. Our study demonstrates the critical effect of prior information on estimates of animal diets using Bayesian SIMMs, and suggests species-specific trophic discrimination factors are of secondary importance. When using stable isotope analysis to inform conservation decisions researchers should understand the limits of their data. It may be difficult to obtain useful information from SIMMs if informative priors are omitted and species-specific discrimination factors are unavailable.
NASA Astrophysics Data System (ADS)
Freni, Gabriele; Mannina, Giorgio
In urban drainage modelling, uncertainty analysis is of undoubted necessity. However, uncertainty analysis in urban water-quality modelling is still in its infancy and only few studies have been carried out. Therefore, several methodological aspects still need to be experienced and clarified especially regarding water quality modelling. The use of the Bayesian approach for uncertainty analysis has been stimulated by its rigorous theoretical framework and by the possibility of evaluating the impact of new knowledge on the modelling predictions. Nevertheless, the Bayesian approach relies on some restrictive hypotheses that are not present in less formal methods like the Generalised Likelihood Uncertainty Estimation (GLUE). One crucial point in the application of Bayesian method is the formulation of a likelihood function that is conditioned by the hypotheses made regarding model residuals. Statistical transformations, such as the use of Box-Cox equation, are generally used to ensure the homoscedasticity of residuals. However, this practice may affect the reliability of the analysis leading to a wrong uncertainty estimation. The present paper aims to explore the influence of the Box-Cox equation for environmental water quality models. To this end, five cases were considered one of which was the “real” residuals distributions (i.e. drawn from available data). The analysis was applied to the Nocella experimental catchment (Italy) which is an agricultural and semi-urbanised basin where two sewer systems, two wastewater treatment plants and a river reach were monitored during both dry and wet weather periods. The results show that the uncertainty estimation is greatly affected by residual transformation and a wrong assumption may also affect the evaluation of model uncertainty. The use of less formal methods always provide an overestimation of modelling uncertainty with respect to Bayesian method but such effect is reduced if a wrong assumption is made regarding the residuals distribution. If residuals are not normally distributed, the uncertainty is over-estimated if Box-Cox transformation is not applied or non-calibrated parameter is used.
Bayesian phylogenetic estimation of fossil ages.
Drummond, Alexei J; Stadler, Tanja
2016-07-19
Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth-death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the 'morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses.This article is part of the themed issue 'Dating species divergences using rocks and clocks'. © 2016 The Authors.
Bayesian phylogenetic estimation of fossil ages
Drummond, Alexei J.; Stadler, Tanja
2016-01-01
Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth–death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the ‘morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’. PMID:27325827
Integrated survival analysis using an event-time approach in a Bayesian framework
Walsh, Daniel P.; Dreitz, VJ; Heisey, Dennis M.
2015-01-01
Event-time or continuous-time statistical approaches have been applied throughout the biostatistical literature and have led to numerous scientific advances. However, these techniques have traditionally relied on knowing failure times. This has limited application of these analyses, particularly, within the ecological field where fates of marked animals may be unknown. To address these limitations, we developed an integrated approach within a Bayesian framework to estimate hazard rates in the face of unknown fates. We combine failure/survival times from individuals whose fates are known and times of which are interval-censored with information from those whose fates are unknown, and model the process of detecting animals with unknown fates. This provides the foundation for our integrated model and permits necessary parameter estimation. We provide the Bayesian model, its derivation, and use simulation techniques to investigate the properties and performance of our approach under several scenarios. Lastly, we apply our estimation technique using a piece-wise constant hazard function to investigate the effects of year, age, chick size and sex, sex of the tending adult, and nesting habitat on mortality hazard rates of the endangered mountain plover (Charadrius montanus) chicks. Traditional models were inappropriate for this analysis because fates of some individual chicks were unknown due to failed radio transmitters. Simulations revealed biases of posterior mean estimates were minimal (≤ 4.95%), and posterior distributions behaved as expected with RMSE of the estimates decreasing as sample sizes, detection probability, and survival increased. We determined mortality hazard rates for plover chicks were highest at <5 days old and were lower for chicks with larger birth weights and/or whose nest was within agricultural habitats. Based on its performance, our approach greatly expands the range of problems for which event-time analyses can be used by eliminating the need for having completely known fate data.
Integrated survival analysis using an event-time approach in a Bayesian framework.
Walsh, Daniel P; Dreitz, Victoria J; Heisey, Dennis M
2015-02-01
Event-time or continuous-time statistical approaches have been applied throughout the biostatistical literature and have led to numerous scientific advances. However, these techniques have traditionally relied on knowing failure times. This has limited application of these analyses, particularly, within the ecological field where fates of marked animals may be unknown. To address these limitations, we developed an integrated approach within a Bayesian framework to estimate hazard rates in the face of unknown fates. We combine failure/survival times from individuals whose fates are known and times of which are interval-censored with information from those whose fates are unknown, and model the process of detecting animals with unknown fates. This provides the foundation for our integrated model and permits necessary parameter estimation. We provide the Bayesian model, its derivation, and use simulation techniques to investigate the properties and performance of our approach under several scenarios. Lastly, we apply our estimation technique using a piece-wise constant hazard function to investigate the effects of year, age, chick size and sex, sex of the tending adult, and nesting habitat on mortality hazard rates of the endangered mountain plover (Charadrius montanus) chicks. Traditional models were inappropriate for this analysis because fates of some individual chicks were unknown due to failed radio transmitters. Simulations revealed biases of posterior mean estimates were minimal (≤ 4.95%), and posterior distributions behaved as expected with RMSE of the estimates decreasing as sample sizes, detection probability, and survival increased. We determined mortality hazard rates for plover chicks were highest at <5 days old and were lower for chicks with larger birth weights and/or whose nest was within agricultural habitats. Based on its performance, our approach greatly expands the range of problems for which event-time analyses can be used by eliminating the need for having completely known fate data.
Displacement data assimilation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rosenthal, W. Steven; Venkataramani, Shankar; Mariano, Arthur J.
We show that modifying a Bayesian data assimilation scheme by incorporating kinematically-consistent displacement corrections produces a scheme that is demonstrably better at estimating partially observed state vectors in a setting where feature information is important. While the displacement transformation is generic, here we implement it within an ensemble Kalman Filter framework and demonstrate its effectiveness in tracking stochastically perturbed vortices.
Whose statistical reasoning is facilitated by a causal structure intervention?
McNair, Simon; Feeney, Aidan
2015-02-01
People often struggle when making Bayesian probabilistic estimates on the basis of competing sources of statistical evidence. Recently, Krynski and Tenenbaum (Journal of Experimental Psychology: General, 136, 430-450, 2007) proposed that a causal Bayesian framework accounts for peoples' errors in Bayesian reasoning and showed that, by clarifying the causal relations among the pieces of evidence, judgments on a classic statistical reasoning problem could be significantly improved. We aimed to understand whose statistical reasoning is facilitated by the causal structure intervention. In Experiment 1, although we observed causal facilitation effects overall, the effect was confined to participants high in numeracy. We did not find an overall facilitation effect in Experiment 2 but did replicate the earlier interaction between numerical ability and the presence or absence of causal content. This effect held when we controlled for general cognitive ability and thinking disposition. Our results suggest that clarifying causal structure facilitates Bayesian judgments, but only for participants with sufficient understanding of basic concepts in probability and statistics.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
NASA Astrophysics Data System (ADS)
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
Meta-analysis of the effect of natural frequencies on Bayesian reasoning.
McDowell, Michelle; Jacobs, Perke
2017-12-01
The natural frequency facilitation effect describes the finding that people are better able to solve descriptive Bayesian inference tasks when represented as joint frequencies obtained through natural sampling, known as natural frequencies, than as conditional probabilities. The present meta-analysis reviews 20 years of research seeking to address when, why, and for whom natural frequency formats are most effective. We review contributions from research associated with the 2 dominant theoretical perspectives, the ecological rationality framework and nested-sets theory, and test potential moderators of the effect. A systematic review of relevant literature yielded 35 articles representing 226 performance estimates. These estimates were statistically integrated using a bivariate mixed-effects model that yields summary estimates of average performances across the 2 formats and estimates of the effects of different study characteristics on performance. These study characteristics range from moderators representing individual characteristics (e.g., numeracy, expertise), to methodological differences (e.g., use of incentives, scoring criteria) and features of problem representation (e.g., short menu format, visual aid). Short menu formats (less computationally complex representations showing joint-events) and visual aids demonstrated some of the strongest moderation effects, improving performance for both conditional probability and natural frequency formats. A number of methodological factors (e.g., exposure to both problem formats) were also found to affect performance rates, emphasizing the importance of a systematic approach. We suggest how research on Bayesian reasoning can be strengthened by broadening the definition of successful Bayesian reasoning to incorporate choice and process and by applying different research methodologies. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
CytoBayesJ: software tools for Bayesian analysis of cytogenetic radiation dosimetry data.
Ainsbury, Elizabeth A; Vinnikov, Volodymyr; Puig, Pedro; Maznyk, Nataliya; Rothkamm, Kai; Lloyd, David C
2013-08-30
A number of authors have suggested that a Bayesian approach may be most appropriate for analysis of cytogenetic radiation dosimetry data. In the Bayesian framework, probability of an event is described in terms of previous expectations and uncertainty. Previously existing, or prior, information is used in combination with experimental results to infer probabilities or the likelihood that a hypothesis is true. It has been shown that the Bayesian approach increases both the accuracy and quality assurance of radiation dose estimates. New software entitled CytoBayesJ has been developed with the aim of bringing Bayesian analysis to cytogenetic biodosimetry laboratory practice. CytoBayesJ takes a number of Bayesian or 'Bayesian like' methods that have been proposed in the literature and presents them to the user in the form of simple user-friendly tools, including testing for the most appropriate model for distribution of chromosome aberrations and calculations of posterior probability distributions. The individual tools are described in detail and relevant examples of the use of the methods and the corresponding CytoBayesJ software tools are given. In this way, the suitability of the Bayesian approach to biological radiation dosimetry is highlighted and its wider application encouraged by providing a user-friendly software interface and manual in English and Russian. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Koch, Wolfgang
1996-05-01
Sensor data processing in a dense target/dense clutter environment is inevitably confronted with data association conflicts which correspond with the multiple hypothesis character of many modern approaches (MHT: multiple hypothesis tracking). In this paper we analyze the efficiency of retrodictive techniques that generalize standard fixed interval smoothing to MHT applications. 'Delayed estimation' based on retrodiction provides uniquely interpretable and accurate trajectories from ambiguous MHT output if a certain time delay is tolerated. In a Bayesian framework the theoretical background of retrodiction and its intimate relation to Bayesian MHT is sketched. By a simulated example with two closely-spaced targets, relatively low detection probabilities, and rather high false return densities, we demonstrate the benefits of retrodiction and quantitatively discuss the achievable track accuracies and the time delays involved for typical radar parameters.
Lawson, Andrew B; Choi, Jungsoon; Cai, Bo; Hossain, Monir; Kirby, Russell S; Liu, Jihong
2012-09-01
We develop a new Bayesian two-stage space-time mixture model to investigate the effects of air pollution on asthma. The two-stage mixture model proposed allows for the identification of temporal latent structure as well as the estimation of the effects of covariates on health outcomes. In the paper, we also consider spatial misalignment of exposure and health data. A simulation study is conducted to assess the performance of the 2-stage mixture model. We apply our statistical framework to a county-level ambulatory care asthma data set in the US state of Georgia for the years 1999-2008.
NASA Astrophysics Data System (ADS)
Sahai, Swupnil
This thesis includes three parts. The overarching theme is how to analyze structured hierarchical data, with applications to astronomy and sociology. The first part discusses how expectation propagation can be used to parallelize the computation when fitting big hierarchical bayesian models. This methodology is then used to fit a novel, nonlinear mixture model to ultraviolet radiation from various regions of the observable universe. The second part discusses how the Stan probabilistic programming language can be used to numerically integrate terms in a hierarchical bayesian model. This technique is demonstrated on supernovae data to significantly speed up convergence to the posterior distribution compared to a previous study that used a Gibbs-type sampler. The third part builds a formal latent kernel representation for aggregate relational data as a way to more robustly estimate the mixing characteristics of agents in a network. In particular, the framework is applied to sociology surveys to estimate, as a function of ego age, the age and sex composition of the personal networks of individuals in the United States.
Markov Chain Monte Carlo Inference of Parametric Dictionaries for Sparse Bayesian Approximations
Chaspari, Theodora; Tsiartas, Andreas; Tsilifis, Panagiotis; Narayanan, Shrikanth
2016-01-01
Parametric dictionaries can increase the ability of sparse representations to meaningfully capture and interpret the underlying signal information, such as encountered in biomedical problems. Given a mapping function from the atom parameter space to the actual atoms, we propose a sparse Bayesian framework for learning the atom parameters, because of its ability to provide full posterior estimates, take uncertainty into account and generalize on unseen data. Inference is performed with Markov Chain Monte Carlo, that uses block sampling to generate the variables of the Bayesian problem. Since the parameterization of dictionary atoms results in posteriors that cannot be analytically computed, we use a Metropolis-Hastings-within-Gibbs framework, according to which variables with closed-form posteriors are generated with the Gibbs sampler, while the remaining ones with the Metropolis Hastings from appropriate candidate-generating densities. We further show that the corresponding Markov Chain is uniformly ergodic ensuring its convergence to a stationary distribution independently of the initial state. Results on synthetic data and real biomedical signals indicate that our approach offers advantages in terms of signal reconstruction compared to previously proposed Steepest Descent and Equiangular Tight Frame methods. This paper demonstrates the ability of Bayesian learning to generate parametric dictionaries that can reliably represent the exemplar data and provides the foundation towards inferring the entire variable set of the sparse approximation problem for signal denoising, adaptation and other applications. PMID:28649173
NASA Astrophysics Data System (ADS)
Kim, Jin-Young; Kwon, Hyun-Han; Kim, Hung-Soo
2015-04-01
The existing regional frequency analysis has disadvantages in that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a hierarchical Bayesian model based nonstationary regional frequency analysis in that spatial patterns of the design rainfall with geographical information (e.g. latitude, longitude and altitude) are explicitly incorporated. This study assumes that the parameters of Gumbel (or GEV distribution) are a function of geographical characteristics within a general linear regression framework. Posterior distribution of the regression parameters are estimated by Bayesian Markov Chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the distributions by using digital elevation models (DEM) as inputs. The proposed model is applied to derive design rainfalls over the entire Han-river watershed. It was found that the proposed Bayesian regional frequency analysis model showed similar results compared to L-moment based regional frequency analysis. In addition, the model showed an advantage in terms of quantifying uncertainty of the design rainfall and estimating the area rainfall considering geographical information. Finally, comprehensive discussion on design rainfall in the context of nonstationary will be presented. KEYWORDS: Regional frequency analysis, Nonstationary, Spatial information, Bayesian Acknowledgement This research was supported by a grant (14AWMP-B082564-01) from Advanced Water Management Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government.
NASA Astrophysics Data System (ADS)
Raj, Rahul; van der Tol, Christiaan; Hamm, Nicholas Alexander Samuel; Stein, Alfred
2018-01-01
Parameters of a process-based forest growth simulator are difficult or impossible to obtain from field observations. Reliable estimates can be obtained using calibration against observations of output and state variables. In this study, we present a Bayesian framework to calibrate the widely used process-based simulator Biome-BGC against estimates of gross primary production (GPP) data. We used GPP partitioned from flux tower measurements of a net ecosystem exchange over a 55-year-old Douglas fir stand as an example. The uncertainties of both the Biome-BGC parameters and the simulated GPP values were estimated. The calibrated parameters leaf and fine root turnover (LFRT), ratio of fine root carbon to leaf carbon (FRC : LC), ratio of carbon to nitrogen in leaf (C : Nleaf), canopy water interception coefficient (Wint), fraction of leaf nitrogen in RuBisCO (FLNR), and effective soil rooting depth (SD) characterize the photosynthesis and carbon and nitrogen allocation in the forest. The calibration improved the root mean square error and enhanced Nash-Sutcliffe efficiency between simulated and flux tower daily GPP compared to the uncalibrated Biome-BGC. Nevertheless, the seasonal cycle for flux tower GPP was not reproduced exactly and some overestimation in spring and underestimation in summer remained after calibration. We hypothesized that the phenology exhibited a seasonal cycle that was not accurately reproduced by the simulator. We investigated this by calibrating the Biome-BGC to each month's flux tower GPP separately. As expected, the simulated GPP improved, but the calibrated parameter values suggested that the seasonal cycle of state variables in the simulator could be improved. It was concluded that the Bayesian framework for calibration can reveal features of the modelled physical processes and identify aspects of the process simulator that are too rigid.
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
NASA Astrophysics Data System (ADS)
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Bayesian Estimation and Inference Using Stochastic Electronics
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M.; Hamilton, Tara J.; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xuesong; Liang, Faming; Yu, Beibei
2011-11-09
Estimating uncertainty of hydrologic forecasting is valuable to water resources and other relevant decision making processes. Recently, Bayesian Neural Networks (BNNs) have been proved powerful tools for quantifying uncertainty of streamflow forecasting. In this study, we propose a Markov Chain Monte Carlo (MCMC) framework to incorporate the uncertainties associated with input, model structure, and parameter into BNNs. This framework allows the structure of the neural networks to change by removing or adding connections between neurons and enables scaling of input data by using rainfall multipliers. The results show that the new BNNs outperform the BNNs that only consider uncertainties associatedmore » with parameter and model structure. Critical evaluation of posterior distribution of neural network weights, number of effective connections, rainfall multipliers, and hyper-parameters show that the assumptions held in our BNNs are not well supported. Further understanding of characteristics of different uncertainty sources and including output error into the MCMC framework are expected to enhance the application of neural networks for uncertainty analysis of hydrologic forecasting.« less
Shiklomanov, Alexey N.; Dietze, Michael C.; Viskari, Toni; ...
2016-06-09
The remote monitoring of plant canopies is critically needed for understanding of terrestrial ecosystem mechanics and biodiversity as well as capturing the short- to long-term responses of vegetation to disturbance and climate change. A variety of orbital, sub-orbital, and field instruments have been used to retrieve optical spectral signals and to study different vegetation properties such as plant biochemistry, nutrient cycling, physiology, water status, and stress. Radiative transfer models (RTMs) provide a mechanistic link between vegetation properties and observed spectral features, and RTM spectral inversion is a useful framework for estimating these properties from spectral data. However, existing approaches tomore » RTM spectral inversion are typically limited by the inability to characterize uncertainty in parameter estimates. Here, we introduce a Bayesian algorithm for the spectral inversion of the PROSPECT 5 leaf RTM that is distinct from past approaches in two important ways: First, the algorithm only uses reflectance and does not require transmittance observations, which have been plagued by a variety of measurement and equipment challenges. Second, the output is not a point estimate for each parameter but rather the joint probability distribution that includes estimates of parameter uncertainties and covariance structure. We validated our inversion approach using a database of leaf spectra together with measurements of equivalent water thickness (EWT) and leaf dry mass per unit area (LMA). The parameters estimated by our inversion were able to accurately reproduce the observed reflectance (RMSE VIS = 0.0063, RMSE NIR-SWIR = 0.0098) and transmittance (RMSE VIS = 0.0404, RMSE NIR-SWIR = 0.0551) for both broadleaved and conifer species. Inversion estimates of EWT and LMA for broadleaved species agreed well with direct measurements (CV EWT = 18.8%, CV LMA = 24.5%), while estimates for conifer species were less accurate (CV EWT = 53.2%, CV LMA = 63.3%). To examine the influence of spectral resolution on parameter uncertainty, we simulated leaf reflectance as observed by ten common remote sensing platforms with varying spectral configurations and performed a Bayesian inversion on the resulting spectra. We found that full-range hyperspectral platforms were able to retrieve all parameters accurately and precisely, while the parameter estimates of multispectral platforms were much less precise and prone to bias at high and low values. We also observed that variations in the width and location of spectral bands influenced the shape of the covariance structure of parameter estimates. Lastly, our Bayesian spectral inversion provides a powerful and versatile framework for future RTM development and single- and multi-instrumental remote sensing of vegetation.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shiklomanov, Alexey N.; Dietze, Michael C.; Viskari, Toni
The remote monitoring of plant canopies is critically needed for understanding of terrestrial ecosystem mechanics and biodiversity as well as capturing the short- to long-term responses of vegetation to disturbance and climate change. A variety of orbital, sub-orbital, and field instruments have been used to retrieve optical spectral signals and to study different vegetation properties such as plant biochemistry, nutrient cycling, physiology, water status, and stress. Radiative transfer models (RTMs) provide a mechanistic link between vegetation properties and observed spectral features, and RTM spectral inversion is a useful framework for estimating these properties from spectral data. However, existing approaches tomore » RTM spectral inversion are typically limited by the inability to characterize uncertainty in parameter estimates. Here, we introduce a Bayesian algorithm for the spectral inversion of the PROSPECT 5 leaf RTM that is distinct from past approaches in two important ways: First, the algorithm only uses reflectance and does not require transmittance observations, which have been plagued by a variety of measurement and equipment challenges. Second, the output is not a point estimate for each parameter but rather the joint probability distribution that includes estimates of parameter uncertainties and covariance structure. We validated our inversion approach using a database of leaf spectra together with measurements of equivalent water thickness (EWT) and leaf dry mass per unit area (LMA). The parameters estimated by our inversion were able to accurately reproduce the observed reflectance (RMSE VIS = 0.0063, RMSE NIR-SWIR = 0.0098) and transmittance (RMSE VIS = 0.0404, RMSE NIR-SWIR = 0.0551) for both broadleaved and conifer species. Inversion estimates of EWT and LMA for broadleaved species agreed well with direct measurements (CV EWT = 18.8%, CV LMA = 24.5%), while estimates for conifer species were less accurate (CV EWT = 53.2%, CV LMA = 63.3%). To examine the influence of spectral resolution on parameter uncertainty, we simulated leaf reflectance as observed by ten common remote sensing platforms with varying spectral configurations and performed a Bayesian inversion on the resulting spectra. We found that full-range hyperspectral platforms were able to retrieve all parameters accurately and precisely, while the parameter estimates of multispectral platforms were much less precise and prone to bias at high and low values. We also observed that variations in the width and location of spectral bands influenced the shape of the covariance structure of parameter estimates. Lastly, our Bayesian spectral inversion provides a powerful and versatile framework for future RTM development and single- and multi-instrumental remote sensing of vegetation.« less
A probabilistic model framework for evaluating year-to-year variation in crop productivity
NASA Astrophysics Data System (ADS)
Yokozawa, M.; Iizumi, T.; Tao, F.
2008-12-01
Most models describing the relation between crop productivity and weather condition have so far been focused on mean changes of crop yield. For keeping stable food supply against abnormal weather as well as climate change, evaluating the year-to-year variations in crop productivity rather than the mean changes is more essential. We here propose a new framework of probabilistic model based on Bayesian inference and Monte Carlo simulation. As an example, we firstly introduce a model on paddy rice production in Japan. It is called PRYSBI (Process- based Regional rice Yield Simulator with Bayesian Inference; Iizumi et al., 2008). The model structure is the same as that of SIMRIW, which was developed and used widely in Japan. The model includes three sub- models describing phenological development, biomass accumulation and maturing of rice crop. These processes are formulated to include response nature of rice plant to weather condition. This model inherently was developed to predict rice growth and yield at plot paddy scale. We applied it to evaluate the large scale rice production with keeping the same model structure. Alternatively, we assumed the parameters as stochastic variables. In order to let the model catch up actual yield at larger scale, model parameters were determined based on agricultural statistical data of each prefecture of Japan together with weather data averaged over the region. The posterior probability distribution functions (PDFs) of parameters included in the model were obtained using Bayesian inference. The MCMC (Markov Chain Monte Carlo) algorithm was conducted to numerically solve the Bayesian theorem. For evaluating the year-to-year changes in rice growth/yield under this framework, we firstly iterate simulations with set of parameter values sampled from the estimated posterior PDF of each parameter and then take the ensemble mean weighted with the posterior PDFs. We will also present another example for maize productivity in China. The framework proposed here provides us information on uncertainties, possibilities and limitations on future improvements in crop model as well.
Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S
2015-01-16
Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
Nagasaki, Masao; Yamaguchi, Rui; Yoshida, Ryo; Imoto, Seiya; Doi, Atsushi; Tamada, Yoshinori; Matsuno, Hiroshi; Miyano, Satoru; Higuchi, Tomoyuki
2006-01-01
We propose an automatic construction method of the hybrid functional Petri net as a simulation model of biological pathways. The problems we consider are how we choose the values of parameters and how we set the network structure. Usually, we tune these unknown factors empirically so that the simulation results are consistent with biological knowledge. Obviously, this approach has the limitation in the size of network of interest. To extend the capability of the simulation model, we propose the use of data assimilation approach that was originally established in the field of geophysical simulation science. We provide genomic data assimilation framework that establishes a link between our simulation model and observed data like microarray gene expression data by using a nonlinear state space model. A key idea of our genomic data assimilation is that the unknown parameters in simulation model are converted as the parameter of the state space model and the estimates are obtained as the maximum a posteriori estimators. In the parameter estimation process, the simulation model is used to generate the system model in the state space model. Such a formulation enables us to handle both the model construction and the parameter tuning within a framework of the Bayesian statistical inferences. In particular, the Bayesian approach provides us a way of controlling overfitting during the parameter estimations that is essential for constructing a reliable biological pathway. We demonstrate the effectiveness of our approach using synthetic data. As a result, parameter estimation using genomic data assimilation works very well and the network structure is suitably selected.
Uncertainty Quantification using Exponential Epi-Splines
2013-06-01
Leibler divergence. The choice of κ in applications can be informed by the fact that the Kullback - Leibler divergence between two normal densities, ϕ1... of ran- dom output quantities of interests. The framework systematically incorporates hard information derived from physics-based sensors, field test ... information , and determines the ‘best’ estimate within that family. Bayesian estima- tion makes use of prior soft information
A 3-Component Mixture of Rayleigh Distributions: Properties and Estimation in Bayesian Framework
Aslam, Muhammad; Tahir, Muhammad; Hussain, Zawar; Al-Zahrani, Bander
2015-01-01
To study lifetimes of certain engineering processes, a lifetime model which can accommodate the nature of such processes is desired. The mixture models of underlying lifetime distributions are intuitively more appropriate and appealing to model the heterogeneous nature of process as compared to simple models. This paper is about studying a 3-component mixture of the Rayleigh distributionsin Bayesian perspective. The censored sampling environment is considered due to its popularity in reliability theory and survival analysis. The expressions for the Bayes estimators and their posterior risks are derived under different scenarios. In case the case that no or little prior information is available, elicitation of hyperparameters is given. To examine, numerically, the performance of the Bayes estimators using non-informative and informative priors under different loss functions, we have simulated their statistical properties for different sample sizes and test termination times. In addition, to highlight the practical significance, an illustrative example based on a real-life engineering data is also given. PMID:25993475
Bayesian inversion analysis of nonlinear dynamics in surface heterogeneous reactions.
Omori, Toshiaki; Kuwatani, Tatsu; Okamoto, Atsushi; Hukushima, Koji
2016-09-01
It is essential to extract nonlinear dynamics from time-series data as an inverse problem in natural sciences. We propose a Bayesian statistical framework for extracting nonlinear dynamics of surface heterogeneous reactions from sparse and noisy observable data. Surface heterogeneous reactions are chemical reactions with conjugation of multiple phases, and they have the intrinsic nonlinearity of their dynamics caused by the effect of surface-area between different phases. We adapt a belief propagation method and an expectation-maximization (EM) algorithm to partial observation problem, in order to simultaneously estimate the time course of hidden variables and the kinetic parameters underlying dynamics. The proposed belief propagation method is performed by using sequential Monte Carlo algorithm in order to estimate nonlinear dynamical system. Using our proposed method, we show that the rate constants of dissolution and precipitation reactions, which are typical examples of surface heterogeneous reactions, as well as the temporal changes of solid reactants and products, were successfully estimated only from the observable temporal changes in the concentration of the dissolved intermediate product.
Sequential bearings-only-tracking initiation with particle filtering method.
Liu, Bin; Hao, Chengpeng
2013-01-01
The tracking initiation problem is examined in the context of autonomous bearings-only-tracking (BOT) of a single appearing/disappearing target in the presence of clutter measurements. In general, this problem suffers from a combinatorial explosion in the number of potential tracks resulted from the uncertainty in the linkage between the target and the measurement (a.k.a the data association problem). In addition, the nonlinear measurements lead to a non-Gaussian posterior probability density function (pdf) in the optimal Bayesian sequential estimation framework. The consequence of this nonlinear/non-Gaussian context is the absence of a closed-form solution. This paper models the linkage uncertainty and the nonlinear/non-Gaussian estimation problem jointly with solid Bayesian formalism. A particle filtering (PF) algorithm is derived for estimating the model's parameters in a sequential manner. Numerical results show that the proposed solution provides a significant benefit over the most commonly used methods, IPDA and IMMPDA. The posterior Cramér-Rao bounds are also involved for performance evaluation.
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Hu, Shiang; Yao, Dezhong; Valdes-Sosa, Pedro A.
2018-01-01
The choice of reference for the electroencephalogram (EEG) is a long-lasting unsolved issue resulting in inconsistent usages and endless debates. Currently, both the average reference (AR) and the reference electrode standardization technique (REST) are two primary, apparently irreconcilable contenders. We propose a theoretical framework to resolve this reference issue by formulating both (a) estimation of potentials at infinity, and (b) determination of the reference, as a unified Bayesian linear inverse problem, which can be solved by maximum a posterior estimation. We find that AR and REST are very particular cases of this unified framework: AR results from biophysically non-informative prior; while REST utilizes the prior based on the EEG generative model. To allow for simultaneous denoising and reference estimation, we develop the regularized versions of AR and REST, named rAR and rREST, respectively. Both depend on a regularization parameter that is the noise to signal variance ratio. Traditional and new estimators are evaluated with this framework, by both simulations and analysis of real resting EEGs. Toward this end, we leverage the MRI and EEG data from 89 subjects which participated in the Cuban Human Brain Mapping Project. Generated artificial EEGs—with a known ground truth, show that relative error in estimating the EEG potentials at infinity is lowest for rREST. It also reveals that realistic volume conductor models improve the performances of REST and rREST. Importantly, for practical applications, it is shown that an average lead field gives the results comparable to the individual lead field. Finally, it is shown that the selection of the regularization parameter with Generalized Cross-Validation (GCV) is close to the “oracle” choice based on the ground truth. When evaluated with the real 89 resting state EEGs, rREST consistently yields the lowest GCV. This study provides a novel perspective to the EEG reference problem by means of a unified inverse solution framework. It may allow additional principled theoretical formulations and numerical evaluation of performance. PMID:29780302
A comment on priors for Bayesian occupancy models.
Northrup, Joseph M; Gerber, Brian D
2018-01-01
Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are "uninformative" or "vague", such priors can easily be unintentionally highly informative. Here we report on how the specification of a "vague" normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts.
NASA Astrophysics Data System (ADS)
Wen, Fang-Qing; Zhang, Gong; Ben, De
2015-11-01
This paper addresses the direction of arrival (DOA) estimation problem for the co-located multiple-input multiple-output (MIMO) radar with random arrays. The spatially distributed sparsity of the targets in the background makes compressive sensing (CS) desirable for DOA estimation. A spatial CS framework is presented, which links the DOA estimation problem to support recovery from a known over-complete dictionary. A modified statistical model is developed to accurately represent the intra-block correlation of the received signal. A structural sparsity Bayesian learning algorithm is proposed for the sparse recovery problem. The proposed algorithm, which exploits intra-signal correlation, is capable being applied to limited data support and low signal-to-noise ratio (SNR) scene. Furthermore, the proposed algorithm has less computation load compared to the classical Bayesian algorithm. Simulation results show that the proposed algorithm has a more accurate DOA estimation than the traditional multiple signal classification (MUSIC) algorithm and other CS recovery algorithms. Project supported by the National Natural Science Foundation of China (Grant Nos. 61071163, 61271327, and 61471191), the Funding for Outstanding Doctoral Dissertation in Nanjing University of Aeronautics and Astronautics, China (Grant No. BCXJ14-08), the Funding of Innovation Program for Graduate Education of Jiangsu Province, China (Grant No. KYLX 0277), the Fundamental Research Funds for the Central Universities, China (Grant No. 3082015NP2015504), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PADA), China.
Bayesian Computation for Log-Gaussian Cox Processes: A Comparative Analysis of Methods
Teng, Ming; Nathoo, Farouk S.; Johnson, Timothy D.
2017-01-01
The Log-Gaussian Cox Process is a commonly used model for the analysis of spatial point pattern data. Fitting this model is difficult because of its doubly-stochastic property, i.e., it is an hierarchical combination of a Poisson process at the first level and a Gaussian Process at the second level. Various methods have been proposed to estimate such a process, including traditional likelihood-based approaches as well as Bayesian methods. We focus here on Bayesian methods and several approaches that have been considered for model fitting within this framework, including Hamiltonian Monte Carlo, the Integrated nested Laplace approximation, and Variational Bayes. We consider these approaches and make comparisons with respect to statistical and computational efficiency. These comparisons are made through several simulation studies as well as through two applications, the first examining ecological data and the second involving neuroimaging data. PMID:29200537
Bayesian median regression for temporal gene expression data
NASA Astrophysics Data System (ADS)
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
Cruz-Marcelo, Alejandro; Ensor, Katherine B; Rosner, Gary L
2011-06-01
The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material.
Cruz-Marcelo, Alejandro; Ensor, Katherine B.; Rosner, Gary L.
2011-01-01
The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material. PMID:21765566
Informative priors on fetal fraction increase power of the noninvasive prenatal screen.
Xu, Hanli; Wang, Shaowei; Ma, Lin-Lin; Huang, Shuai; Liang, Lin; Liu, Qian; Liu, Yang-Yang; Liu, Ke-Di; Tan, Ze-Min; Ban, Hao; Guan, Yongtao; Lu, Zuhong
2017-11-09
PurposeNoninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction.MethodOur Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values.ResultsOur Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives.ConclusionBayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values.Genetics in Medicine advance online publication, 9 November 2017; doi:10.1038/gim.2017.186.
Zhang, J L; Li, Y P; Huang, G H; Baetz, B W; Liu, J
2017-06-01
In this study, a Bayesian estimation-based simulation-optimization modeling approach (BESMA) is developed for identifying effluent trading strategies. BESMA incorporates nutrient fate modeling with soil and water assessment tool (SWAT), Bayesian estimation, and probabilistic-possibilistic interval programming with fuzzy random coefficients (PPI-FRC) within a general framework. Based on the water quality protocols provided by SWAT, posterior distributions of parameters can be analyzed through Bayesian estimation; stochastic characteristic of nutrient loading can be investigated which provides the inputs for the decision making. PPI-FRC can address multiple uncertainties in the form of intervals with fuzzy random boundaries and the associated system risk through incorporating the concept of possibility and necessity measures. The possibility and necessity measures are suitable for optimistic and pessimistic decision making, respectively. BESMA is applied to a real case of effluent trading planning in the Xiangxihe watershed, China. A number of decision alternatives can be obtained under different trading ratios and treatment rates. The results can not only facilitate identification of optimal effluent-trading schemes, but also gain insight into the effects of trading ratio and treatment rate on decision making. The results also reveal that decision maker's preference towards risk would affect decision alternatives on trading scheme as well as system benefit. Compared with the conventional optimization methods, it is proved that BESMA is advantageous in (i) dealing with multiple uncertainties associated with randomness and fuzziness in effluent-trading planning within a multi-source, multi-reach and multi-period context; (ii) reflecting uncertainties existing in nutrient transport behaviors to improve the accuracy in water quality prediction; and (iii) supporting pessimistic and optimistic decision making for effluent trading as well as promoting diversity of decision alternatives. Copyright © 2017 Elsevier Ltd. All rights reserved.
A systematic review of Bayesian articles in psychology: The last 25 years.
van de Schoot, Rens; Winter, Sonja D; Ryan, Oisín; Zondervan-Zwijnenburg, Mariëlle; Depaoli, Sarah
2017-06-01
Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Estimation of gross land-use change and its uncertainty using a Bayesian data assimilation approach
NASA Astrophysics Data System (ADS)
Levy, Peter; van Oijen, Marcel; Buys, Gwen; Tomlinson, Sam
2018-03-01
We present a method for estimating land-use change using a Bayesian data assimilation approach. The approach provides a general framework for combining multiple disparate data sources with a simple model. This allows us to constrain estimates of gross land-use change with reliable national-scale census data, whilst retaining the detailed information available from several other sources. Eight different data sources, with three different data structures, were combined in our posterior estimate of land use and land-use change, and other data sources could easily be added in future. The tendency for observations to underestimate gross land-use change is accounted for by allowing for a skewed distribution in the likelihood function. The data structure produced has high temporal and spatial resolution, and is appropriate for dynamic process-based modelling. Uncertainty is propagated appropriately into the output, so we have a full posterior distribution of output and parameters. The data are available in the widely used netCDF file format from http://eidc.ceh.ac.uk/.
Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named “posterior parameter ensemble”. We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor. PMID:25089832
Multiscale Bayesian neural networks for soil water content estimation
NASA Astrophysics Data System (ADS)
Jana, Raghavendra B.; Mohanty, Binayak P.; Springer, Everett P.
2008-08-01
Artificial neural networks (ANN) have been used for some time now to estimate soil hydraulic parameters from other available or more easily measurable soil properties. However, most such uses of ANNs as pedotransfer functions (PTFs) have been at matching spatial scales (1:1) of inputs and outputs. This approach assumes that the outputs are only required at the same scale as the input data. Unfortunately, this is rarely true. Different hydrologic, hydroclimatic, and contaminant transport models require soil hydraulic parameter data at different spatial scales, depending upon their grid sizes. While conventional (deterministic) ANNs have been traditionally used in these studies, the use of Bayesian training of ANNs is a more recent development. In this paper, we develop a Bayesian framework to derive soil water retention function including its uncertainty at the point or local scale using PTFs trained with coarser-scale Soil Survey Geographic (SSURGO)-based soil data. The approach includes an ANN trained with Bayesian techniques as a PTF tool with training and validation data collected across spatial extents (scales) in two different regions in the United States. The two study areas include the Las Cruces Trench site in the Rio Grande basin of New Mexico, and the Southern Great Plains 1997 (SGP97) hydrology experimental region in Oklahoma. Each region-specific Bayesian ANN is trained using soil texture and bulk density data from the SSURGO database (scale 1:24,000), and predictions of the soil water contents at different pressure heads with point scale data (1:1) inputs are made. The resulting outputs are corrected for bias using both linear and nonlinear correction techniques. The results show good agreement between the soil water content values measured at the point scale and those predicted by the Bayesian ANN-based PTFs for both the study sites. Overall, Bayesian ANNs coupled with nonlinear bias correction are found to be very suitable tools for deriving soil hydraulic parameters at the local/fine scale from soil physical properties at coarser-scale and across different spatial extents. This approach could potentially be used for soil hydraulic properties estimation and downscaling.
Phylogenetic Analyses: A Toolbox Expanding towards Bayesian Methods
Aris-Brosou, Stéphane; Xia, Xuhua
2008-01-01
The reconstruction of phylogenies is becoming an increasingly simple activity. This is mainly due to two reasons: the democratization of computing power and the increased availability of sophisticated yet user-friendly software. This review describes some of the latest additions to the phylogenetic toolbox, along with some of their theoretical and practical limitations. It is shown that Bayesian methods are under heavy development, as they offer the possibility to solve a number of long-standing issues and to integrate several steps of the phylogenetic analyses into a single framework. Specific topics include not only phylogenetic reconstruction, but also the comparison of phylogenies, the detection of adaptive evolution, and the estimation of divergence times between species. PMID:18483574
Bayesian Community Detection in the Space of Group-Level Functional Differences
Venkataraman, Archana; Yang, Daniel Y.-J.; Pelphrey, Kevin A.; Duncan, James S.
2017-01-01
We propose a unified Bayesian framework to detect both hyper- and hypo-active communities within whole-brain fMRI data. Specifically, our model identifies dense subgraphs that exhibit population-level differences in functional synchrony between a control and clinical group. We derive a variational EM algorithm to solve for the latent posterior distributions and parameter estimates, which subsequently inform us about the afflicted network topology. We demonstrate that our method provides valuable insights into the neural mechanisms underlying social dysfunction in autism, as verified by the Neurosynth meta-analytic database. In contrast, both univariate testing and community detection via recursive edge elimination fail to identify stable functional communities associated with the disorder. PMID:26955022
Bayesian Community Detection in the Space of Group-Level Functional Differences.
Venkataraman, Archana; Yang, Daniel Y-J; Pelphrey, Kevin A; Duncan, James S
2016-08-01
We propose a unified Bayesian framework to detect both hyper- and hypo-active communities within whole-brain fMRI data. Specifically, our model identifies dense subgraphs that exhibit population-level differences in functional synchrony between a control and clinical group. We derive a variational EM algorithm to solve for the latent posterior distributions and parameter estimates, which subsequently inform us about the afflicted network topology. We demonstrate that our method provides valuable insights into the neural mechanisms underlying social dysfunction in autism, as verified by the Neurosynth meta-analytic database. In contrast, both univariate testing and community detection via recursive edge elimination fail to identify stable functional communities associated with the disorder.
Wu, Hao; Noé, Frank
2011-03-01
Diffusion processes are relevant for a variety of phenomena in the natural sciences, including diffusion of cells or biomolecules within cells, diffusion of molecules on a membrane or surface, and diffusion of a molecular conformation within a complex energy landscape. Many experimental tools exist now to track such diffusive motions in single cells or molecules, including high-resolution light microscopy, optical tweezers, fluorescence quenching, and Förster resonance energy transfer (FRET). Experimental observations are most often indirect and incomplete: (1) They do not directly reveal the potential or diffusion constants that govern the diffusion process, (2) they have limited time and space resolution, and (3) the highest-resolution experiments do not track the motion directly but rather probe it stochastically by recording single events, such as photons, whose properties depend on the state of the system under investigation. Here, we propose a general Bayesian framework to model diffusion processes with nonlinear drift based on incomplete observations as generated by various types of experiments. A maximum penalized likelihood estimator is given as well as a Gibbs sampling method that allows to estimate the trajectories that have caused the measurement, the nonlinear drift or potential function and the noise or diffusion matrices, as well as uncertainty estimates of these properties. The approach is illustrated on numerical simulations of FRET experiments where it is shown that trajectories, potentials, and diffusion constants can be efficiently and reliably estimated even in cases with little statistics or nonequilibrium measurement conditions.
Estimating National-scale Emissions using Dense Monitoring Networks
NASA Astrophysics Data System (ADS)
Ganesan, A.; Manning, A.; Grant, A.; Young, D.; Oram, D.; Sturges, W. T.; Moncrieff, J. B.; O'Doherty, S.
2014-12-01
The UK's DECC (Deriving Emissions linked to Climate Change) network consists of four greenhouse gas measurement stations that are situated to constrain emissions from the UK and Northwest Europe. These four stations are located in Mace Head (West Coast of Ireland), and on telecommunication towers at Ridge Hill (Western England), Tacolneston (Eastern England) and Angus (Eastern Scotland). With the exception of Angus, which currently only measures carbon dioxide (CO2) and methane (CH4), the remaining sites are additionally equipped to monitor nitrous oxide (N2O). We present an analysis of the network's CH4 and N2O observations from 2011-2013 and compare derived top-down regional emissions with bottom-up inventories, including a recently produced high-resolution inventory (UK National Atmospheric Emissions Inventory). As countries are moving toward national-level emissions estimation, we also address some of the considerations that need to be made when designing these national networks. One of the novel aspects of this work is that we use a hierarchical Bayesian inversion framework. This methodology, which has newly been applied to greenhouse gas emissions estimation, is designed to estimate temporally and spatially varying model-measurement uncertainties and correlation scales, in addition to fluxes. Through this analysis, we demonstrate the importance of characterizing these covariance parameters in order to properly use data from high-density monitoring networks. This UK case study highlights the ways in which this new inverse framework can be used to address some of the limitations of traditional Bayesian inverse methods.
On the use of Bayesian Monte-Carlo in evaluation of nuclear data
NASA Astrophysics Data System (ADS)
De Saint Jean, Cyrille; Archier, Pascal; Privas, Edwin; Noguere, Gilles
2017-09-01
As model parameters, necessary ingredients of theoretical models, are not always predicted by theory, a formal mathematical framework associated to the evaluation work is needed to obtain the best set of parameters (resonance parameters, optical models, fission barrier, average width, multigroup cross sections) with Bayesian statistical inference by comparing theory to experiment. The formal rule related to this methodology is to estimate the posterior density probability function of a set of parameters by solving an equation of the following type: pdf(posterior) ˜ pdf(prior) × a likelihood function. A fitting procedure can be seen as an estimation of the posterior density probability of a set of parameters (referred as x→?) knowing a prior information on these parameters and a likelihood which gives the probability density function of observing a data set knowing x→?. To solve this problem, two major paths could be taken: add approximations and hypothesis and obtain an equation to be solved numerically (minimum of a cost function or Generalized least Square method, referred as GLS) or use Monte-Carlo sampling of all prior distributions and estimate the final posterior distribution. Monte Carlo methods are natural solution for Bayesian inference problems. They avoid approximations (existing in traditional adjustment procedure based on chi-square minimization) and propose alternative in the choice of probability density distribution for priors and likelihoods. This paper will propose the use of what we are calling Bayesian Monte Carlo (referred as BMC in the rest of the manuscript) in the whole energy range from thermal, resonance and continuum range for all nuclear reaction models at these energies. Algorithms will be presented based on Monte-Carlo sampling and Markov chain. The objectives of BMC are to propose a reference calculation for validating the GLS calculations and approximations, to test probability density distributions effects and to provide the framework of finding global minimum if several local minimums exist. Application to resolved resonance, unresolved resonance and continuum evaluation as well as multigroup cross section data assimilation will be presented.
Bayesian Inversion of 2D Models from Airborne Transient EM Data
NASA Astrophysics Data System (ADS)
Blatter, D. B.; Key, K.; Ray, A.
2016-12-01
The inherent non-uniqueness in most geophysical inverse problems leads to an infinite number of Earth models that fit observed data to within an adequate tolerance. To resolve this ambiguity, traditional inversion methods based on optimization techniques such as the Gauss-Newton and conjugate gradient methods rely on an additional regularization constraint on the properties that an acceptable model can possess, such as having minimal roughness. While allowing such an inversion scheme to converge on a solution, regularization makes it difficult to estimate the uncertainty associated with the model parameters. This is because regularization biases the inversion process toward certain models that satisfy the regularization constraint and away from others that don't, even when both may suitably fit the data. By contrast, a Bayesian inversion framework aims to produce not a single `most acceptable' model but an estimate of the posterior likelihood of the model parameters, given the observed data. In this work, we develop a 2D Bayesian framework for the inversion of transient electromagnetic (TEM) data. Our method relies on a reversible-jump Markov Chain Monte Carlo (RJ-MCMC) Bayesian inverse method with parallel tempering. Previous gradient-based inversion work in this area used a spatially constrained scheme wherein individual (1D) soundings were inverted together and non-uniqueness was tackled by using lateral and vertical smoothness constraints. By contrast, our work uses a 2D model space of Voronoi cells whose parameterization (including number of cells) is fully data-driven. To make the problem work practically, we approximate the forward solution for each TEM sounding using a local 1D approximation where the model is obtained from the 2D model by retrieving a vertical profile through the Voronoi cells. The implicit parsimony of the Bayesian inversion process leads to the simplest models that adequately explain the data, obviating the need for explicit smoothness constraints. In addition, credible intervals in model space are directly obtained, resolving some of the uncertainty introduced by regularization. An example application shows how the method can be used to quantify the uncertainty in airborne EM soundings for imaging subglacial brine channels and groundwater systems.
Ander, Bradley P.; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R.; Yang, Xiaowei
2013-01-01
The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses face difficulty to incorporate correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with ‘large p, small n’ problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point view of systems biology, iBVS enables user to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilitic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed. PMID:23844055
Peng, Bin; Zhu, Dianwen; Ander, Bradley P; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R; Yang, Xiaowei
2013-01-01
The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses face difficulty to incorporate correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with 'large p, small n' problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point view of systems biology, iBVS enables user to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilitic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed.
Smsynth: AN Imagery Synthesis System for Soil Moisture Retrieval
NASA Astrophysics Data System (ADS)
Cao, Y.; Xu, L.; Peng, J.
2018-04-01
Soil moisture (SM) is a important variable in various research areas, such as weather and climate forecasting, agriculture, drought and flood monitoring and prediction, and human health. An ongoing challenge in estimating SM via synthetic aperture radar (SAR) is the development of the retrieval SM methods, especially the empirical models needs as training samples a lot of measurements of SM and soil roughness parameters which are very difficult to acquire. As such, it is difficult to develop empirical models using realistic SAR imagery and it is necessary to develop methods to synthesis SAR imagery. To tackle this issue, a SAR imagery synthesis system based on the SM named SMSynth is presented, which can simulate radar signals that are realistic as far as possible to the real SAR imagery. In SMSynth, SAR backscatter coefficients for each soil type are simulated via the Oh model under the Bayesian framework, where the spatial correlation is modeled by the Markov random field (MRF) model. The backscattering coefficients simulated based on the designed soil parameters and sensor parameters are added into the Bayesian framework through the data likelihood where the soil parameters and sensor parameters are set as realistic as possible to the circumstances on the ground and in the validity range of the Oh model. In this way, a complete and coherent Bayesian probabilistic framework is established. Experimental results show that SMSynth is capable of generating realistic SAR images that suit the needs of a large amount of training samples of empirical models.
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
La Russa, D
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributionsmore » found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10{sup 7} cm−{sup 3}. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.« less
NASA Astrophysics Data System (ADS)
Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.
2017-12-01
In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.
Tipping point analysis of atmospheric oxygen concentration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Livina, V. N.; Forbes, A. B.; Vaz Martins, T. M.
2015-03-15
We apply tipping point analysis to nine observational oxygen concentration records around the globe, analyse their dynamics and perform projections under possible future scenarios, leading to oxygen deficiency in the atmosphere. The analysis is based on statistical physics framework with stochastic modelling, where we represent the observed data as a composition of deterministic and stochastic components estimated from the observed data using Bayesian and wavelet techniques.
Depaoli, Sarah; van de Schoot, Rens; van Loey, Nancy; Sijbrandij, Marit
2015-01-01
After traumatic events, such as disaster, war trauma, and injuries including burns (which is the focus here), the risk to develop posttraumatic stress disorder (PTSD) is approximately 10% (Breslau & Davis, 1992). Latent Growth Mixture Modeling can be used to classify individuals into distinct groups exhibiting different patterns of PTSD (Galatzer-Levy, 2015). Currently, empirical evidence points to four distinct trajectories of PTSD patterns in those who have experienced burn trauma. These trajectories are labeled as: resilient, recovery, chronic, and delayed onset trajectories (e.g., Bonanno, 2004; Bonanno, Brewin, Kaniasty, & Greca, 2010; Maercker, Gäbler, O'Neil, Schützwohl, & Müller, 2013; Pietrzak et al., 2013). The delayed onset trajectory affects only a small group of individuals, that is, about 4-5% (O'Donnell, Elliott, Lau, & Creamer, 2007). In addition to its low frequency, the later onset of this trajectory may contribute to the fact that these individuals can be easily overlooked by professionals. In this special symposium on Estimating PTSD trajectories (Van de Schoot, 2015a), we illustrate how to properly identify this small group of individuals through the Bayesian estimation framework using previous knowledge through priors (see, e.g., Depaoli & Boyajian, 2014; Van de Schoot, Broere, Perryck, Zondervan-Zwijnenburg, & Van Loey, 2015). We used latent growth mixture modeling (LGMM) (Van de Schoot, 2015b) to estimate PTSD trajectories across 4 years that followed a traumatic burn. We demonstrate and compare results from traditional (maximum likelihood) and Bayesian estimation using priors (see, Depaoli, 2012, 2013). Further, we discuss where priors come from and how to define them in the estimation process. We demonstrate that only the Bayesian approach results in the desired theory-driven solution of PTSD trajectories. Since the priors are chosen subjectively, we also present a sensitivity analysis of the Bayesian results to illustrate how to check the impact of the prior knowledge integrated into the model. We conclude with recommendations and guidelines for researchers looking to implement theory-driven LGMM, and we tailor this discussion to the context of PTSD research.
Comparing interval estimates for small sample ordinal CFA models
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Comparing interval estimates for small sample ordinal CFA models.
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positive biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research.
Incorporating Resilience into Dynamic Social Models
2016-07-20
solved by simply using the information provided by the scenario. Instead, additional knowledge is required from relevant fields that study these...resilience function by leveraging Bayesian Knowledge Bases (BKBs), a probabilistic reasoning network framework[5],[6]. BKBs allow for inferencing...reasoning network framework based on Bayesian Knowledge Bases (BKBs). BKBs are central to our social resilience framework as they are used to
A Bayesian Multilevel Model for Microcystin Prediction in ...
The frequency of cyanobacteria blooms in North American lakes is increasing. A major concern with rising cyanobacteria blooms is microcystin, a common cyanobacterial hepatotoxin. To explore the conditions that promote high microcystin concentrations, we analyzed the US EPA National Lake Assessment (NLA) dataset collected in the summer of 2007. The NLA dataset is reported for nine eco-regions. We used the results of random forest modeling as a means ofvariable selection from which we developed a Bayesian multilevel model of microcystin concentrations. Model parameters under a multilevel modeling framework are eco-region specific, butthey are also assumed to be exchangeable across eco-regions for broad continental scaling. The exchangeability assumption ensures that both the common patterns and eco-region specific features will be reflected in the model. Furthermore, the method incorporates appropriate estimates of uncertainty. Our preliminary results show associations between microcystin and turbidity, total nutrients, and N:P ratios. Upon release of a comparable 2012 NLA dataset, we will apply Bayesian updating. The results will help develop management strategies to alleviate microcystin impacts and improve lake quality. This work provides a probabilistic framework for predicting microcystin presences in lakes. It would allow for insights to be made about how changes in nutrient concentrations could potentially change toxin levels.
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism As a Process of Bayesian Inference
Campbell, John O.
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an “experiment” in the external world environment, and the results of that “experiment” or the “surprise” entailed by predicted and actual outcomes of the “experiment.” Minimization of free energy implies that the implicit measure of “surprise” experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
A data-drive analysis for heavy quark diffusion coefficient
NASA Astrophysics Data System (ADS)
Xu, Yingru; Nahrgang, Marlene; Cao, Shanshan; Bernhard, Jonah E.; Bass, Steffen A.
2018-02-01
We apply a Bayesian model-to-data analysis on an improved Langevin framework to estimate the temperature and momentum dependence of the heavy quark diffusion coefficient in the quark-gluon plasma (QGP). The spatial diffusion coefficient is found to have a minimum around 1-3 near Tc in the zero momentum limit, and has a non-trivial momentum dependence. With the estimated diffusion coefficient, our improved Langevin model is able to simultaneously describe the D-meson RAA and v2 in three different systems at RHIC and the LHC.
Bayesian methods for estimating GEBVs of threshold traits
Wang, C-L; Ding, X-D; Wang, J-Y; Liu, J-F; Fu, W-X; Zhang, Z; Yin, Z-J; Zhang, Q
2013-01-01
Estimation of genomic breeding values is the key step in genomic selection (GS). Many methods have been proposed for continuous traits, but methods for threshold traits are still scarce. Here we introduced threshold model to the framework of GS, and specifically, we extended the three Bayesian methods BayesA, BayesB and BayesCπ on the basis of threshold model for estimating genomic breeding values of threshold traits, and the extended methods are correspondingly termed BayesTA, BayesTB and BayesTCπ. Computing procedures of the three BayesT methods using Markov Chain Monte Carlo algorithm were derived. A simulation study was performed to investigate the benefit of the presented methods in accuracy with the genomic estimated breeding values (GEBVs) for threshold traits. Factors affecting the performance of the three BayesT methods were addressed. As expected, the three BayesT methods generally performed better than the corresponding normal Bayesian methods, in particular when the number of phenotypic categories was small. In the standard scenario (number of categories=2, incidence=30%, number of quantitative trait loci=50, h2=0.3), the accuracies were improved by 30.4%, 2.4%, and 5.7% points, respectively. In most scenarios, BayesTB and BayesTCπ generated similar accuracies and both performed better than BayesTA. In conclusion, our work proved that threshold model fits well for predicting GEBVs of threshold traits, and BayesTCπ is supposed to be the method of choice for GS of threshold traits. PMID:23149458
Zhu, Xiang; Stephens, Matthew
2017-01-01
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which also can be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise, than previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss. PMID:29399241
Nonlinear and non-Gaussian Bayesian based handwriting beautification
NASA Astrophysics Data System (ADS)
Shi, Cao; Xiao, Jianguo; Xu, Canhui; Jia, Wenhua
2013-03-01
A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, format and size of handwriting image are firstly normalized, and then typeface in computer system is applied to optimize vision effect of handwriting. The Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters to translate, rotate and scale typeface in computer system are controlled by state equation, and the matching optimization between handwriting and transformed typeface is employed by measurement equation. Finally, the new typeface, which is transformed from the original one and gains the best nonlinear and non-Gaussian optimization, is the beautification result of handwriting. Experimental results demonstrate the proposed framework provides a creative handwriting beautification methodology to improve visual acceptance.
Self-evaluation of decision-making: A general Bayesian framework for metacognitive computation.
Fleming, Stephen M; Daw, Nathaniel D
2017-01-01
People are often aware of their mistakes, and report levels of confidence in their choices that correlate with objective performance. These metacognitive assessments of decision quality are important for the guidance of behavior, particularly when external feedback is absent or sporadic. However, a computational framework that accounts for both confidence and error detection is lacking. In addition, accounts of dissociations between performance and metacognition have often relied on ad hoc assumptions, precluding a unified account of intact and impaired self-evaluation. Here we present a general Bayesian framework in which self-evaluation is cast as a "second-order" inference on a coupled but distinct decision system, computationally equivalent to inferring the performance of another actor. Second-order computation may ensue whenever there is a separation between internal states supporting decisions and confidence estimates over space and/or time. We contrast second-order computation against simpler first-order models in which the same internal state supports both decisions and confidence estimates. Through simulations we show that second-order computation provides a unified account of different types of self-evaluation often considered in separate literatures, such as confidence and error detection, and generates novel predictions about the contribution of one's own actions to metacognitive judgments. In addition, the model provides insight into why subjects' metacognition may sometimes be better or worse than task performance. We suggest that second-order computation may underpin self-evaluative judgments across a range of domains. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Self-Evaluation of Decision-Making: A General Bayesian Framework for Metacognitive Computation
2017-01-01
People are often aware of their mistakes, and report levels of confidence in their choices that correlate with objective performance. These metacognitive assessments of decision quality are important for the guidance of behavior, particularly when external feedback is absent or sporadic. However, a computational framework that accounts for both confidence and error detection is lacking. In addition, accounts of dissociations between performance and metacognition have often relied on ad hoc assumptions, precluding a unified account of intact and impaired self-evaluation. Here we present a general Bayesian framework in which self-evaluation is cast as a “second-order” inference on a coupled but distinct decision system, computationally equivalent to inferring the performance of another actor. Second-order computation may ensue whenever there is a separation between internal states supporting decisions and confidence estimates over space and/or time. We contrast second-order computation against simpler first-order models in which the same internal state supports both decisions and confidence estimates. Through simulations we show that second-order computation provides a unified account of different types of self-evaluation often considered in separate literatures, such as confidence and error detection, and generates novel predictions about the contribution of one’s own actions to metacognitive judgments. In addition, the model provides insight into why subjects’ metacognition may sometimes be better or worse than task performance. We suggest that second-order computation may underpin self-evaluative judgments across a range of domains. PMID:28004960
Using Bayesian Population Viability Analysis to Define Relevant Conservation Objectives.
Green, Adam W; Bailey, Larissa L
2015-01-01
Adaptive management provides a useful framework for managing natural resources in the face of uncertainty. An important component of adaptive management is identifying clear, measurable conservation objectives that reflect the desired outcomes of stakeholders. A common objective is to have a sustainable population, or metapopulation, but it can be difficult to quantify a threshold above which such a population is likely to persist. We performed a Bayesian metapopulation viability analysis (BMPVA) using a dynamic occupancy model to quantify the characteristics of two wood frog (Lithobates sylvatica) metapopulations resulting in sustainable populations, and we demonstrate how the results could be used to define meaningful objectives that serve as the basis of adaptive management. We explored scenarios involving metapopulations with different numbers of patches (pools) using estimates of breeding occurrence and successful metamorphosis from two study areas to estimate the probability of quasi-extinction and calculate the proportion of vernal pools producing metamorphs. Our results suggest that ≥50 pools are required to ensure long-term persistence with approximately 16% of pools producing metamorphs in stable metapopulations. We demonstrate one way to incorporate the BMPVA results into a utility function that balances the trade-offs between ecological and financial objectives, which can be used in an adaptive management framework to make optimal, transparent decisions. Our approach provides a framework for using a standard method (i.e., PVA) and available information to inform a formal decision process to determine optimal and timely management policies.
A comment on priors for Bayesian occupancy models
Gerber, Brian D.
2018-01-01
Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are “uninformative” or “vague”, such priors can easily be unintentionally highly informative. Here we report on how the specification of a “vague” normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts. PMID:29481554
NASA Astrophysics Data System (ADS)
Dittes, Beatrice; Špačková, Olga; Ebrahimian, Negin; Kaiser, Maria; Rieger, Wolfgang; Disse, Markus; Straub, Daniel
2017-04-01
Flood risk estimates are subject to significant uncertainties, e.g. due to limited records of historic flood events, uncertainty in flood modeling, uncertain impact of climate change or uncertainty in the exposure and loss estimates. In traditional design of flood protection systems, these uncertainties are typically just accounted for implicitly, based on engineering judgment. In the AdaptRisk project, we develop a fully quantitative framework for planning of flood protection systems under current and future uncertainties using quantitative pre-posterior Bayesian decision analysis. In this contribution, we focus on the quantification of the uncertainties and study their relative influence on the flood risk estimate and on the planning of flood protection systems. The following uncertainty components are included using a Bayesian approach: 1) inherent and statistical (i.e. limited record length) uncertainty; 2) climate uncertainty that can be learned from an ensemble of GCM-RCM models; 3) estimates of climate uncertainty components not covered in 2), such as bias correction, incomplete ensemble, local specifics not captured by the GCM-RCM models; 4) uncertainty in the inundation modelling; 5) uncertainty in damage estimation. We also investigate how these uncertainties are possibly reduced in the future when new evidence - such as new climate models, observed extreme events, and socio-economic data - becomes available. Finally, we look into how this new evidence influences the risk assessment and effectivity of flood protection systems. We demonstrate our methodology for a pre-alpine catchment in southern Germany: the Mangfall catchment in Bavaria that includes the city of Rosenheim, which suffered significant losses during the 2013 flood event.
NASA Astrophysics Data System (ADS)
Wang, L.; Davis, J. L.; Tamisiea, M. E.
2017-12-01
The Antarctic ice sheet (AIS) holds about 60% of all fresh water on the Earth, an amount equivalent to about 58 m of sea-level rise. Observation of AIS mass change is thus essential in determining and predicting its contribution to sea level. While the ice mass loss estimates for West Antarctica (WA) and the Antarctic Peninsula (AP) are in good agreement, what the mass balance over East Antarctica (EA) is, and whether or not it compensates for the mass loss is under debate. Besides the different error sources and sensitivities of different measurement types, complex spatial and temporal variabilities would be another factor complicating the accurate estimation of the AIS mass balance. Therefore, a model that allows for variabilities in both melting rate and seasonal signals would seem appropriate in the estimation of present-day AIS melting. We present a stochastic filter technique, which enables the Bayesian separation of the systematic stripe noise and mass signal in decade-length GRACE monthly gravity series, and allows the estimation of time-variable seasonal and inter-annual components in the signals. One of the primary advantages of this Bayesian method is that it yields statistically rigorous uncertainty estimates reflecting the inherent spatial resolution of the data. By applying the stochastic filter to the decade-long GRACE observations, we present the temporal variabilities of the AIS mass balance at basin scale, particularly over East Antarctica, and decipher the EA mass variations in the past decade, and their role in affecting overall AIS mass balance and sea level.
Brase, Gary L; Hill, W Trey
2015-01-01
Bayesian reasoning, defined here as the updating of a posterior probability following new information, has historically been problematic for humans. Classic psychology experiments have tested human Bayesian reasoning through the use of word problems and have evaluated each participant's performance against the normatively correct answer provided by Bayes' theorem. The standard finding is of generally poor performance. Over the past two decades, though, progress has been made on how to improve Bayesian reasoning. Most notably, research has demonstrated that the use of frequencies in a natural sampling framework-as opposed to single-event probabilities-can improve participants' Bayesian estimates. Furthermore, pictorial aids and certain individual difference factors also can play significant roles in Bayesian reasoning success. The mechanics of how to build tasks which show these improvements is not under much debate. The explanations for why naturally sampled frequencies and pictures help Bayesian reasoning remain hotly contested, however, with many researchers falling into ingrained "camps" organized around two dominant theoretical perspectives. The present paper evaluates the merits of these theoretical perspectives, including the weight of empirical evidence, theoretical coherence, and predictive power. By these criteria, the ecological rationality approach is clearly better than the heuristics and biases view. Progress in the study of Bayesian reasoning will depend on continued research that honestly, vigorously, and consistently engages across these different theoretical accounts rather than staying "siloed" within one particular perspective. The process of science requires an understanding of competing points of view, with the ultimate goal being integration.
Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun
2017-08-01
Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM 2.5 is a promising way to fill the areas that are not covered by ground PM 2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM 2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function is specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM 2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R 2 = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM 2.5 estimates.
Akita, Yasuyuki; Baldasano, Jose M; Beelen, Rob; Cirach, Marta; de Hoogh, Kees; Hoek, Gerard; Nieuwenhuijsen, Mark; Serre, Marc L; de Nazelle, Audrey
2014-04-15
In recognition that intraurban exposure gradients may be as large as between-city variations, recent air pollution epidemiologic studies have become increasingly interested in capturing within-city exposure gradients. In addition, because of the rapidly accumulating health data, recent studies also need to handle large study populations distributed over large geographic domains. Even though several modeling approaches have been introduced, a consistent modeling framework capturing within-city exposure variability and applicable to large geographic domains is still missing. To address these needs, we proposed a modeling framework based on the Bayesian Maximum Entropy method that integrates monitoring data and outputs from existing air quality models based on Land Use Regression (LUR) and Chemical Transport Models (CTM). The framework was applied to estimate the yearly average NO2 concentrations over the region of Catalunya in Spain. By jointly accounting for the global scale variability in the concentration from the output of CTM and the intraurban scale variability through LUR model output, the proposed framework outperformed more conventional approaches.
Xu, Yadong; Serre, Marc L; Reyes, Jeanette; Vizuete, William
2016-04-19
To improve ozone exposure estimates for ambient concentrations at a national scale, we introduce our novel Regionalized Air Quality Model Performance (RAMP) approach to integrate chemical transport model (CTM) predictions with the available ozone observations using the Bayesian Maximum Entropy (BME) framework. The framework models the nonlinear and nonhomoscedastic relation between air pollution observations and CTM predictions and for the first time accounts for variability in CTM model performance. A validation analysis using only noncollocated data outside of a validation radius rv was performed and the R(2) between observations and re-estimated values for two daily metrics, the daily maximum 8-h average (DM8A) and the daily 24-h average (D24A) ozone concentrations, were obtained with the OBS scenario using ozone observations only in contrast with the RAMP and a Constant Air Quality Model Performance (CAMP) scenarios. We show that, by accounting for the spatial and temporal variability in model performance, our novel RAMP approach is able to extract more information in terms of R(2) increase percentage, with over 12 times for the DM8A and over 3.5 times for the D24A ozone concentrations, from CTM predictions than the CAMP approach assuming that model performance does not change across space and time.
Esfahani, Mohammad Shahrokh; Dougherty, Edward R
2015-01-01
Phenotype classification via genomic data is hampered by small sample sizes that negatively impact classifier design. Utilization of prior biological knowledge in conjunction with training data can improve both classifier design and error estimation via the construction of the optimal Bayesian classifier. In the genomic setting, gene/protein signaling pathways provide a key source of biological knowledge. Although these pathways are neither complete, nor regulatory, with no timing associated with them, they are capable of constraining the set of possible models representing the underlying interaction between molecules. The aim of this paper is to provide a framework and the mathematical tools to transform signaling pathways to prior probabilities governing uncertainty classes of feature-label distributions used in classifier design. Structural motifs extracted from the signaling pathways are mapped to a set of constraints on a prior probability on a Multinomial distribution. Being the conjugate prior for the Multinomial distribution, we propose optimization paradigms to estimate the parameters of a Dirichlet distribution in the Bayesian setting. The performance of the proposed methods is tested on two widely studied pathways: mammalian cell cycle and a p53 pathway model.
Estimates of CO2 fluxes over the city of Cape Town, South Africa, through Bayesian inverse modelling
NASA Astrophysics Data System (ADS)
Nickless, Alecia; Rayner, Peter J.; Engelbrecht, Francois; Brunke, Ernst-Günther; Erni, Birgit; Scholes, Robert J.
2018-04-01
We present a city-scale inversion over Cape Town, South Africa. Measurement sites for atmospheric CO2 concentrations were installed at Robben Island and Hangklip lighthouses, located downwind and upwind of the metropolis. Prior estimates of the fossil fuel fluxes were obtained from a bespoke inventory analysis where emissions were spatially and temporally disaggregated and uncertainty estimates determined by means of error propagation techniques. Net ecosystem exchange (NEE) fluxes from biogenic processes were obtained from the land atmosphere exchange model CABLE (Community Atmosphere Biosphere Land Exchange). Uncertainty estimates were based on the estimates of net primary productivity. CABLE was dynamically coupled to the regional climate model CCAM (Conformal Cubic Atmospheric Model), which provided the climate inputs required to drive the Lagrangian particle dispersion model. The Bayesian inversion framework included a control vector where fossil fuel and NEE fluxes were solved for separately.Due to the large prior uncertainty prescribed to the NEE fluxes, the current inversion framework was unable to adequately distinguish between the fossil fuel and NEE fluxes, but the inversion was able to obtain improved estimates of the total fluxes within pixels and across the domain. The median of the uncertainty reductions of the total weekly flux estimates for the inversion domain of Cape Town was 28 %, but reach as high as 50 %. At the pixel level, uncertainty reductions of the total weekly flux reached up to 98 %, but these large uncertainty reductions were for NEE-dominated pixels. Improved corrections to the fossil fuel fluxes would be possible if the uncertainty around the prior NEE fluxes could be reduced. In order for this inversion framework to be operationalised for monitoring, reporting, and verification (MRV) of emissions from Cape Town, the NEE component of the CO2 budget needs to be better understood. Additional measurements of Δ14C and δ13C isotope measurements would be a beneficial component of an atmospheric monitoring programme aimed at MRV of CO2 for any city which has significant biogenic influence, allowing improved separation of contributions from NEE and fossil fuel fluxes to the observed CO2 concentration.
Radiation dose reduction in computed tomography perfusion using spatial-temporal Bayesian methods
NASA Astrophysics Data System (ADS)
Fang, Ruogu; Raj, Ashish; Chen, Tsuhan; Sanelli, Pina C.
2012-03-01
In current computed tomography (CT) examinations, the associated X-ray radiation dose is of significant concern to patients and operators, especially CT perfusion (CTP) imaging that has higher radiation dose due to its cine scanning technique. A simple and cost-effective means to perform the examinations is to lower the milliampere-seconds (mAs) parameter as low as reasonably achievable in data acquisition. However, lowering the mAs parameter will unavoidably increase data noise and degrade CT perfusion maps greatly if no adequate noise control is applied during image reconstruction. To capture the essential dynamics of CT perfusion, a simple spatial-temporal Bayesian method that uses a piecewise parametric model of the residual function is used, and then the model parameters are estimated from a Bayesian formulation of prior smoothness constraints on perfusion parameters. From the fitted residual function, reliable CTP parameter maps are obtained from low dose CT data. The merit of this scheme exists in the combination of analytical piecewise residual function with Bayesian framework using a simpler prior spatial constrain for CT perfusion application. On a dataset of 22 patients, this dynamic spatial-temporal Bayesian model yielded an increase in signal-tonoise-ratio (SNR) of 78% and a decrease in mean-square-error (MSE) of 40% at low dose radiation of 43mA.
Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A.; Valdés-Hernández, Pedro A.; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A.
2017-01-01
The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of (L1 norm-based) and (L2 norm-based) constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, leading usually to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but using the computationally intensive Monte Carlo/Expectation Maximization methods, which makes impractical its application to the EEG IP. While the ELASSO have not been considered before into the Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime we illustrate that our methods are able to recover complicated source setups more accurately and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios than classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods. The Matlab codes used in this work, including Simulations, Methods, Quality Measures and Visualization Routines are freely available in a public website. PMID:29200994
Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A; Valdés-Hernández, Pedro A; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A
2017-01-01
The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of (L1 norm-based) and (L2 norm-based) constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, leading usually to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but using the computationally intensive Monte Carlo/Expectation Maximization methods, which makes impractical its application to the EEG IP. While the ELASSO have not been considered before into the Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime we illustrate that our methods are able to recover complicated source setups more accurately and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios than classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods. The Matlab codes used in this work, including Simulations, Methods, Quality Measures and Visualization Routines are freely available in a public website.
A Bayesian approach to multi-messenger astronomy: identification of gravitational-wave host galaxies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fan, XiLong; Messenger, Christopher; Heng, Ik Siong
We present a general framework for incorporating astrophysical information into Bayesian parameter estimation techniques used by gravitational wave data analysis to facilitate multi-messenger astronomy. Since the progenitors of transient gravitational wave events, such as compact binary coalescences, are likely to be associated with a host galaxy, improvements to the source sky location estimates through the use of host galaxy information are explored. To demonstrate how host galaxy properties can be included, we simulate a population of compact binary coalescences and show that for ∼8.5% of simulations within 200 Mpc, the top 10 most likely galaxies account for a ∼50% ofmore » the total probability of hosting a gravitational wave source. The true gravitational wave source host galaxy is in the top 10 galaxy candidates ∼10% of the time. Furthermore, we show that by including host galaxy information, a better estimate of the inclination angle of a compact binary gravitational wave source can be obtained. We also demonstrate the flexibility of our method by incorporating the use of either the B or K band into our analysis.« less
Sidibé, Cheick Abou Kounta; Grosbois, Vladimir; Thiaucourt, François; Niang, Mamadou; Lesnoff, Matthieu; Roger, François
2012-08-01
A Bayesian approach, allowing for conditional dependence between two tests was used to estimate without gold standard the sensitivities of complement fixation test (CFT) and competitive enzyme-linked immunosorbent assay test (cELISA) and the serological prevalence of CBPP in a cattle population of the Central Delta of the Niger River in Mali, where CBPP is enzootic and the true prevalence and animals serological state were unknown. A significant difference (P = 0.99) was observed between the sensitivities of the two tests, estimated at 73.7% (95% probability interval [PI], 63.4-82.7) for cELISA and 42.3% (95% PI, 33.3-53.7) for CFT. Individual-level serological prevalence in the study population was estimated at 14.1% (95% PI, 10.8-16.9). Our results indicate that in enzootic areas, cELISA performs better in terms of sensitivity than CFT. However, negative conditional sensitivity dependence between the two tests was detected, implying that to achieve maximum sensitivity, the two tests should be applied in parallel.
NASA Astrophysics Data System (ADS)
Liu, Yaoze; Engel, Bernard A.; Flanagan, Dennis C.; Gitau, Margaret W.; McMillan, Sara K.; Chaubey, Indrajeet; Singh, Shweta
2018-05-01
Best management practices (BMPs) are popular approaches used to improve hydrology and water quality. Uncertainties in BMP effectiveness over time may result in overestimating long-term efficiency in watershed planning strategies. To represent varying long-term BMP effectiveness in hydrologic/water quality models, a high level and forward-looking modeling framework was developed. The components in the framework consist of establishment period efficiency, starting efficiency, efficiency for each storm event, efficiency between maintenance, and efficiency over the life cycle. Combined, they represent long-term efficiency for a specific type of practice and specific environmental concern (runoff/pollutant). An approach for possible implementation of the framework was discussed. The long-term impacts of grass buffer strips (agricultural BMP) and bioretention systems (urban BMP) in reducing total phosphorus were simulated to demonstrate the framework. Data gaps were captured in estimating the long-term performance of the BMPs. A Bayesian method was used to match the simulated distribution of long-term BMP efficiencies with the observed distribution with the assumption that the observed data represented long-term BMP efficiencies. The simulated distribution matched the observed distribution well with only small total predictive uncertainties. With additional data, the same method can be used to further improve the simulation results. The modeling framework and results of this study, which can be adopted in hydrologic/water quality models to better represent long-term BMP effectiveness, can help improve decision support systems for creating long-term stormwater management strategies for watershed management projects.
NASA Astrophysics Data System (ADS)
Farhadi, L.; Abdolghafoorian, A.
2015-12-01
The land surface is a key component of climate system. It controls the partitioning of available energy at the surface between sensible and latent heat, and partitioning of available water between evaporation and runoff. Water and energy cycle are intrinsically coupled through evaporation, which represents a heat exchange as latent heat flux. Accurate estimation of fluxes of heat and moisture are of significant importance in many fields such as hydrology, climatology and meteorology. In this study we develop and apply a Bayesian framework for estimating the key unknown parameters of terrestrial water and energy balance equations (i.e. moisture and heat diffusion) and their uncertainty in land surface models. These equations are coupled through flux of evaporation. The estimation system is based on the adjoint method for solving a least-squares optimization problem. The cost function consists of aggregated errors on state (i.e. moisture and temperature) with respect to observation and parameters estimation with respect to prior values over the entire assimilation period. This cost function is minimized with respect to parameters to identify models of sensible heat, latent heat/evaporation and drainage and runoff. Inverse of Hessian of the cost function is an approximation of the posterior uncertainty of parameter estimates. Uncertainty of estimated fluxes is estimated by propagating the uncertainty for linear and nonlinear function of key parameters through the method of First Order Second Moment (FOSM). Uncertainty analysis is used in this method to guide the formulation of a well-posed estimation problem. Accuracy of the method is assessed at point scale using surface energy and water fluxes generated by the Simultaneous Heat and Water (SHAW) model at the selected AmeriFlux stations. This method can be applied to diverse climates and land surface conditions with different spatial scales, using remotely sensed measurements of surface moisture and temperature states
Caudek, Corrado; Fantoni, Carlo; Domini, Fulvio
2011-01-01
We measured perceived depth from the optic flow (a) when showing a stationary physical or virtual object to observers who moved their head at a normal or slower speed, and (b) when simulating the same optic flow on a computer and presenting it to stationary observers. Our results show that perceived surface slant is systematically distorted, for both the active and the passive viewing of physical or virtual surfaces. These distortions are modulated by head translation speed, with perceived slant increasing directly with the local velocity gradient of the optic flow. This empirical result allows us to determine the relative merits of two alternative approaches aimed at explaining perceived surface slant in active vision: an “inverse optics” model that takes head motion information into account, and a probabilistic model that ignores extra-retinal signals. We compare these two approaches within the framework of the Bayesian theory. The “inverse optics” Bayesian model produces veridical slant estimates if the optic flow and the head translation velocity are measured with no error; because of the influence of a “prior” for flatness, the slant estimates become systematically biased as the measurement errors increase. The Bayesian model, which ignores the observer's motion, always produces distorted estimates of surface slant. Interestingly, the predictions of this second model, not those of the first one, are consistent with our empirical findings. The present results suggest that (a) in active vision perceived surface slant may be the product of probabilistic processes which do not guarantee the correct solution, and (b) extra-retinal signals may be mainly used for a better measurement of retinal information. PMID:21533197
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Perreault Levasseur, Laurence; Hezaveh, Yashar D.; Wechsler, Risa H.
2017-11-01
In Hezaveh et al. we showed that deep learning can be used for model parameter estimation and trained convolutional neural networks to determine the parameters of strong gravitational-lensing systems. Here we demonstrate a method for obtaining the uncertainties of these parameters. We review the framework of variational inference to obtain approximate posteriors of Bayesian neural networks and apply it to a network trained to estimate the parameters of the Singular Isothermal Ellipsoid plus external shear and total flux magnification. We show that the method can capture the uncertainties due to different levels of noise in the input data, as well as training and architecture-related errors made by the network. To evaluate the accuracy of the resulting uncertainties, we calculate the coverage probabilities of marginalized distributions for each lensing parameter. By tuning a single variational parameter, the dropout rate, we obtain coverage probabilities approximately equal to the confidence levels for which they were calculated, resulting in accurate and precise uncertainty estimates. Our results suggest that the application of approximate Bayesian neural networks to astrophysical modeling problems can be a fast alternative to Monte Carlo Markov Chains, allowing orders of magnitude improvement in speed.
Time-varying Concurrent Risk of Extreme Droughts and Heatwaves in California
NASA Astrophysics Data System (ADS)
Sarhadi, A.; Diffenbaugh, N. S.; Ausin, M. C.
2016-12-01
Anthropogenic global warming has changed the nature and the risk of extreme climate phenomena such as droughts and heatwaves. The concurrent of these nature-changing climatic extremes may result in intensifying undesirable consequences in terms of human health and destructive effects in water resources. The present study assesses the risk of concurrent extreme droughts and heatwaves under dynamic nonstationary conditions arising from climate change in California. For doing so, a generalized fully Bayesian time-varying multivariate risk framework is proposed evolving through time under dynamic human-induced environment. In this methodology, an extreme, Bayesian, dynamic copula (Gumbel) is developed to model the time-varying dependence structure between the two different climate extremes. The time-varying extreme marginals are previously modeled using a Generalized Extreme Value (GEV) distribution. Bayesian Markov Chain Monte Carlo (MCMC) inference is integrated to estimate parameters of the nonstationary marginals and copula using a Gibbs sampling method. Modelled marginals and copula are then used to develop a fully Bayesian, time-varying joint return period concept for the estimation of concurrent risk. Here we argue that climate change has increased the chance of concurrent droughts and heatwaves over decades in California. It is also demonstrated that a time-varying multivariate perspective should be incorporated to assess realistic concurrent risk of the extremes for water resources planning and management in a changing climate in this area. The proposed generalized methodology can be applied for other stochastic nature-changing compound climate extremes that are under the influence of climate change.
Liang, Li-Jung; Huang, David; Brecht, Mary-Lynn; Hser, Yih-ing
2010-01-01
Studies examining differences in mortality among long-term drug users have been limited. In this paper, we introduce a Bayesian framework that jointly models survival data using a Weibull proportional hazard model with frailty, and substance and alcohol data using mixed-effects models, to examine differences in mortality among heroin, cocaine, and methamphetamine users from five long-term follow-up studies. The traditional approach to analyzing combined survival data from numerous studies assumes that the studies are homogeneous, thus the estimates may be biased due to unobserved heterogeneity among studies. Our approach allows us to structurally combine the data from different studies while accounting for correlation among subjects within each study. Markov chain Monte Carlo facilitates the implementation of Bayesian analyses. Despite the complexity of the model, our approach is relatively straightforward to implement using WinBUGS. We demonstrate our joint modeling approach to the combined data and discuss the results from both approaches. PMID:21052518
A Monte Carlo–Based Bayesian Approach for Measuring Agreement in a Qualitative Scale
Pérez Sánchez, Carlos Javier
2014-01-01
Agreement analysis has been an active research area whose techniques have been widely applied in psychology and other fields. However, statistical agreement among raters has been mainly considered from a classical statistics point of view. Bayesian methodology is a viable alternative that allows the inclusion of subjective initial information coming from expert opinions, personal judgments, or historical data. A Bayesian approach is proposed by providing a unified Monte Carlo–based framework to estimate all types of measures of agreement in a qualitative scale of response. The approach is conceptually simple and it has a low computational cost. Both informative and non-informative scenarios are considered. In case no initial information is available, the results are in line with the classical methodology, but providing more information on the measures of agreement. For the informative case, some guidelines are presented to elicitate the prior distribution. The approach has been applied to two applications related to schizophrenia diagnosis and sensory analysis. PMID:29881002
Bayesian image reconstruction for improving detection performance of muon tomography.
Wang, Guobao; Schultz, Larry J; Qi, Jinyi
2009-05-01
Muon tomography is a novel technology that is being developed for detecting high-Z materials in vehicles or cargo containers. Maximum likelihood methods have been developed for reconstructing the scattering density image from muon measurements. However, the instability of maximum likelihood estimation often results in noisy images and low detectability of high-Z targets. In this paper, we propose using regularization to improve the image quality of muon tomography. We formulate the muon reconstruction problem in a Bayesian framework by introducing a prior distribution on scattering density images. An iterative shrinkage algorithm is derived to maximize the log posterior distribution. At each iteration, the algorithm obtains the maximum a posteriori update by shrinking an unregularized maximum likelihood update. Inverse quadratic shrinkage functions are derived for generalized Laplacian priors and inverse cubic shrinkage functions are derived for generalized Gaussian priors. Receiver operating characteristic studies using simulated data demonstrate that the Bayesian reconstruction can greatly improve the detection performance of muon tomography.
Is Bayesian Estimation Proper for Estimating the Individual's Ability? Research Report 80-3.
ERIC Educational Resources Information Center
Samejima, Fumiko
The effect of prior information in Bayesian estimation is considered, mainly from the standpoint of objective testing. In the estimation of a parameter belonging to an individual, the prior information is, in most cases, the density function of the population to which the individual belongs. Bayesian estimation was compared with maximum likelihood…
Tamura, Koichiro; Tao, Qiqing; Kumar, Sudhir
2018-01-01
Abstract RelTime estimates divergence times by relaxing the assumption of a strict molecular clock in a phylogeny. It shows excellent performance in estimating divergence times for both simulated and empirical molecular sequence data sets in which evolutionary rates varied extensively throughout the tree. RelTime is computationally efficient and scales well with increasing size of data sets. Until now, however, RelTime has not had a formal mathematical foundation. Here, we show that the basis of the RelTime approach is a relative rate framework (RRF) that combines comparisons of evolutionary rates in sister lineages with the principle of minimum rate change between evolutionary lineages and their respective descendants. We present analytical solutions for estimating relative lineage rates and divergence times under RRF. We also discuss the relationship of RRF with other approaches, including the Bayesian framework. We conclude that RelTime will be useful for phylogenies with branch lengths derived not only from molecular data, but also morphological and biochemical traits. PMID:29893954
Estimating the rate of biological introductions: Lessepsian fishes in the Mediterranean.
Belmaker, Jonathan; Brokovich, Eran; China, Victor; Golani, Daniel; Kiflawi, Moshe
2009-04-01
Sampling issues preclude the direct use of the discovery rate of exotic species as a robust estimate of their rate of introduction. Recently, a method was advanced that allows maximum-likelihood estimation of both the observational probability and the introduction rate from the discovery record. Here, we propose an alternative approach that utilizes the discovery record of native species to control for sampling effort. Implemented in a Bayesian framework using Markov chain Monte Carlo simulations, the approach provides estimates of the rate of introduction of the exotic species, and of additional parameters such as the size of the species pool from which they are drawn. We illustrate the approach using Red Sea fishes recorded in the eastern Mediterranean, after crossing the Suez Canal, and show that the two approaches may lead to different conclusions. The analytical framework is highly flexible and could provide a basis for easy modification to other systems for which first-sighting data on native and introduced species are available.
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
The fossilized birth–death process for coherent calibration of divergence-time estimates
Heath, Tracy A.; Huelsenbeck, John P.; Stadler, Tanja
2014-01-01
Time-calibrated species phylogenies are critical for addressing a wide range of questions in evolutionary biology, such as those that elucidate historical biogeography or uncover patterns of coevolution and diversification. Because molecular sequence data are not informative on absolute time, external data—most commonly, fossil age estimates—are required to calibrate estimates of species divergence dates. For Bayesian divergence time methods, the common practice for calibration using fossil information involves placing arbitrarily chosen parametric distributions on internal nodes, often disregarding most of the information in the fossil record. We introduce the “fossilized birth–death” (FBD) process—a model for calibrating divergence time estimates in a Bayesian framework, explicitly acknowledging that extant species and fossils are part of the same macroevolutionary process. Under this model, absolute node age estimates are calibrated by a single diversification model and arbitrary calibration densities are not necessary. Moreover, the FBD model allows for inclusion of all available fossils. We performed analyses of simulated data and show that node age estimation under the FBD model results in robust and accurate estimates of species divergence times with realistic measures of statistical uncertainty, overcoming major limitations of standard divergence time estimation methods. We used this model to estimate the speciation times for a dataset composed of all living bears, indicating that the genus Ursus diversified in the Late Miocene to Middle Pliocene. PMID:25009181
Liu, Kai; Cui, Meng-Ying; Cao, Peng; Wang, Jiang-Bo
2016-01-01
On urban arterials, travel time estimation is challenging especially from various data sources. Typically, fusing loop detector data and probe vehicle data to estimate travel time is a troublesome issue while considering the data issue of uncertain, imprecise and even conflicting. In this paper, we propose an improved data fusing methodology for link travel time estimation. Link travel times are simultaneously pre-estimated using loop detector data and probe vehicle data, based on which Bayesian fusion is then applied to fuse the estimated travel times. Next, Iterative Bayesian estimation is proposed to improve Bayesian fusion by incorporating two strategies: 1) substitution strategy which replaces the lower accurate travel time estimation from one sensor with the current fused travel time; and 2) specially-designed conditions for convergence which restrict the estimated travel time in a reasonable range. The estimation results show that, the proposed method outperforms probe vehicle data based method, loop detector based method and single Bayesian fusion, and the mean absolute percentage error is reduced to 4.8%. Additionally, iterative Bayesian estimation performs better for lighter traffic flows when the variability of travel time is practically higher than other periods.
Cui, Meng-Ying; Cao, Peng; Wang, Jiang-Bo
2016-01-01
On urban arterials, travel time estimation is challenging especially from various data sources. Typically, fusing loop detector data and probe vehicle data to estimate travel time is a troublesome issue while considering the data issue of uncertain, imprecise and even conflicting. In this paper, we propose an improved data fusing methodology for link travel time estimation. Link travel times are simultaneously pre-estimated using loop detector data and probe vehicle data, based on which Bayesian fusion is then applied to fuse the estimated travel times. Next, Iterative Bayesian estimation is proposed to improve Bayesian fusion by incorporating two strategies: 1) substitution strategy which replaces the lower accurate travel time estimation from one sensor with the current fused travel time; and 2) specially-designed conditions for convergence which restrict the estimated travel time in a reasonable range. The estimation results show that, the proposed method outperforms probe vehicle data based method, loop detector based method and single Bayesian fusion, and the mean absolute percentage error is reduced to 4.8%. Additionally, iterative Bayesian estimation performs better for lighter traffic flows when the variability of travel time is practically higher than other periods. PMID:27362654
The number of seizures needed in the EMU.
Struck, Aaron F; Cole, Andrew J; Cash, Sydney S; Westover, M Brandon
2015-11-01
The purpose of this study was to develop a quantitative framework to estimate the likelihood of multifocal epilepsy based on the number of unifocal seizures observed in the epilepsy monitoring unit (EMU). Patient records from the EMU at Massachusetts General Hospital (MGH) from 2012 to 2014 were assessed for the presence of multifocal seizures as well the presence of multifocal interictal discharges and multifocal structural imaging abnormalities during the course of the EMU admission. Risk factors for multifocal seizures were assessed using sensitivity and specificity analysis. A Kaplan-Meier survival analysis was used to estimate the risk of multifocal epilepsy for a given number of consecutive seizures. To overcome the limits of the Kaplan-Meier analysis, a parametric survival function was fit to the EMU subjects with multifocal seizures and this was used to develop a Bayesian model to estimate the risk of multifocal seizures during an EMU admission. Multifocal interictal discharges were a significant predictor of multifocal seizures within an EMU admission with a p < 0.01, albeit with only modest sensitivity 0.74 and specificity 0.69. Multifocal potentially epileptogenic lesions on MRI were not a significant predictor p = 0.44. Kaplan-Meier analysis was limited by wide confidence intervals secondary to significant patient dropout and concern for informative censoring. The Bayesian framework provided estimates for the number of unifocal seizures needed to predict absence of multifocal seizures. To achieve 90% confidence for the absence of multifocal seizure, three seizures are needed when the pretest probability for multifocal epilepsy is 20%, seven seizures for a pretest probability of 50%, and nine seizures for a pretest probability of 80%. These results provide a framework to assist clinicians in determining the utility of trying to capture a specific number of seizures in EMU evaluations of candidates for epilepsy surgery. Wiley Periodicals, Inc. © 2015 International League Against Epilepsy.
The number of seizures needed in the EMU
Struck, Aaron F.; Cole, Andrew J.; Cash, Sydney S.; Westover, M. Brandon
2016-01-01
Summary Objective The purpose of this study was to develop a quantitative framework to estimate the likelihood of multifocal epilepsy based on the number of unifocal seizures observed in the epilepsy monitoring unit (EMU). Methods Patient records from the EMU at Massachusetts General Hospital (MGH) from 2012 to 2014 were assessed for the presence of multifocal seizures as well the presence of multifocal interictal discharges and multifocal structural imaging abnormalities during the course of the EMU admission. Risk factors for multifocal seizures were assessed using sensitivity and specificity analysis. A Kaplan-Meier survival analysis was used to estimate the risk of multifocal epilepsy for a given number of consecutive seizures. To overcome the limits of the Kaplan-Meier analysis, a parametric survival function was fit to the EMU subjects with multifocal seizures and this was used to develop a Bayesian model to estimate the risk of multifocal seizures during an EMU admission. Results Multifocal interictal discharges were a significant predictor of multifocal seizures within an EMU admission with a p < 0.01, albeit with only modest sensitivity 0.74 and specificity 0.69. Multifocal potentially epileptogenic lesions on MRI were not a significant predictor p = 0.44. Kaplan-Meier analysis was limited by wide confidence intervals secondary to significant patient dropout and concern for informative censoring. The Bayesian framework provided estimates for the number of unifocal seizures needed to predict absence of multifocal seizures. To achieve 90% confidence for the absence of multifocal seizure, three seizures are needed when the pretest probability for multifocal epilepsy is 20%, seven seizures for a pretest probability of 50%, and nine seizures for a pretest probability of 80%. Significance These results provide a framework to assist clinicians in determining the utility of trying to capture a specific number of seizures in EMU evaluations of candidates for epilepsy surgery. PMID:26222350
Nested Sampling for Bayesian Model Comparison in the Context of Salmonella Disease Dynamics
Dybowski, Richard; McKinley, Trevelyan J.; Mastroeni, Pietro; Restif, Olivier
2013-01-01
Understanding the mechanisms underlying the observed dynamics of complex biological systems requires the statistical assessment and comparison of multiple alternative models. Although this has traditionally been done using maximum likelihood-based methods such as Akaike's Information Criterion (AIC), Bayesian methods have gained in popularity because they provide more informative output in the form of posterior probability distributions. However, comparison between multiple models in a Bayesian framework is made difficult by the computational cost of numerical integration over large parameter spaces. A new, efficient method for the computation of posterior probabilities has recently been proposed and applied to complex problems from the physical sciences. Here we demonstrate how nested sampling can be used for inference and model comparison in biological sciences. We present a reanalysis of data from experimental infection of mice with Salmonella enterica showing the distribution of bacteria in liver cells. In addition to confirming the main finding of the original analysis, which relied on AIC, our approach provides: (a) integration across the parameter space, (b) estimation of the posterior parameter distributions (with visualisations of parameter correlations), and (c) estimation of the posterior predictive distributions for goodness-of-fit assessments of the models. The goodness-of-fit results suggest that alternative mechanistic models and a relaxation of the quasi-stationary assumption should be considered. PMID:24376528
Conditional adaptive Bayesian spectral analysis of nonstationary biomedical time series.
Bruce, Scott A; Hall, Martica H; Buysse, Daniel J; Krafty, Robert T
2018-03-01
Many studies of biomedical time series signals aim to measure the association between frequency-domain properties of time series and clinical and behavioral covariates. However, the time-varying dynamics of these associations are largely ignored due to a lack of methods that can assess the changing nature of the relationship through time. This article introduces a method for the simultaneous and automatic analysis of the association between the time-varying power spectrum and covariates, which we refer to as conditional adaptive Bayesian spectrum analysis (CABS). The procedure adaptively partitions the grid of time and covariate values into an unknown number of approximately stationary blocks and nonparametrically estimates local spectra within blocks through penalized splines. CABS is formulated in a fully Bayesian framework, in which the number and locations of partition points are random, and fit using reversible jump Markov chain Monte Carlo techniques. Estimation and inference averaged over the distribution of partitions allows for the accurate analysis of spectra with both smooth and abrupt changes. The proposed methodology is used to analyze the association between the time-varying spectrum of heart rate variability and self-reported sleep quality in a study of older adults serving as the primary caregiver for their ill spouse. © 2017, The International Biometric Society.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Justin; Hund, Lauren
2017-02-01
Dynamic compression experiments are being performed on complicated materials using increasingly complex drivers. The data produced in these experiments are beginning to reach a regime where traditional analysis techniques break down; requiring the solution of an inverse problem. A common measurement in dynamic experiments is an interface velocity as a function of time, and often this functional output can be simulated using a hydrodynamics code. Bayesian model calibration is a statistical framework to estimate inputs into a computational model in the presence of multiple uncertainties, making it well suited to measurements of this type. In this article, we apply Bayesianmore » model calibration to high pressure (250 GPa) ramp compression measurements in tantalum. We address several issues speci c to this calibration including the functional nature of the output as well as parameter and model discrepancy identi ability. Speci cally, we propose scaling the likelihood function by an e ective sample size rather than modeling the autocorrelation function to accommodate the functional output and propose sensitivity analyses using the notion of `modularization' to assess the impact of experiment-speci c nuisance input parameters on estimates of material properties. We conclude that the proposed Bayesian model calibration procedure results in simple, fast, and valid inferences on the equation of state parameters for tantalum.« less
A Bayesian approach to estimate the biomass of anchovies off the coast of Perú.
Quiroz, Zaida C; Prates, Marcos O; Rue, Håvard
2015-03-01
The Northern Humboldt Current System (NHCS) is the world's most productive ecosystem in terms of fish. In particular, the Peruvian anchovy (Engraulis ringens) is the major prey of the main top predators, like seabirds, fish, humans, and other mammals. In this context, it is important to understand the dynamics of the anchovy distribution to preserve it as well as to exploit its economic capacities. Using the data collected by the "Instituto del Mar del Perú" (IMARPE) during a scientific survey in 2005, we present a statistical analysis that has as main goals: (i) to adapt to the characteristics of the sampled data, such as spatial dependence, high proportions of zeros and big size of samples; (ii) to provide important insights on the dynamics of the anchovy population; and (iii) to propose a model for estimation and prediction of anchovy biomass in the NHCS offshore from Perú. These data were analyzed in a Bayesian framework using the integrated nested Laplace approximation (INLA) method. Further, to select the best model and to study the predictive power of each model, we performed model comparisons and predictive checks, respectively. Finally, we carried out a Bayesian spatial influence diagnostic for the preferred model. © 2014, The International Biometric Society.
Bayesian Total-Evidence Dating Reveals the Recent Crown Radiation of Penguins
Heath, Tracy A.; Ksepka, Daniel T.; Stadler, Tanja; Welch, David; Drummond, Alexei J.
2017-01-01
The total-evidence approach to divergence time dating uses molecular and morphological data from extant and fossil species to infer phylogenetic relationships, species divergence times, and macroevolutionary parameters in a single coherent framework. Current model-based implementations of this approach lack an appropriate model for the tree describing the diversification and fossilization process and can produce estimates that lead to erroneous conclusions. We address this shortcoming by providing a total-evidence method implemented in a Bayesian framework. This approach uses a mechanistic tree prior to describe the underlying diversification process that generated the tree of extant and fossil taxa. Previous attempts to apply the total-evidence approach have used tree priors that do not account for the possibility that fossil samples may be direct ancestors of other samples, that is, ancestors of fossil or extant species or of clades. The fossilized birth–death (FBD) process explicitly models the diversification, fossilization, and sampling processes and naturally allows for sampled ancestors. This model was recently applied to estimate divergence times based on molecular data and fossil occurrence dates. We incorporate the FBD model and a model of morphological trait evolution into a Bayesian total-evidence approach to dating species phylogenies. We apply this method to extant and fossil penguins and show that the modern penguins radiated much more recently than has been previously estimated, with the basal divergence in the crown clade occurring at \\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{upgreek} \\usepackage{mathrsfs} \\setlength{\\oddsidemargin}{-69pt} \\begin{document} }{}${\\sim}12.7$\\end{document} Ma and most splits leading to extant species occurring in the last 2 myr. Our results demonstrate that including stem-fossil diversity can greatly improve the estimates of the divergence times of crown taxa. The method is available in BEAST2 (version 2.4) software www.beast2.org with packages SA (version at least 1.1.4) and morph-models (version at least 1.0.4) installed. [Birth–death process; calibration; divergence times; MCMC; phylogenetics.] PMID:28173531
Improving default risk prediction using Bayesian model uncertainty techniques.
Kazemi, Reza; Mosleh, Ali
2012-11-01
Credit risk is the potential exposure of a creditor to an obligor's failure or refusal to repay the debt in principal or interest. The potential of exposure is measured in terms of probability of default. Many models have been developed to estimate credit risk, with rating agencies dating back to the 19th century. They provide their assessment of probability of default and transition probabilities of various firms in their annual reports. Regulatory capital requirements for credit risk outlined by the Basel Committee on Banking Supervision have made it essential for banks and financial institutions to develop sophisticated models in an attempt to measure credit risk with higher accuracy. The Bayesian framework proposed in this article uses the techniques developed in physical sciences and engineering for dealing with model uncertainty and expert accuracy to obtain improved estimates of credit risk and associated uncertainties. The approach uses estimates from one or more rating agencies and incorporates their historical accuracy (past performance data) in estimating future default risk and transition probabilities. Several examples demonstrate that the proposed methodology can assess default probability with accuracy exceeding the estimations of all the individual models. Moreover, the methodology accounts for potentially significant departures from "nominal predictions" due to "upsetting events" such as the 2008 global banking crisis. © 2012 Society for Risk Analysis.
Zehender, Gianguglielmo; Lai, Alessia; Veo, Carla; Bergna, Annalisa; Ciccozzi, Massimo; Galli, Massimo
2018-06-01
Variola virus (VARV), the causative agent of smallpox, is an exclusively human virus belonging to the genus Orthopoxvirus, which includes many other viral species covering a wide range of mammal hosts, such as vaccinia, cowpox, camelpox, taterapox, ectromelia, and monkeypox virus. The tempo and mode of evolution of Orthopoxviruses were reconstructed using a Bayesian phylodynamic framework by analysing 80 hemagglutinin sequences retrieved from public databases. Bayesian phylogeography was used to estimate their putative ancestral hosts. In order to estimate the substitution rate, the tree including all of the available Orthopoxviruses was calibrated using historical references dating the South American variola minor clade (alastrim) to between the XVI and XIX century. The mean substitution rate determined by the analysis was 6.5 × 10 -6 substitutions/site/year. Based on this evolutionary estimate, the time of the most recent common ancestor of the genus Orthopoxvirus was placed at about 10 000 years before the present. Cowpox virus was the species closest to the root of the phylogenetic tree. The root of VARV circulating in the XX century was estimated to be about 700 years ago, corresponding to about 1300 AD. The divergence between West African and South American VARV went back about 500 years ago (falling approximately in the XVI century). A rodent species is the most probable ancestral host from which the ancestors of all the known Orthopoxviruses were transmitted to the other mammal host species, and each of these species represented a dead-end for each new poxvirus species, without any further inter-specific spread. © 2018 Wiley Periodicals, Inc.
Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba
2012-01-01
Genetic studies indicate high number of potential factors related to asthma. Based on earlier linkage analyses we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and we applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses Bayesian network representation to provide detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to estimate the posteriors whether a variable is directly relevant or its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43(1.2–1.8); p = 3×10−4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong relevance based methods to include partial relevance, global characterization of relevance and multi-target relevance. PMID:22432035
Bayesian inference based on dual generalized order statistics from the exponentiated Weibull model
NASA Astrophysics Data System (ADS)
Al Sobhi, Mashail M.
2015-02-01
Bayesian estimation for the two parameters and the reliability function of the exponentiated Weibull model are obtained based on dual generalized order statistics (DGOS). Also, Bayesian prediction bounds for future DGOS from exponentiated Weibull model are obtained. The symmetric and asymmetric loss functions are considered for Bayesian computations. The Markov chain Monte Carlo (MCMC) methods are used for computing the Bayes estimates and prediction bounds. The results have been specialized to the lower record values. Comparisons are made between Bayesian and maximum likelihood estimators via Monte Carlo simulation.
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Ma, Jun; Yang, Gang; Du, Bo; Zhang, Liangpei
2017-06-01
A new Bayesian method named Poisson Nonnegative Matrix Factorization with Parameter Subspace Clustering Constraint (PNMF-PSCC) has been presented to extract endmembers from Hyperspectral Imagery (HSI). First, the method integrates the liner spectral mixture model with the Bayesian framework and it formulates endmember extraction into a Bayesian inference problem. Second, the Parameter Subspace Clustering Constraint (PSCC) is incorporated into the statistical program to consider the clustering of all pixels in the parameter subspace. The PSCC could enlarge differences among ground objects and helps finding endmembers with smaller spectrum divergences. Meanwhile, the PNMF-PSCC method utilizes the Poisson distribution as the prior knowledge of spectral signals to better explain the quantum nature of light in imaging spectrometer. Third, the optimization problem of PNMF-PSCC is formulated into maximizing the joint density via the Maximum A Posterior (MAP) estimator. The program is finally solved by iteratively optimizing two sub-problems via the Alternating Direction Method of Multipliers (ADMM) framework and the FURTHESTSUM initialization scheme. Five state-of-the art methods are implemented to make comparisons with the performance of PNMF-PSCC on both the synthetic and real HSI datasets. Experimental results show that the PNMF-PSCC outperforms all the five methods in Spectral Angle Distance (SAD) and Root-Mean-Square-Error (RMSE), and especially it could identify good endmembers for ground objects with smaller spectrum divergences.
NASA Astrophysics Data System (ADS)
Fukuda, Jun'ichi; Johnson, Kaj M.
2010-06-01
We present a unified theoretical framework and solution method for probabilistic, Bayesian inversions of crustal deformation data. The inversions involve multiple data sets with unknown relative weights, model parameters that are related linearly or non-linearly through theoretic models to observations, prior information on model parameters and regularization priors to stabilize underdetermined problems. To efficiently handle non-linear inversions in which some of the model parameters are linearly related to the observations, this method combines both analytical least-squares solutions and a Monte Carlo sampling technique. In this method, model parameters that are linearly and non-linearly related to observations, relative weights of multiple data sets and relative weights of prior information and regularization priors are determined in a unified Bayesian framework. In this paper, we define the mixed linear-non-linear inverse problem, outline the theoretical basis for the method, provide a step-by-step algorithm for the inversion, validate the inversion method using synthetic data and apply the method to two real data sets. We apply the method to inversions of multiple geodetic data sets with unknown relative data weights for interseismic fault slip and locking depth. We also apply the method to the problem of estimating the spatial distribution of coseismic slip on faults with unknown fault geometry, relative data weights and smoothing regularization weight.
Hydraulic Conductivity Estimation using Bayesian Model Averaging and Generalized Parameterization
NASA Astrophysics Data System (ADS)
Tsai, F. T.; Li, X.
2006-12-01
Non-uniqueness in parameterization scheme is an inherent problem in groundwater inverse modeling due to limited data. To cope with the non-uniqueness problem of parameterization, we introduce a Bayesian Model Averaging (BMA) method to integrate a set of selected parameterization methods. The estimation uncertainty in BMA includes the uncertainty in individual parameterization methods as the within-parameterization variance and the uncertainty from using different parameterization methods as the between-parameterization variance. Moreover, the generalized parameterization (GP) method is considered in the geostatistical framework in this study. The GP method aims at increasing the flexibility of parameterization through the combination of a zonation structure and an interpolation method. The use of BMP with GP avoids over-confidence in a single parameterization method. A normalized least-squares estimation (NLSE) is adopted to calculate the posterior probability for each GP. We employee the adjoint state method for the sensitivity analysis on the weighting coefficients in the GP method. The adjoint state method is also applied to the NLSE problem. The proposed methodology is implemented to the Alamitos Barrier Project (ABP) in California, where the spatially distributed hydraulic conductivity is estimated. The optimal weighting coefficients embedded in GP are identified through the maximum likelihood estimation (MLE) where the misfits between the observed and calculated groundwater heads are minimized. The conditional mean and conditional variance of the estimated hydraulic conductivity distribution using BMA are obtained to assess the estimation uncertainty.
Iterative updating of model error for Bayesian inversion
NASA Astrophysics Data System (ADS)
Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew
2018-02-01
In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.
Computational statistics using the Bayesian Inference Engine
NASA Astrophysics Data System (ADS)
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
ERIC Educational Resources Information Center
Griffiths, Thomas L.; Chater, Nick; Norris, Dennis; Pouget, Alexandre
2012-01-01
Bowers and Davis (2012) criticize Bayesian modelers for telling "just so" stories about cognition and neuroscience. Their criticisms are weakened by not giving an accurate characterization of the motivation behind Bayesian modeling or the ways in which Bayesian models are used and by not evaluating this theoretical framework against specific…
Bayesian estimation of seasonal course of canopy leaf area index from hyperspectral satellite data
NASA Astrophysics Data System (ADS)
Varvia, Petri; Rautiainen, Miina; Seppänen, Aku
2018-03-01
In this paper, Bayesian inversion of a physically-based forest reflectance model is investigated to estimate of boreal forest canopy leaf area index (LAI) from EO-1 Hyperion hyperspectral data. The data consist of multiple forest stands with different species compositions and structures, imaged in three phases of the growing season. The Bayesian estimates of canopy LAI are compared to reference estimates based on a spectral vegetation index. The forest reflectance model contains also other unknown variables in addition to LAI, for example leaf single scattering albedo and understory reflectance. In the Bayesian approach, these variables are estimated simultaneously with LAI. The feasibility and seasonal variation of these estimates is also examined. Credible intervals for the estimates are also calculated and evaluated. The results show that the Bayesian inversion approach is significantly better than using a comparable spectral vegetation index regression.
Characterizing reliability in a product/process design-assurance program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerscher, W.J. III; Booker, J.M.; Bement, T.R.
1997-10-01
Over the years many advancing techniques in the area of reliability engineering have surfaced in the military sphere of influence, and one of these techniques is Reliability Growth Testing (RGT). Private industry has reviewed RGT as part of the solution to their reliability concerns, but many practical considerations have slowed its implementation. It`s objective is to demonstrate the reliability requirement of a new product with a specified confidence. This paper speaks directly to that objective but discusses a somewhat different approach to achieving it. Rather than conducting testing as a continuum and developing statistical confidence bands around the results, thismore » Bayesian updating approach starts with a reliability estimate characterized by large uncertainty and then proceeds to reduce the uncertainty by folding in fresh information in a Bayesian framework.« less
A Unified Analysis of Structured Sonar-terrain Data using Bayesian Functional Mixed Models.
Zhu, Hongxiao; Caspers, Philip; Morris, Jeffrey S; Wu, Xiaowei; Müller, Rolf
2018-01-01
Sonar emits pulses of sound and uses the reflected echoes to gain information about target objects. It offers a low cost, complementary sensing modality for small robotic platforms. While existing analytical approaches often assume independence across echoes, real sonar data can have more complicated structures due to device setup or experimental design. In this paper, we consider sonar echo data collected from multiple terrain substrates with a dual-channel sonar head. Our goals are to identify the differential sonar responses to terrains and study the effectiveness of this dual-channel design in discriminating targets. We describe a unified analytical framework that achieves these goals rigorously, simultaneously, and automatically. The analysis was done by treating the echo envelope signals as functional responses and the terrain/channel information as covariates in a functional regression setting. We adopt functional mixed models that facilitate the estimation of terrain and channel effects while capturing the complex hierarchical structure in data. This unified analytical framework incorporates both Gaussian models and robust models. We fit the models using a full Bayesian approach, which enables us to perform multiple inferential tasks under the same modeling framework, including selecting models, estimating the effects of interest, identifying significant local regions, discriminating terrain types, and describing the discriminatory power of local regions. Our analysis of the sonar-terrain data identifies time regions that reflect differential sonar responses to terrains. The discriminant analysis suggests that a multi- or dual-channel design achieves target identification performance comparable with or better than a single-channel design.
A Unified Analysis of Structured Sonar-terrain Data using Bayesian Functional Mixed Models
Zhu, Hongxiao; Caspers, Philip; Morris, Jeffrey S.; Wu, Xiaowei; Müller, Rolf
2017-01-01
Sonar emits pulses of sound and uses the reflected echoes to gain information about target objects. It offers a low cost, complementary sensing modality for small robotic platforms. While existing analytical approaches often assume independence across echoes, real sonar data can have more complicated structures due to device setup or experimental design. In this paper, we consider sonar echo data collected from multiple terrain substrates with a dual-channel sonar head. Our goals are to identify the differential sonar responses to terrains and study the effectiveness of this dual-channel design in discriminating targets. We describe a unified analytical framework that achieves these goals rigorously, simultaneously, and automatically. The analysis was done by treating the echo envelope signals as functional responses and the terrain/channel information as covariates in a functional regression setting. We adopt functional mixed models that facilitate the estimation of terrain and channel effects while capturing the complex hierarchical structure in data. This unified analytical framework incorporates both Gaussian models and robust models. We fit the models using a full Bayesian approach, which enables us to perform multiple inferential tasks under the same modeling framework, including selecting models, estimating the effects of interest, identifying significant local regions, discriminating terrain types, and describing the discriminatory power of local regions. Our analysis of the sonar-terrain data identifies time regions that reflect differential sonar responses to terrains. The discriminant analysis suggests that a multi- or dual-channel design achieves target identification performance comparable with or better than a single-channel design. PMID:29749977
Tom, Jennifer A; Sinsheimer, Janet S; Suchard, Marc A
Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic iterative reweighting MCMC algorithm. In doing so, we reuse the available realizations by reweighting them with importance weights, recycling them into a now tractable joint hierarchical model. We apply this technique to intermediate realizations generated from stratified analyses of 687 influenza A genomes spanning 13 years allowing us to revisit hypotheses regarding the evolutionary history of influenza within a hierarchical statistical framework.
Tom, Jennifer A.; Sinsheimer, Janet S.; Suchard, Marc A.
2015-01-01
Massive datasets in the gigabyte and terabyte range combined with the availability of increasingly sophisticated statistical tools yield analyses at the boundary of what is computationally feasible. Compromising in the face of this computational burden by partitioning the dataset into more tractable sizes results in stratified analyses, removed from the context that justified the initial data collection. In a Bayesian framework, these stratified analyses generate intermediate realizations, often compared using point estimates that fail to account for the variability within and correlation between the distributions these realizations approximate. However, although the initial concession to stratify generally precludes the more sensible analysis using a single joint hierarchical model, we can circumvent this outcome and capitalize on the intermediate realizations by extending the dynamic iterative reweighting MCMC algorithm. In doing so, we reuse the available realizations by reweighting them with importance weights, recycling them into a now tractable joint hierarchical model. We apply this technique to intermediate realizations generated from stratified analyses of 687 influenza A genomes spanning 13 years allowing us to revisit hypotheses regarding the evolutionary history of influenza within a hierarchical statistical framework. PMID:26681992
Bayesian model calibration of ramp compression experiments on Z
NASA Astrophysics Data System (ADS)
Brown, Justin; Hund, Lauren
2017-06-01
Bayesian model calibration (BMC) is a statistical framework to estimate inputs for a computational model in the presence of multiple uncertainties, making it well suited to dynamic experiments which must be coupled with numerical simulations to interpret the results. Often, dynamic experiments are diagnosed using velocimetry and this output can be modeled using a hydrocode. Several calibration issues unique to this type of scenario including the functional nature of the output, uncertainty of nuisance parameters within the simulation, and model discrepancy identifiability are addressed, and a novel BMC process is proposed. As a proof of concept, we examine experiments conducted on Sandia National Laboratories' Z-machine which ramp compressed tantalum to peak stresses of 250 GPa. The proposed BMC framework is used to calibrate the cold curve of Ta (with uncertainty), and we conclude that the procedure results in simple, fast, and valid inferences. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
NASA Astrophysics Data System (ADS)
Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.
2015-03-01
We present Π4U, an extensible framework, for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models, that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.
Bayesian methods for characterizing unknown parameters of material models
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
2016-02-04
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
Bayesian methods for characterizing unknown parameters of material models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Emery, J. M.; Grigoriu, M. D.; Field Jr., R. V.
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). As a result, the Bayesian method is also employed tomore » characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.« less
NASA Astrophysics Data System (ADS)
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
The Bayesian boom: good thing or bad?
Hahn, Ulrike
2014-01-01
A series of high-profile critiques of Bayesian models of cognition have recently sparked controversy. These critiques question the contribution of rational, normative considerations in the study of cognition. The present article takes central claims from these critiques and evaluates them in light of specific models. Closer consideration of actual examples of Bayesian treatments of different cognitive phenomena allows one to defuse these critiques showing that they cannot be sustained across the diversity of applications of the Bayesian framework for cognitive modeling. More generally, there is nothing in the Bayesian framework that would inherently give rise to the deficits that these critiques perceive, suggesting they have been framed at the wrong level of generality. At the same time, the examples are used to demonstrate the different ways in which consideration of rationality uniquely benefits both theory and practice in the study of cognition. PMID:25152738
A Tutorial in Bayesian Potential Outcomes Mediation Analysis.
Miočević, Milica; Gonzalez, Oscar; Valente, Matthew J; MacKinnon, David P
2018-01-01
Statistical mediation analysis is used to investigate intermediate variables in the relation between independent and dependent variables. Causal interpretation of mediation analyses is challenging because randomization of subjects to levels of the independent variable does not rule out the possibility of unmeasured confounders of the mediator to outcome relation. Furthermore, commonly used frequentist methods for mediation analysis compute the probability of the data given the null hypothesis, which is not the probability of a hypothesis given the data as in Bayesian analysis. Under certain assumptions, applying the potential outcomes framework to mediation analysis allows for the computation of causal effects, and statistical mediation in the Bayesian framework gives indirect effects probabilistic interpretations. This tutorial combines causal inference and Bayesian methods for mediation analysis so the indirect and direct effects have both causal and probabilistic interpretations. Steps in Bayesian causal mediation analysis are shown in the application to an empirical example.
Decentralized cooperative TOA/AOA target tracking for hierarchical wireless sensor networks.
Chen, Ying-Chih; Wen, Chih-Yu
2012-11-08
This paper proposes a distributed method for cooperative target tracking in hierarchical wireless sensor networks. The concept of leader-based information processing is conducted to achieve object positioning, considering a cluster-based network topology. Random timers and local information are applied to adaptively select a sub-cluster for the localization task. The proposed energy-efficient tracking algorithm allows each sub-cluster member to locally estimate the target position with a Bayesian filtering framework and a neural networking model, and further performs estimation fusion in the leader node with the covariance intersection algorithm. This paper evaluates the merits and trade-offs of the protocol design towards developing more efficient and practical algorithms for object position estimation.
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
ERIC Educational Resources Information Center
Wang, Lijuan; McArdle, John J.
2008-01-01
The main purpose of this research is to evaluate the performance of a Bayesian approach for estimating unknown change points using Monte Carlo simulations. The univariate and bivariate unknown change point mixed models were presented and the basic idea of the Bayesian approach for estimating the models was discussed. The performance of Bayesian…
A crash course on data analysis in asteroseismology
NASA Astrophysics Data System (ADS)
Appourchaux, Thierry
2014-02-01
In this course, I try to provide a few basics required for performing data analysis in asteroseismology. First, I address how one can properly treat times series: the sampling, the filtering effect, the use of Fourier transform, the associated statistics. Second, I address how one can apply statistics for decision making and for parameter estimation either in a frequentist of a Bayesian framework. Last, I review how these basic principle have been applied (or not) in asteroseismology.
Frasso, Gianluca; Lambert, Philippe
2016-10-01
SummaryThe 2014 Ebola outbreak in Sierra Leone is analyzed using a susceptible-exposed-infectious-removed (SEIR) epidemic compartmental model. The discrete time-stochastic model for the epidemic evolution is coupled to a set of ordinary differential equations describing the dynamics of the expected proportions of subjects in each epidemic state. The unknown parameters are estimated in a Bayesian framework by combining data on the number of new (laboratory confirmed) Ebola cases reported by the Ministry of Health and prior distributions for the transition rates elicited using information collected by the WHO during the follow-up of specific Ebola cases. The time-varying disease transmission rate is modeled in a flexible way using penalized B-splines. Our framework represents a valuable stochastic tool for the study of an epidemic dynamic even when only irregularly observed and possibly aggregated data are available. Simulations and the analysis of the 2014 Sierra Leone Ebola data highlight the merits of the proposed methodology. In particular, the flexible modeling of the disease transmission rate makes the estimation of the effective reproduction number robust to the misspecification of the initial epidemic states and to underreporting of the infectious cases. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Automated Bayesian model development for frequency detection in biological time series.
Granqvist, Emma; Oldroyd, Giles E D; Morris, Richard J
2011-06-24
A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure.
Automated Bayesian model development for frequency detection in biological time series
2011-01-01
Background A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. Results In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Conclusions Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure. PMID:21702910
Bayesian estimates of the incidence of rare cancers in Europe.
Botta, Laura; Capocaccia, Riccardo; Trama, Annalisa; Herrmann, Christian; Salmerón, Diego; De Angelis, Roberta; Mallone, Sandra; Bidoli, Ettore; Marcos-Gragera, Rafael; Dudek-Godeau, Dorota; Gatta, Gemma; Cleries, Ramon
2018-04-21
The RARECAREnet project has updated the estimates of the burden of the 198 rare cancers in each European country. Suspecting that scant data could affect the reliability of statistical analysis, we employed a Bayesian approach to estimate the incidence of these cancers. We analyzed about 2,000,000 rare cancers diagnosed in 2000-2007 provided by 83 population-based cancer registries from 27 European countries. We considered European incidence rates (IRs), calculated over all the data available in RARECAREnet, as a valid a priori to merge with country-specific observed data. Therefore we provided (1) Bayesian estimates of IRs and the yearly numbers of cases of rare cancers in each country; (2) the expected time (T) in years needed to observe one new case; and (3) practical criteria to decide when to use the Bayesian approach. Bayesian and classical estimates did not differ much; substantial differences (>10%) ranged from 77 rare cancers in Iceland to 14 in England. The smaller the population the larger the number of rare cancers needing a Bayesian approach. Bayesian estimates were useful for cancers with fewer than 150 observed cases in a country during the study period; this occurred mostly when the population of the country is small. For the first time the Bayesian estimates of IRs and the yearly expected numbers of cases for each rare cancer in each individual European country were calculated. Moreover, the indicator T is useful to convey incidence estimates for exceptionally rare cancers and in small countries; it far exceeds the professional lifespan of a medical doctor. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Leube, Philipp; Geiges, Andreas; Nowak, Wolfgang
2010-05-01
Incorporating hydrogeological data, such as head and tracer data, into stochastic models of subsurface flow and transport helps to reduce prediction uncertainty. Considering limited financial resources available for the data acquisition campaign, information needs towards the prediction goal should be satisfied in a efficient and task-specific manner. For finding the best one among a set of design candidates, an objective function is commonly evaluated, which measures the expected impact of data on prediction confidence, prior to their collection. An appropriate approach to this task should be stochastically rigorous, master non-linear dependencies between data, parameters and model predictions, and allow for a wide variety of different data types. Existing methods fail to fulfill all these requirements simultaneously. For this reason, we introduce a new method, denoted as CLUE (Cross-bred Likelihood Uncertainty Estimator), that derives the essential distributions and measures of data utility within a generalized, flexible and accurate framework. The method makes use of Bayesian GLUE (Generalized Likelihood Uncertainty Estimator) and extends it to an optimal design method by marginalizing over the yet unknown data values. Operating in a purely Bayesian Monte-Carlo framework, CLUE is a strictly formal information processing scheme free of linearizations. It provides full flexibility associated with the type of measurements (linear, non-linear, direct, indirect) and accounts for almost arbitrary sources of uncertainty (e.g. heterogeneity, geostatistical assumptions, boundary conditions, model concepts) via stochastic simulation and Bayesian model averaging. This helps to minimize the strength and impact of possible subjective prior assumptions, that would be hard to defend prior to data collection. Our study focuses on evaluating two different uncertainty measures: (i) expected conditional variance and (ii) expected relative entropy of a given prediction goal. The applicability and advantages are shown in a synthetic example. Therefor, we consider a contaminant source, posing a threat on a drinking water well in an aquifer. Furthermore, we assume uncertainty in geostatistical parameters, boundary conditions and hydraulic gradient. The two mentioned measures evaluate the sensitivity of (1) general prediction confidence and (2) exceedance probability of a legal regulatory threshold value on sampling locations.
A general framework for updating belief distributions.
Bissiri, P G; Holmes, C C; Walker, S G
2016-11-01
We propose a framework for general Bayesian inference. We argue that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case. Modern application areas make it increasingly challenging for Bayesians to attempt to model the true data-generating mechanism. For instance, when the object of interest is low dimensional, such as a mean or median, it is cumbersome to have to achieve this via a complete model for the whole data distribution. More importantly, there are settings where the parameter of interest does not directly index a family of density functions and thus the Bayesian approach to learning about such parameters is currently regarded as problematic. Our framework uses loss functions to connect information in the data to functionals of interest. The updating of beliefs then follows from a decision theoretic approach involving cumulative loss functions. Importantly, the procedure coincides with Bayesian updating when a true likelihood is known yet provides coherent subjective inference in much more general settings. Connections to other inference frameworks are highlighted.
Uncovering state-dependent relationships in shallow lakes using Bayesian latent variable regression.
Vitense, Kelsey; Hanson, Mark A; Herwig, Brian R; Zimmer, Kyle D; Fieberg, John
2018-03-01
Ecosystems sometimes undergo dramatic shifts between contrasting regimes. Shallow lakes, for instance, can transition between two alternative stable states: a clear state dominated by submerged aquatic vegetation and a turbid state dominated by phytoplankton. Theoretical models suggest that critical nutrient thresholds differentiate three lake types: highly resilient clear lakes, lakes that may switch between clear and turbid states following perturbations, and highly resilient turbid lakes. For effective and efficient management of shallow lakes and other systems, managers need tools to identify critical thresholds and state-dependent relationships between driving variables and key system features. Using shallow lakes as a model system for which alternative stable states have been demonstrated, we developed an integrated framework using Bayesian latent variable regression (BLR) to classify lake states, identify critical total phosphorus (TP) thresholds, and estimate steady state relationships between TP and chlorophyll a (chl a) using cross-sectional data. We evaluated the method using data simulated from a stochastic differential equation model and compared its performance to k-means clustering with regression (KMR). We also applied the framework to data comprising 130 shallow lakes. For simulated data sets, BLR had high state classification rates (median/mean accuracy >97%) and accurately estimated TP thresholds and state-dependent TP-chl a relationships. Classification and estimation improved with increasing sample size and decreasing noise levels. Compared to KMR, BLR had higher classification rates and better approximated the TP-chl a steady state relationships and TP thresholds. We fit the BLR model to three different years of empirical shallow lake data, and managers can use the estimated bifurcation diagrams to prioritize lakes for management according to their proximity to thresholds and chance of successful rehabilitation. Our model improves upon previous methods for shallow lakes because it allows classification and regression to occur simultaneously and inform one another, directly estimates TP thresholds and the uncertainty associated with thresholds and state classifications, and enables meaningful constraints to be built into models. The BLR framework is broadly applicable to other ecosystems known to exhibit alternative stable states in which regression can be used to establish relationships between driving variables and state variables. © 2017 by the Ecological Society of America.
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
Jiménez, José; García, Emilio J; Llaneza, Luis; Palacios, Vicente; González, Luis Mariano; García-Domínguez, Francisco; Múñoz-Igualada, Jaime; López-Bao, José Vicente
2016-08-01
In many cases, the first step in large-carnivore management is to obtain objective, reliable, and cost-effective estimates of population parameters through procedures that are reproducible over time. However, monitoring predators over large areas is difficult, and the data have a high level of uncertainty. We devised a practical multimethod and multistate modeling approach based on Bayesian hierarchical-site-occupancy models that combined multiple survey methods to estimate different population states for use in monitoring large predators at a regional scale. We used wolves (Canis lupus) as our model species and generated reliable estimates of the number of sites with wolf reproduction (presence of pups). We used 2 wolf data sets from Spain (Western Galicia in 2013 and Asturias in 2004) to test the approach. Based on howling surveys, the naïve estimation (i.e., estimate based only on observations) of the number of sites with reproduction was 9 and 25 sites in Western Galicia and Asturias, respectively. Our model showed 33.4 (SD 9.6) and 34.4 (3.9) sites with wolf reproduction, respectively. The number of occupied sites with wolf reproduction was 0.67 (SD 0.19) and 0.76 (0.11), respectively. This approach can be used to design more cost-effective monitoring programs (i.e., to define the sampling effort needed per site). Our approach should inspire well-coordinated surveys across multiple administrative borders and populations and lead to improved decision making for management of large carnivores on a landscape level. The use of this Bayesian framework provides a simple way to visualize the degree of uncertainty around population-parameter estimates and thus provides managers and stakeholders an intuitive approach to interpreting monitoring results. Our approach can be widely applied to large spatial scales in wildlife monitoring where detection probabilities differ between population states and where several methods are being used to estimate different population parameters. © 2016 Society for Conservation Biology.
A new software for deformation source optimization, the Bayesian Earthquake Analysis Tool (BEAT)
NASA Astrophysics Data System (ADS)
Vasyura-Bathke, H.; Dutta, R.; Jonsson, S.; Mai, P. M.
2017-12-01
Modern studies of crustal deformation and the related source estimation, including magmatic and tectonic sources, increasingly use non-linear optimization strategies to estimate geometric and/or kinematic source parameters and often consider both jointly, geodetic and seismic data. Bayesian inference is increasingly being used for estimating posterior distributions of deformation source model parameters, given measured/estimated/assumed data and model uncertainties. For instance, some studies consider uncertainties of a layered medium and propagate these into source parameter uncertainties, while others use informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed to efficiently explore the high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational burden of these methods is high and estimation codes are rarely made available along with the published results. Even if the codes are accessible, it is usually challenging to assemble them into a single optimization framework as they are typically coded in different programing languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in deformation source estimations, we undertook the effort of developing BEAT, a python package that comprises all the above-mentioned features in one single programing environment. The package builds on the pyrocko seismological toolbox (www.pyrocko.org), and uses the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat), and we encourage and solicit contributions to the project. Here, we present our strategy for developing BEAT and show application examples; especially the effect of including the model prediction uncertainty of the velocity model in following source optimizations: full moment tensor, Mogi source, moderate strike-slip earth-quake.
A Bayesian model for estimating multi-state disease progression.
Shen, Shiwen; Han, Simon X; Petousis, Panayiotis; Weiss, Robert E; Meng, Frank; Bui, Alex A T; Hsu, William
2017-02-01
A growing number of individuals who are considered at high risk of cancer are now routinely undergoing population screening. However, noted harms such as radiation exposure, overdiagnosis, and overtreatment underscore the need for better temporal models that predict who should be screened and at what frequency. The mean sojourn time (MST), an average duration period when a tumor can be detected by imaging but with no observable clinical symptoms, is a critical variable for formulating screening policy. Estimation of MST has been long studied using continuous Markov model (CMM) with Maximum likelihood estimation (MLE). However, a lot of traditional methods assume no observation error of the imaging data, which is unlikely and can bias the estimation of the MST. In addition, the MLE may not be stably estimated when data is sparse. Addressing these shortcomings, we present a probabilistic modeling approach for periodic cancer screening data. We first model the cancer state transition using a three state CMM model, while simultaneously considering observation error. We then jointly estimate the MST and observation error within a Bayesian framework. We also consider the inclusion of covariates to estimate individualized rates of disease progression. Our approach is demonstrated on participants who underwent chest x-ray screening in the National Lung Screening Trial (NLST) and validated using posterior predictive p-values and Pearson's chi-square test. Our model demonstrates more accurate and sensible estimates of MST in comparison to MLE. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dediu, Dan
2011-02-07
Language is a hallmark of our species and understanding linguistic diversity is an area of major interest. Genetic factors influencing the cultural transmission of language provide a powerful and elegant explanation for aspects of the present day linguistic diversity and a window into the emergence and evolution of language. In particular, it has recently been proposed that linguistic tone-the usage of voice pitch to convey lexical and grammatical meaning-is biased by two genes involved in brain growth and development, ASPM and Microcephalin. This hypothesis predicts that tone is a stable characteristic of language because of its 'genetic anchoring'. The present paper tests this prediction using a Bayesian phylogenetic framework applied to a large set of linguistic features and language families, using multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices. The results of these different methods and datasets show a large agreement, suggesting that this approach produces reliable estimates of the stability of linguistic data. Moreover, linguistic tone is found to be stable across methods and datasets, providing suggestive support for the hypothesis of genetic influences on its distribution.
Combining information from multiple flood projections in a hierarchical Bayesian framework
NASA Astrophysics Data System (ADS)
Le Vine, Nataliya
2016-04-01
This study demonstrates, in the context of flood frequency analysis, the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach explicitly accommodates shared multimodel discrepancy as well as the probabilistic nature of the flood estimates, and treats the available models as a sample from a hypothetical complete (but unobserved) set of models. The methodology is applied to flood estimates from multiple hydrological projections (the Future Flows Hydrology data set) for 135 catchments in the UK. The advantages of the approach are shown to be: (1) to ensure adequate "baseline" with which to compare future changes; (2) to reduce flood estimate uncertainty; (3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; (4) to diminish the importance of model consistency when model biases are large; and (5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
NASA Astrophysics Data System (ADS)
Smith, T.; Marshall, L.
2007-12-01
In many mountainous regions, the single most important parameter in forecasting the controls on regional water resources is snowpack (Williams et al., 1999). In an effort to bridge the gap between theoretical understanding and functional modeling of snow-driven watersheds, a flexible hydrologic modeling framework is being developed. The aim is to create a suite of models that move from parsimonious structures, concentrated on aggregated watershed response, to those focused on representing finer scale processes and distributed response. This framework will operate as a tool to investigate the link between hydrologic model predictive performance, uncertainty, model complexity, and observable hydrologic processes. Bayesian methods, and particularly Markov chain Monte Carlo (MCMC) techniques, are extremely useful in uncertainty assessment and parameter estimation of hydrologic models. However, these methods have some difficulties in implementation. In a traditional Bayesian setting, it can be difficult to reconcile multiple data types, particularly those offering different spatial and temporal coverage, depending on the model type. These difficulties are also exacerbated by sensitivity of MCMC algorithms to model initialization and complex parameter interdependencies. As a way of circumnavigating some of the computational complications, adaptive MCMC algorithms have been developed to take advantage of the information gained from each successive iteration. Two adaptive algorithms are compared is this study, the Adaptive Metropolis (AM) algorithm, developed by Haario et al (2001), and the Delayed Rejection Adaptive Metropolis (DRAM) algorithm, developed by Haario et al (2006). While neither algorithm is truly Markovian, it has been proven that each satisfies the desired ergodicity and stationarity properties of Markov chains. Both algorithms were implemented as the uncertainty and parameter estimation framework for a conceptual rainfall-runoff model based on the Probability Distributed Model (PDM), developed by Moore (1985). We implement the modeling framework in Stringer Creek watershed in the Tenderfoot Creek Experimental Forest (TCEF), Montana. The snowmelt-driven watershed offers that additional challenge of modeling snow accumulation and melt and current efforts are aimed at developing a temperature- and radiation-index snowmelt model. Auxiliary data available from within TCEF's watersheds are used to support in the understanding of information value as it relates to predictive performance. Because the model is based on lumped parameters, auxiliary data are hard to incorporate directly. However, these additional data offer benefits through the ability to inform prior distributions of the lumped, model parameters. By incorporating data offering different information into the uncertainty assessment process, a cross-validation technique is engaged to better ensure that modeled results reflect real process complexity.
Estimating the Mass of the Milky Way Using the Ensemble of Classical Satellite Galaxies
NASA Astrophysics Data System (ADS)
Patel, Ekta; Besla, Gurtina; Sohn, Sangmo Tony; Mandel, Kaisey
2018-06-01
High precision proper motions are currently available for approximately 20% of the Milky Way's known satellite galaxies. Often, the 6D phase space information of each satellite is used separately to constrain the mass of the MW. In this talk, I will discuss the Bayesian framework outlined in Patel et al. 2017b to make inferences of the MW's mass using satellite properties such as specific orbital angular momentum, rather than just position and velocity. By extending this framework from one satellite to a population of satellites, we can now form simultaneous MW mass estimates using the Illustris-Dark cosmological simulation that are unbiased by high speed satellites such as Leo I (Patel et al., submitted). Our resulting MW mass estimates reduce the current factor of two uncertainty in the mass range of the MW and show promising signs for improvement as upcoming ground- and space-based observatories obtain proper motions for additional MW satellite galaxies.
Molitor, John
2012-03-01
Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
NASA Astrophysics Data System (ADS)
Plessis, S.; McDougall, D.; Mandt, K.; Greathouse, T.; Luspay-Kuti, A.
2015-11-01
Bimolecular diffusion coefficients are important parameters used by atmospheric models to calculate altitude profiles of minor constituents in an atmosphere. Unfortunately, laboratory measurements of these coefficients were never conducted at temperature conditions relevant to the atmosphere of Titan. Here we conduct a detailed uncertainty analysis of the bimolecular diffusion coefficient parameters as applied to Titan's upper atmosphere to provide a better understanding of the impact of uncertainty for this parameter on models. Because temperature and pressure conditions are much lower than the laboratory conditions in which bimolecular diffusion parameters were measured, we apply a Bayesian framework, a problem-agnostic framework, to determine parameter estimates and associated uncertainties. We solve the Bayesian calibration problem using the open-source QUESO library which also performs a propagation of uncertainties in the calibrated parameters to temperature and pressure conditions observed in Titan's upper atmosphere. Our results show that, after propagating uncertainty through the Massman model, the uncertainty in molecular diffusion is highly correlated to temperature and we observe no noticeable correlation with pressure. We propagate the calibrated molecular diffusion estimate and associated uncertainty to obtain an estimate with uncertainty due to bimolecular diffusion for the methane molar fraction as a function of altitude. Results show that the uncertainty in methane abundance due to molecular diffusion is in general small compared to eddy diffusion and the chemical kinetics description. However, methane abundance is most sensitive to uncertainty in molecular diffusion above 1200 km where the errors are nontrivial and could have important implications for scientific research based on diffusion models in this altitude range.
Non-ignorable missingness in logistic regression.
Wang, Joanna J J; Bartlett, Mark; Ryan, Louise
2017-08-30
Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Jat, Prahlad; Serre, Marc L
2016-12-01
Widespread contamination of surface water chloride is an emerging environmental concern. Consequently accurate and cost-effective methods are needed to estimate chloride along all river miles of potentially contaminated watersheds. Here we introduce a Bayesian Maximum Entropy (BME) space/time geostatistical estimation framework that uses river distances, and we compare it with Euclidean BME to estimate surface water chloride from 2005 to 2014 in the Gunpowder-Patapsco, Severn, and Patuxent subbasins in Maryland. River BME improves the cross-validation R 2 by 23.67% over Euclidean BME, and river BME maps are significantly different than Euclidean BME maps, indicating that it is important to use river BME maps to assess water quality impairment. The river BME maps of chloride concentration show wide contamination throughout Baltimore and Columbia-Ellicott cities, the disappearance of a clean buffer separating these two large urban areas, and the emergence of multiple localized pockets of contamination in surrounding areas. The number of impaired river miles increased by 0.55% per year in 2005-2009 and by 1.23% per year in 2011-2014, corresponding to a marked acceleration of the rate of impairment. Our results support the need for control measures and increased monitoring of unassessed river miles. Copyright © 2016. Published by Elsevier Ltd.
Gardner, Beth; Reppucci, Juan; Lucherini, Mauro; Royle, J. Andrew
2010-01-01
We develop a hierarchical capture–recapture model for demographically open populations when auxiliary spatial information about location of capture is obtained. Such spatial capture–recapture data arise from studies based on camera trapping, DNA sampling, and other situations in which a spatial array of devices records encounters of unique individuals. We integrate an individual-based formulation of a Jolly-Seber type model with recently developed spatially explicit capture–recapture models to estimate density and demographic parameters for survival and recruitment. We adopt a Bayesian framework for inference under this model using the method of data augmentation which is implemented in the software program WinBUGS. The model was motivated by a camera trapping study of Pampas cats Leopardus colocolo from Argentina, which we present as an illustration of the model in this paper. We provide estimates of density and the first quantitative assessment of vital rates for the Pampas cat in the High Andes. The precision of these estimates is poor due likely to the sparse data set. Unlike conventional inference methods which usually rely on asymptotic arguments, Bayesian inferences are valid in arbitrary sample sizes, and thus the method is ideal for the study of rare or endangered species for which small data sets are typical.
Gardner, Beth; Reppucci, Juan; Lucherini, Mauro; Royle, J Andrew
2010-11-01
We develop a hierarchical capture-recapture model for demographically open populations when auxiliary spatial information about location of capture is obtained. Such spatial capture-recapture data arise from studies based on camera trapping, DNA sampling, and other situations in which a spatial array of devices records encounters of unique individuals. We integrate an individual-based formulation of a Jolly-Seber type model with recently developed spatially explicit capture-recapture models to estimate density and demographic parameters for survival and recruitment. We adopt a Bayesian framework for inference under this model using the method of data augmentation which is implemented in the software program WinBUGS. The model was motivated by a camera trapping study of Pampas cats Leopardus colocolo from Argentina, which we present as an illustration of the model in this paper. We provide estimates of density and the first quantitative assessment of vital rates for the Pampas cat in the High Andes. The precision of these estimates is poor due likely to the sparse data set. Unlike conventional inference methods which usually rely on asymptotic arguments, Bayesian inferences are valid in arbitrary sample sizes, and thus the method is ideal for the study of rare or endangered species for which small data sets are typical.
2014-01-01
Background Transmission models can aid understanding of disease dynamics and are useful in testing the efficiency of control measures. The aim of this study was to formulate an appropriate stochastic Susceptible-Infectious-Resistant/Carrier (SIR) model for Salmonella Typhimurium in pigs and thus estimate the transmission parameters between states. Results The transmission parameters were estimated using data from a longitudinal study of three Danish farrow-to-finish pig herds known to be infected. A Bayesian model framework was proposed, which comprised Binomial components for the transition from susceptible to infectious and from infectious to carrier; and a Poisson component for carrier to infectious. Cohort random effects were incorporated into these models to allow for unobserved cohort-specific variables as well as unobserved sources of transmission, thus enabling a more realistic estimation of the transmission parameters. In the case of the transition from susceptible to infectious, the cohort random effects were also time varying. The number of infectious pigs not detected by the parallel testing was treated as unknown, and the probability of non-detection was estimated using information about the sensitivity and specificity of the bacteriological and serological tests. The estimate of the transmission rate from susceptible to infectious was 0.33 [0.06, 1.52], from infectious to carrier was 0.18 [0.14, 0.23] and from carrier to infectious was 0.01 [0.0001, 0.04]. The estimate for the basic reproduction ration (R 0 ) was 1.91 [0.78, 5.24]. The probability of non-detection was estimated to be 0.18 [0.12, 0.25]. Conclusions The proposed framework for stochastic SIR models was successfully implemented to estimate transmission rate parameters for Salmonella Typhimurium in swine field data. R 0 was 1.91, implying that there was dissemination of the infection within pigs of the same cohort. There was significant temporal-cohort variability, especially at the susceptible to infectious stage. The model adequately fitted the data, allowing for both observed and unobserved sources of uncertainty (cohort effects, diagnostic test sensitivity), so leading to more reliable estimates of transmission parameters. PMID:24774444
Bayesian Sensitivity Analysis of Statistical Models with Missing Data
ZHU, HONGTU; IBRAHIM, JOSEPH G.; TANG, NIANSHENG
2013-01-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures. PMID:24753718
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference to Receiver Operating Characteristics (ROC) data. The model is a simple version of general non-linear regression model. Different from Dorfman model, it uses a probit link function with a covariate variable having zero-one two values to express binormal distributions in a single formula. Model also includes a scale parameter. Bayesian inference is implemented by Markov Chain Monte Carlo (MCMC) method carried out by Bayesian analysis Using Gibbs Sampling (BUGS). Contrast to the classical statistical theory, Bayesian approach considers model parameters as random variables characterized by prior distributions. With substantial amount of simulated samples generated by sampling algorithm, posterior distributions of parameters as well as parameters themselves can be accurately estimated. MCMC-based BUGS adopts Adaptive Rejection Sampling (ARS) protocol which requires the probability density function (pdf) which samples are drawing from be log concave with respect to the targeted parameters. Our study corrects a common misconception and proves that pdf of this regression model is log concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied and a Gaussian prior which is conjugate and possesses many analytic and computational advantages is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for MCMC method are assessed by CODA package. Models and methods by using continuous Gaussian prior and discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using MCMC method.
A Bayesian Assessment of Seismic Semi-Periodicity Forecasts
NASA Astrophysics Data System (ADS)
Nava, F.; Quinteros, C.; Glowacka, E.; Frez, J.
2016-01-01
Among the schemes for earthquake forecasting, the search for semi-periodicity during large earthquakes in a given seismogenic region plays an important role. When considering earthquake forecasts based on semi-periodic sequence identification, the Bayesian formalism is a useful tool for: (1) assessing how well a given earthquake satisfies a previously made forecast; (2) re-evaluating the semi-periodic sequence probability; and (3) testing other prior estimations of the sequence probability. A comparison of Bayesian estimates with updated estimates of semi-periodic sequences that incorporate new data not used in the original estimates shows extremely good agreement, indicating that: (1) the probability that a semi-periodic sequence is not due to chance is an appropriate estimate for the prior sequence probability estimate; and (2) the Bayesian formalism does a very good job of estimating corrected semi-periodicity probabilities, using slightly less data than that used for updated estimates. The Bayesian approach is exemplified explicitly by its application to the Parkfield semi-periodic forecast, and results are given for its application to other forecasts in Japan and Venezuela.
Textual and visual content-based anti-phishing: a Bayesian approach.
Zhang, Haijun; Liu, Gang; Chow, Tommy W S; Liu, Wenyin
2011-10-01
A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold. This is required in the classifier for determining the class of the web page and identifying whether the web page is phishing or not. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, the Bayes theory is used to synthesize the classification results from textual and visual content. The effectiveness of our proposed approach was examined in a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, the fusion algorithm outperforms either of the individual classifiers, and our model can be adapted to different phishing cases. © 2011 IEEE
NASA Astrophysics Data System (ADS)
Schurer, Andrew; Hegerl, Gabriele
2016-04-01
The evaluation of the transient climate response (TCR) is of critical importance to policy makers as it can be used to calculate a simple estimate of the expected warming given predicted greenhouse gas emissions. Previous studies using optimal detection techniques have been able to estimate a TCR value from the historic record using simulations from some of the models which took part in the Coupled Model Intercomparison Project Phase 5 (CMIP5) but have found that others give unconstrained results. At least partly this is due to degeneracy between the greenhouse gas and aerosol signals which makes separation of the temperature response to these forcings problematic. Here we re-visit this important topic by using an adapted optimal detection analysis within a Bayesian framework. We account for observational uncertainty by the use of an ensemble of instrumental observations, and model uncertainty by combining the results from several different models. This framework allows the use of prior information which is found to help separate the response to the different forcings leading to a more constrained estimate of TCR.
NASA Astrophysics Data System (ADS)
Lu, Dan; Ricciuto, Daniel; Walker, Anthony; Safta, Cosmin; Munger, William
2017-09-01
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. The result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Bayesian inference for the spatio-temporal invasion of alien species.
Cook, Alex; Marion, Glenn; Butler, Adam; Gibson, Gavin
2007-08-01
In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.
NASA Astrophysics Data System (ADS)
Costa, Veber; Fernandes, Wilson
2017-11-01
Extreme flood estimation has been a key research topic in hydrological sciences. Reliable estimates of such events are necessary as structures for flood conveyance are continuously evolving in size and complexity and, as a result, their failure-associated hazards become more and more pronounced. Due to this fact, several estimation techniques intended to improve flood frequency analysis and reducing uncertainty in extreme quantile estimation have been addressed in the literature in the last decades. In this paper, we develop a Bayesian framework for the indirect estimation of extreme flood quantiles from rainfall-runoff models. In the proposed approach, an ensemble of long daily rainfall series is simulated with a stochastic generator, which models extreme rainfall amounts with an upper-bounded distribution function, namely, the 4-parameter lognormal model. The rationale behind the generation model is that physical limits for rainfall amounts, and consequently for floods, exist and, by imposing an appropriate upper bound for the probabilistic model, more plausible estimates can be obtained for those rainfall quantiles with very low exceedance probabilities. Daily rainfall time series are converted into streamflows by routing each realization of the synthetic ensemble through a conceptual hydrologic model, the Rio Grande rainfall-runoff model. Calibration of parameters is performed through a nonlinear regression model, by means of the specification of a statistical model for the residuals that is able to accommodate autocorrelation, heteroscedasticity and nonnormality. By combining the outlined steps in a Bayesian structure of analysis, one is able to properly summarize the resulting uncertainty and estimating more accurate credible intervals for a set of flood quantiles of interest. The method for extreme flood indirect estimation was applied to the American river catchment, at the Folsom dam, in the state of California, USA. Results show that most floods, including exceptionally large non-systematic events, were reasonably estimated with the proposed approach. In addition, by accounting for uncertainties in each modeling step, one is able to obtain a better understanding of the influential factors in large flood formation dynamics.
Cyber-T web server: differential analysis of high-throughput data.
Kayala, Matthew A; Baldi, Pierre
2012-07-01
The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001: 17: 509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing data normalization options including logarithmic and (Variance Stabilizing Normalization) VSN transforms are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple tests correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
Handel, Ian G.; Tanya, Vincent N.; Hamman, Saidou M.; Nfon, Charles; Bergman, Ingrid E.; Malirat, Viviana; Sorensen, Karl J.; Bronsvoort, Barend M. de C.
2014-01-01
Herdsman-reported disease prevalence is widely used in veterinary epidemiologic studies, especially for diseases with visible external lesions; however, the accuracy of such reports is rarely validated. Thus, we used latent class analysis in a Bayesian framework to compare sensitivity and specificity of herdsman reporting with virus neutralization testing and use of 3 nonstructural protein ELISAs for estimates of foot-and-mouth disease (FMD) prevalence on the Adamawa plateau of Cameroon in 2000. Herdsman-reported estimates in this FMD-endemic area were comparable to those obtained from serologic testing. To harness to this cost-effective resource of monitoring emerging infectious diseases, we suggest that estimates of the sensitivity and specificity of herdsmen reporting should be done in parallel with serologic surveys of other animal diseases. PMID:25417556
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Estimating Tree Height-Diameter Models with the Bayesian Method
Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the “best” model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2. PMID:24711733
Estimating tree height-diameter models with the Bayesian method.
Zhang, Xiongqing; Duan, Aiguo; Zhang, Jianguo; Xiang, Congwei
2014-01-01
Six candidate height-diameter models were used to analyze the height-diameter relationships. The common methods for estimating the height-diameter models have taken the classical (frequentist) approach based on the frequency interpretation of probability, for example, the nonlinear least squares method (NLS) and the maximum likelihood method (ML). The Bayesian method has an exclusive advantage compared with classical method that the parameters to be estimated are regarded as random variables. In this study, the classical and Bayesian methods were used to estimate six height-diameter models, respectively. Both the classical method and Bayesian method showed that the Weibull model was the "best" model using data1. In addition, based on the Weibull model, data2 was used for comparing Bayesian method with informative priors with uninformative priors and classical method. The results showed that the improvement in prediction accuracy with Bayesian method led to narrower confidence bands of predicted value in comparison to that for the classical method, and the credible bands of parameters with informative priors were also narrower than uninformative priors and classical method. The estimated posterior distributions for parameters can be set as new priors in estimating the parameters using data2.
Yu, Hwa-Lung; Wang, Chih-Hsin
2013-02-05
Understanding the daily changes in ambient air quality concentrations is important to the assessing human exposure and environmental health. However, the fine temporal scales (e.g., hourly) involved in this assessment often lead to high variability in air quality concentrations. This is because of the complex short-term physical and chemical mechanisms among the pollutants. Consequently, high heterogeneity is usually present in not only the averaged pollution levels, but also the intraday variance levels of the daily observations of ambient concentration across space and time. This characteristic decreases the estimation performance of common techniques. This study proposes a novel quantile-based Bayesian maximum entropy (QBME) method to account for the nonstationary and nonhomogeneous characteristics of ambient air pollution dynamics. The QBME method characterizes the spatiotemporal dependence among the ambient air quality levels based on their location-specific quantiles and accounts for spatiotemporal variations using a local weighted smoothing technique. The epistemic framework of the QBME method can allow researchers to further consider the uncertainty of space-time observations. This study presents the spatiotemporal modeling of daily CO and PM10 concentrations across Taiwan from 1998 to 2009 using the QBME method. Results show that the QBME method can effectively improve estimation accuracy in terms of lower mean absolute errors and standard deviations over space and time, especially for pollutants with strong nonhomogeneous variances across space. In addition, the epistemic framework can allow researchers to assimilate the site-specific secondary information where the observations are absent because of the common preferential sampling issues of environmental data. The proposed QBME method provides a practical and powerful framework for the spatiotemporal modeling of ambient pollutants.
NASA Astrophysics Data System (ADS)
Thelen, Brian T.; Xique, Ismael J.; Burns, Joseph W.; Goley, G. Steven; Nolan, Adam R.; Benson, Jonathan W.
2017-04-01
With all of the new remote sensing modalities available, and with ever increasing capabilities and frequency of collection, there is a desire to fundamentally understand/quantify the information content in the collected image data relative to various exploitation goals, such as detection/classification. A fundamental approach for this is the framework of Bayesian decision theory, but a daunting challenge is to have significantly flexible and accurate multivariate models for the features and/or pixels that capture a wide assortment of distributions and dependen- cies. In addition, data can come in the form of both continuous and discrete representations, where the latter is often generated based on considerations of robustness to imaging conditions and occlusions/degradations. In this paper we propose a novel suite of "latent" models fundamentally based on multivariate Gaussian copula models that can be used for quantized data from SAR imagery. For this Latent Gaussian Copula (LGC) model, we derive an approximate, maximum-likelihood estimation algorithm and demonstrate very reasonable estimation performance even for the larger images with many pixels. However applying these LGC models to large dimen- sions/images within a Bayesian decision/classification theory is infeasible due to the computational/numerical issues in evaluating the true full likelihood, and we propose an alternative class of novel pseudo-likelihoood detection statistics that are computationally feasible. We show in a few simple examples that these statistics have the potential to provide very good and robust detection/classification performance. All of this framework is demonstrated on a simulated SLICY data set, and the results show the importance of modeling the dependencies, and of utilizing the pseudo-likelihood methods.
Marginally specified priors for non-parametric Bayesian estimation
Kessler, David C.; Hoff, Peter D.; Dunson, David B.
2014-01-01
Summary Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
NASA Astrophysics Data System (ADS)
Xu, Yingru; Bernhard, Jonah E.; Bass, Steffen A.; Nahrgang, Marlene; Cao, Shanshan
2018-01-01
By applying a Bayesian model-to-data analysis, we estimate the temperature and momentum dependence of the heavy quark diffusion coefficient in an improved Langevin framework. The posterior range of the diffusion coefficient is obtained by performing a Markov chain Monte Carlo random walk and calibrating on the experimental data of D -meson RAA and v2 in three different collision systems at the Relativistic Heavy-Ion Collidaer (RHIC) and the Large Hadron Collider (LHC): Au-Au collisions at 200 GeV and Pb-Pb collisions at 2.76 and 5.02 TeV. The spatial diffusion coefficient is found to be consistent with lattice QCD calculations and comparable with other models' estimation. We demonstrate the capability of our improved Langevin model to simultaneously describe the RAA and v2 at both RHIC and the LHC energies, as well as the higher order flow coefficient such as D meson v3. We show that by applying a Bayesian analysis, we are able to quantitatively and systematically study the heavy flavor dynamics in heavy-ion collisions.
Li, Qian; Trivedi, Pravin K
2016-02-01
This paper develops an extended specification of the two-part model, which controls for unobservable self-selection and heterogeneity of health insurance, and analyzes the impact of Medicare supplemental plans on the prescription drug expenditure of the elderly, using a linked data set based on the Medicare Current Beneficiary Survey data for 2003-2004. The econometric analysis is conducted using a Bayesian econometric framework. We estimate the treatment effects for different counterfactuals and find significant evidence of endogeneity in plan choice and the presence of both adverse and advantageous selections in the supplemental insurance market. The average incentive effect is estimated to be $757 (2004 value) or 41% increase per person per year for the elderly enrolled in supplemental plans with drug coverage against the Medicare fee-for-service counterfactual and is $350 or 21% against the supplemental plans without drug coverage counterfactual. The incentive effect varies by different sources of drug coverage: highest for employer-sponsored insurance plans, followed by Medigap and managed medicare plans. Copyright © 2014 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Hopcroft, Peter O.; Valdes, Paul J.; Kaplan, Jed O.
2018-04-01
The observed rise in atmospheric methane (CH4) from 375 ppbv during the Last Glacial Maximum (LGM: 21,000 years ago) to 680 ppbv during the late preindustrial era is not well understood. Atmospheric chemistry considerations implicate an increase in CH4 sources, but process-based estimates fail to reproduce the required amplitude. CH4 stable isotopes provide complementary information that can help constrain the underlying causes of the increase. We combine Earth System model simulations of the late preindustrial and LGM CH4 cycles, including process-based estimates of the isotopic discrimination of vegetation, in a box model of atmospheric CH4 and its isotopes. Using a Bayesian approach, we show how model-based constraints and ice core observations may be combined in a consistent probabilistic framework. The resultant posterior distributions point to a strong reduction in wetland and other biogenic CH4 emissions during the LGM, with a modest increase in the geological source, or potentially natural or anthropogenic fires, accounting for the observed enrichment of δ13CH4.
A Bayesian inversion for slip distribution of 1 Apr 2007 Mw8.1 Solomon Islands Earthquake
NASA Astrophysics Data System (ADS)
Chen, T.; Luo, H.
2013-12-01
On 1 Apr 2007 the megathrust Mw8.1 Solomon Islands earthquake occurred in the southeast pacific along the New Britain subduction zone. 102 vertical displacement measurements over the southeastern end of the rupture zone from two field surveys after this event provide a unique constraint for slip distribution inversion. In conventional inversion method (such as bounded variable least squares) the smoothing parameter that determines the relative weight placed on fitting the data versus smoothing the slip distribution is often subjectively selected at the bend of the trade-off curve. Here a fully probabilistic inversion method[Fukuda,2008] is applied to estimate distributed slip and smoothing parameter objectively. The joint posterior probability density function of distributed slip and the smoothing parameter is formulated under a Bayesian framework and sampled with Markov chain Monte Carlo method. We estimate the spatial distribution of dip slip associated with the 1 Apr 2007 Solomon Islands earthquake with this method. Early results show a shallower dip angle than previous study and highly variable dip slip both along-strike and down-dip.
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
A Bayesian approach to estimate evoked potentials.
Sparacino, Giovanni; Milani, Stefano; Arslan, Edoardo; Cobelli, Claudio
2002-06-01
Several approaches, based on different assumptions and with various degree of theoretical sophistication and implementation complexity, have been developed for improving the measurement of evoked potentials (EP) performed by conventional averaging (CA). In many of these methods, one of the major challenges is the exploitation of a priori knowledge. In this paper, we present a new method where the 2nd-order statistical information on the background EEG and on the unknown EP, necessary for the optimal filtering of each sweep in a Bayesian estimation framework, is, respectively, estimated from pre-stimulus data and obtained through a multiple integration of a white noise process model. The latter model is flexible (i.e. it can be employed for a large class of EP) and simple enough to be easily identifiable from the post-stimulus data thanks to a smoothing criterion. The mean EP is determined as the weighted average of the filtered sweeps, where each weight is inversely proportional to the expected value of the norm of the correspondent filter error, a quantity determinable thanks to the employment of the Bayesian approach. The performance of the new approach is shown on both simulated and real auditory EP. A signal-to-noise ratio enhancement is obtained that can allow the (possibly automatic) identification of peak latencies and amplitudes with less sweeps than those required by CA. For cochlear EP, the method also allows the audiology investigator to gather new and clinically important information. The possibility of handling single-sweep analysis with further development of the method is also addressed.
Bayesian inference of Calibration curves: application to archaeomagnetism
NASA Astrophysics Data System (ADS)
Lanos, P.
2003-04-01
The range of errors that occur at different stages of the archaeomagnetic calibration process are modelled using a Bayesian hierarchical model. The archaeomagnetic data obtained from archaeological structures such as hearths, kilns or sets of bricks and tiles, exhibit considerable experimental errors and are typically more or less well dated by archaeological context, history or chronometric methods (14C, TL, dendrochronology, etc.). They can also be associated with stratigraphic observations which provide prior relative chronological information. The modelling we describe in this paper allows all these observations, on materials from a given period, to be linked together, and the use of penalized maximum likelihood for smoothing univariate, spherical or three-dimensional time series data allows representation of the secular variation of the geomagnetic field over time. The smooth curve we obtain (which takes the form of a penalized natural cubic spline) provides an adaptation to the effects of variability in the density of reference points over time. Since our model takes account of all the known errors in the archaeomagnetic calibration process, we are able to obtain a functional highest-posterior-density envelope on the new curve. With this new posterior estimate of the curve available to us, the Bayesian statistical framework then allows us to estimate the calendar dates of undated archaeological features (such as kilns) based on one, two or three geomagnetic parameters (inclination, declination and/or intensity). Date estimates are presented in much the same way as those that arise from radiocarbon dating. In order to illustrate the model and inference methods used, we will present results based on German archaeomagnetic data recently published by a German team.
Trans-Dimensional Bayesian Imaging of 3-D Crustal and Upper Mantle Structure in Northeast Asia
NASA Astrophysics Data System (ADS)
Kim, S.; Tkalcic, H.; Rhie, J.; Chen, Y.
2016-12-01
Imaging 3-D structures using stepwise inversions of ambient noise and receiver function data is now a routine work. Here, we carry out the inversion in the trans-dimensional and hierarchical extension of the Bayesian framework to obtain rigorous estimates of uncertainty and high-resolution images of crustal and upper mantle structures beneath Northeast (NE) Asia. The methods inherently account for data sensitivities by means of using adaptive parameterizations and treating data noise as free parameters. Therefore, parsimonious results from the methods are balanced out between model complexity and data fitting. This allows fully exploiting data information, preventing from over- or under-estimation of the data fit, and increases model resolution. In addition, the reliability of results is more rigorously checked through the use of Bayesian uncertainties. It is shown by various synthetic recovery tests that complex and spatially variable features are well resolved in our resulting images of NE Asia. Rayleigh wave phase and group velocity tomograms (8-70 s), a 3-D shear-wave velocity model from depth inversions of the estimated dispersion maps, and regional 3-D models (NE China, the Korean Peninsula, and the Japanese islands) from joint inversions with receiver function data of dense networks are presented. High-resolution models are characterized by a number of tectonically meaningful features. We focus our interpretation on complex patterns of sub-lithospheric low velocity structures that extend from back-arc regions to continental margins. We interpret the anomalies in conjunction with distal and distributed intraplate volcanoes in NE Asia. Further discussion on other imaged features will be presented.
Accurate Biomass Estimation via Bayesian Adaptive Sampling
NASA Technical Reports Server (NTRS)
Wheeler, Kevin R.; Knuth, Kevin H.; Castle, Joseph P.; Lvov, Nikolay
2005-01-01
The following concepts were introduced: a) Bayesian adaptive sampling for solving biomass estimation; b) Characterization of MISR Rahman model parameters conditioned upon MODIS landcover. c) Rigorous non-parametric Bayesian approach to analytic mixture model determination. d) Unique U.S. asset for science product validation and verification.
A novel approach for estimating ingested dose associated with paracetamol overdose
Zurlinden, Todd J.; Heard, Kennon
2015-01-01
Aim In cases of paracetamol (acetaminophen, APAP) overdose, an accurate estimate of tissue‐specific paracetamol pharmacokinetics (PK) and ingested dose can offer health care providers important information for the individualized treatment and follow‐up of affected patients. Here a novel methodology is presented to make such estimates using a standard serum paracetamol measurement and a computational framework. Methods The core component of the computational framework was a physiologically‐based pharmacokinetic (PBPK) model developed and evaluated using an extensive set of human PK data. Bayesian inference was used for parameter and dose estimation, allowing the incorporation of inter‐study variability, and facilitating the calculation of uncertainty in model outputs. Results Simulations of paracetamol time course concentrations in the blood were in close agreement with experimental data under a wide range of dosing conditions. Also, predictions of administered dose showed good agreement with a large collection of clinical and emergency setting PK data over a broad dose range. In addition to dose estimation, the platform was applied for the determination of optimal blood sampling times for dose reconstruction and quantitation of the potential role of paracetamol conjugate measurement on dose estimation. Conclusions Current therapies for paracetamol overdose rely on a generic methodology involving the use of a clinical nomogram. By using the computational framework developed in this study, serum sample data, and the individual patient's anthropometric and physiological information, personalized serum and liver pharmacokinetic profiles and dose estimate could be generated to help inform an individualized overdose treatment and follow‐up plan. PMID:26441245
A novel approach for estimating ingested dose associated with paracetamol overdose.
Zurlinden, Todd J; Heard, Kennon; Reisfeld, Brad
2016-04-01
In cases of paracetamol (acetaminophen, APAP) overdose, an accurate estimate of tissue-specific paracetamol pharmacokinetics (PK) and ingested dose can offer health care providers important information for the individualized treatment and follow-up of affected patients. Here a novel methodology is presented to make such estimates using a standard serum paracetamol measurement and a computational framework. The core component of the computational framework was a physiologically-based pharmacokinetic (PBPK) model developed and evaluated using an extensive set of human PK data. Bayesian inference was used for parameter and dose estimation, allowing the incorporation of inter-study variability, and facilitating the calculation of uncertainty in model outputs. Simulations of paracetamol time course concentrations in the blood were in close agreement with experimental data under a wide range of dosing conditions. Also, predictions of administered dose showed good agreement with a large collection of clinical and emergency setting PK data over a broad dose range. In addition to dose estimation, the platform was applied for the determination of optimal blood sampling times for dose reconstruction and quantitation of the potential role of paracetamol conjugate measurement on dose estimation. Current therapies for paracetamol overdose rely on a generic methodology involving the use of a clinical nomogram. By using the computational framework developed in this study, serum sample data, and the individual patient's anthropometric and physiological information, personalized serum and liver pharmacokinetic profiles and dose estimate could be generated to help inform an individualized overdose treatment and follow-up plan. © 2015 The British Pharmacological Society.
ERIC Educational Resources Information Center
Wu, Haiyan
2013-01-01
General diagnostic models (GDMs) and Bayesian networks are mathematical frameworks that cover a wide variety of psychometric models. Both extend latent class models, and while GDMs also extend item response theory (IRT) models, Bayesian networks can be parameterized using discretized IRT. The purpose of this study is to examine similarities and…
Kimberley K. Ayre; Wayne G. Landis
2012-01-01
We present a Bayesian network model based on the ecological risk assessment framework to evaluate potential impacts to habitats and resources resulting from wildfire, grazing, forest management activities, and insect outbreaks in a forested landscape in northeastern Oregon. The Bayesian network structure consisted of three tiers of nodes: landscape disturbances,...
Robust, Adaptive Functional Regression in Functional Mixed Model Framework.
Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S
2011-09-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets.
Robust, Adaptive Functional Regression in Functional Mixed Model Framework
Zhu, Hongxiao; Brown, Philip J.; Morris, Jeffrey S.
2012-01-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets. PMID:22308015
A Bayesian framework for knowledge attribution: evidence from semantic integration.
Powell, Derek; Horne, Zachary; Pinillos, N Ángel; Holyoak, Keith J
2015-06-01
We propose a Bayesian framework for the attribution of knowledge, and apply this framework to generate novel predictions about knowledge attribution for different types of "Gettier cases", in which an agent is led to a justified true belief yet has made erroneous assumptions. We tested these predictions using a paradigm based on semantic integration. We coded the frequencies with which participants falsely recalled the word "thought" as "knew" (or a near synonym), yielding an implicit measure of conceptual activation. Our experiments confirmed the predictions of our Bayesian account of knowledge attribution across three experiments. We found that Gettier cases due to counterfeit objects were not treated as knowledge (Experiment 1), but those due to intentionally-replaced evidence were (Experiment 2). Our findings are not well explained by an alternative account focused only on luck, because accidentally-replaced evidence activated the knowledge concept more strongly than did similar false belief cases (Experiment 3). We observed a consistent pattern of results across a number of different vignettes that varied the quality and type of evidence available to agents, the relative stakes involved, and surface details of content. Accordingly, the present findings establish basic phenomena surrounding people's knowledge attributions in Gettier cases, and provide explanations of these phenomena within a Bayesian framework. Copyright © 2015 Elsevier B.V. All rights reserved.
Staatz, Christine E; Tett, Susan E
2011-12-01
This review seeks to summarize the available data about Bayesian estimation of area under the plasma concentration-time curve (AUC) and dosage prediction for mycophenolic acid (MPA) and evaluate whether sufficient evidence is available for routine use of Bayesian dosage prediction in clinical practice. A literature search identified 14 studies that assessed the predictive performance of maximum a posteriori Bayesian estimation of MPA AUC and one report that retrospectively evaluated how closely dosage recommendations based on Bayesian forecasting achieved targeted MPA exposure. Studies to date have mostly been undertaken in renal transplant recipients, with limited investigation in patients treated with MPA for autoimmune disease or haematopoietic stem cell transplantation. All of these studies have involved use of the mycophenolate mofetil (MMF) formulation of MPA, rather than the enteric-coated mycophenolate sodium (EC-MPS) formulation. Bias associated with estimation of MPA AUC using Bayesian forecasting was generally less than 10%. However some difficulties with imprecision was evident, with values ranging from 4% to 34% (based on estimation involving two or more concentration measurements). Evaluation of whether MPA dosing decisions based on Bayesian forecasting (by the free website service https://pharmaco.chu-limoges.fr) achieved target drug exposure has only been undertaken once. When MMF dosage recommendations were applied by clinicians, a higher proportion (72-80%) of subsequent estimated MPA AUC values were within the 30-60 mg · h/L target range, compared with when dosage recommendations were not followed (only 39-57% within target range). Such findings provide evidence that Bayesian dosage prediction is clinically useful for achieving target MPA AUC. This study, however, was retrospective and focussed only on adult renal transplant recipients. Furthermore, in this study, Bayesian-generated AUC estimations and dosage predictions were not compared with a later full measured AUC but rather with a further AUC estimate based on a second Bayesian analysis. This study also provided some evidence that a useful monitoring schedule for MPA AUC following adult renal transplant would be every 2 weeks during the first month post-transplant, every 1-3 months between months 1 and 12, and each year thereafter. It will be interesting to see further validations in different patient groups using the free website service. In summary, the predictive performance of Bayesian estimation of MPA, comparing estimated with measured AUC values, has been reported in several studies. However, the next step of predicting dosages based on these Bayesian-estimated AUCs, and prospectively determining how closely these predicted dosages give drug exposure matching targeted AUCs, remains largely unaddressed. Further prospective studies are required, particularly in non-renal transplant patients and with the EC-MPS formulation. Other important questions remain to be answered, such as: do Bayesian forecasting methods devised to date use the best population pharmacokinetic models or most accurate algorithms; are the methods simple to use for routine clinical practice; do the algorithms actually improve dosage estimations beyond empirical recommendations in all groups that receive MPA therapy; and, importantly, do the dosage predictions, when followed, improve patient health outcomes?
Kling, Daniel; Egeland, Thore; Mostad, Petter
2012-01-01
In a number of applications there is a need to determine the most likely pedigree for a group of persons based on genetic markers. Adequate models are needed to reach this goal. The markers used to perform the statistical calculations can be linked and there may also be linkage disequilibrium (LD) in the population. The purpose of this paper is to present a graphical Bayesian Network framework to deal with such data. Potential LD is normally ignored and it is important to verify that the resulting calculations are not biased. Even if linkage does not influence results for regular paternity cases, it may have substantial impact on likelihood ratios involving other, more extended pedigrees. Models for LD influence likelihoods for all pedigrees to some degree and an initial estimate of the impact of ignoring LD and/or linkage is desirable, going beyond mere rules of thumb based on marker distance. Furthermore, we show how one can readily include a mutation model in the Bayesian Network; extending other programs or formulas to include such models may require considerable amounts of work and will in many case not be practical. As an example, we consider the two STR markers vWa and D12S391. We estimate probabilities for population haplotypes to account for LD using a method based on data from trios, while an estimate for the degree of linkage is taken from the literature. The results show that accounting for haplotype frequencies is unnecessary in most cases for this specific pair of markers. When doing calculations on regular paternity cases, the markers can be considered statistically independent. In more complex cases of disputed relatedness, for instance cases involving siblings or so-called deficient cases, or when small differences in the LR matter, independence should not be assumed. (The networks are freely available at http://arken.umb.no/~dakl/BayesianNetworks.) PMID:22984448
Conroy, M.J.; Runge, J.P.; Barker, R.J.; Schofield, M.R.; Fonnesbeck, C.J.
2008-01-01
Many organisms are patchily distributed, with some patches occupied at high density, others at lower densities, and others not occupied. Estimation of overall abundance can be difficult and is inefficient via intensive approaches such as capture-mark-recapture (CMR) or distance sampling. We propose a two-phase sampling scheme and model in a Bayesian framework to estimate abundance for patchily distributed populations. In the first phase, occupancy is estimated by binomial detection samples taken on all selected sites, where selection may be of all sites available, or a random sample of sites. Detection can be by visual surveys, detection of sign, physical captures, or other approach. At the second phase, if a detection threshold is achieved, CMR or other intensive sampling is conducted via standard procedures (grids or webs) to estimate abundance. Detection and CMR data are then used in a joint likelihood to model probability of detection in the occupancy sample via an abundance-detection model. CMR modeling is used to estimate abundance for the abundance-detection relationship, which in turn is used to predict abundance at the remaining sites, where only detection data are collected. We present a full Bayesian modeling treatment of this problem, in which posterior inference on abundance and other parameters (detection, capture probability) is obtained under a variety of assumptions about spatial and individual sources of heterogeneity. We apply the approach to abundance estimation for two species of voles (Microtus spp.) in Montana, USA. We also use a simulation study to evaluate the frequentist properties of our procedure given known patterns in abundance and detection among sites as well as design criteria. For most population characteristics and designs considered, bias and mean-square error (MSE) were low, and coverage of true parameter values by Bayesian credibility intervals was near nominal. Our two-phase, adaptive approach allows efficient estimation of abundance of rare and patchily distributed species and is particularly appropriate when sampling in all patches is impossible, but a global estimate of abundance is required.
A new Bayesian Earthquake Analysis Tool (BEAT)
NASA Astrophysics Data System (ADS)
Vasyura-Bathke, Hannes; Dutta, Rishabh; Jónsson, Sigurjón; Mai, Martin
2017-04-01
Modern earthquake source estimation studies increasingly use non-linear optimization strategies to estimate kinematic rupture parameters, often considering geodetic and seismic data jointly. However, the optimization process is complex and consists of several steps that need to be followed in the earthquake parameter estimation procedure. These include pre-describing or modeling the fault geometry, calculating the Green's Functions (often assuming a layered elastic half-space), and estimating the distributed final slip and possibly other kinematic source parameters. Recently, Bayesian inference has become popular for estimating posterior distributions of earthquake source model parameters given measured/estimated/assumed data and model uncertainties. For instance, some research groups consider uncertainties of the layered medium and propagate these to the source parameter uncertainties. Other groups make use of informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed that efficiently explore the often high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational demands of these methods are high and estimation codes are rarely distributed along with the published results. Even if codes are made available, it is often difficult to assemble them into a single optimization framework as they are typically coded in different programing languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in earthquake source estimations, we undertook the effort of producing BEAT, a python package that comprises all the above-mentioned features in one single programing environment. The package is build on top of the pyrocko seismological toolbox (www.pyrocko.org) and makes use of the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat) and we encourage and solicit contributions to the project. In this contribution, we present our strategy for developing BEAT, show application examples, and discuss future developments.
NASA Astrophysics Data System (ADS)
Babcock, C. R.; Finley, A. O.; Andersen, H. E.; Moskal, L. M.; Morton, D. C.; Cook, B.; Nelson, R.
2017-12-01
Upcoming satellite lidar missions, such as GEDI and IceSat-2, are designed to collect laser altimetry data from space for narrow bands along orbital tracts. As a result lidar metric sets derived from these sources will not be of complete spatial coverage. This lack of complete coverage, or sparsity, means traditional regression approaches that consider lidar metrics as explanatory variables (without error) cannot be used to generate wall-to-wall maps of forest inventory variables. We implement a coregionalization framework to jointly model sparsely sampled lidar information and point-referenced forest variable measurements to create wall-to-wall maps with full probabilistic uncertainty quantification of all inputs. We inform the model with USFS Forest Inventory and Analysis (FIA) in-situ forest measurements and GLAS lidar data to spatially predict aboveground forest biomass (AGB) across the contiguous US. We cast our model within a Bayesian hierarchical framework to better model complex space-varying correlation structures among the lidar metrics and FIA data, which yields improved prediction and uncertainty assessment. To circumvent computational difficulties that arise when fitting complex geostatistical models to massive datasets, we use a Nearest Neighbor Gaussian process (NNGP) prior. Results indicate that a coregionalization modeling approach to leveraging sampled lidar data to improve AGB estimation is effective. Further, fitting the coregionalization model within a Bayesian mode of inference allows for AGB quantification across scales ranging from individual pixel estimates of AGB density to total AGB for the continental US with uncertainty. The coregionalization framework examined here is directly applicable to future spaceborne lidar acquisitions from GEDI and IceSat-2. Pairing these lidar sources with the extensive FIA forest monitoring plot network using a joint prediction framework, such as the coregionalization model explored here, offers the potential to improve forest AGB accounting certainty and provide maps for post-model fitting analysis of the spatial distribution of AGB.
An introduction to Bayesian statistics in health psychology.
Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske
2017-09-01
The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixture of uniform distributions.
Structural and parameteric uncertainty quantification in cloud microphysics parameterization schemes
NASA Astrophysics Data System (ADS)
van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.; Martinkus, C.
2017-12-01
Atmospheric model parameterization schemes employ approximations to represent the effects of unresolved processes. These approximations are a source of error in forecasts, caused in part by considerable uncertainty about the optimal value of parameters within each scheme -- parameteric uncertainty. Furthermore, there is uncertainty regarding the best choice of the overarching structure of the parameterization scheme -- structrual uncertainty. Parameter estimation can constrain the first, but may struggle with the second because structural choices are typically discrete. We address this problem in the context of cloud microphysics parameterization schemes by creating a flexible framework wherein structural and parametric uncertainties can be simultaneously constrained. Our scheme makes no assuptions about drop size distribution shape or the functional form of parametrized process rate terms. Instead, these uncertainties are constrained by observations using a Markov Chain Monte Carlo sampler within a Bayesian inference framework. Our scheme, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), has flexibility to predict various sets of prognostic drop size distribution moments as well as varying complexity of process rate formulations. We compare idealized probabilistic forecasts from versions of BOSS with varying levels of structural complexity. This work has applications in ensemble forecasts with model physics uncertainty, data assimilation, and cloud microphysics process studies.
NASA Astrophysics Data System (ADS)
Hermans, Thomas; Nguyen, Frédéric; Klepikova, Maria; Dassargues, Alain; Caers, Jef
2018-04-01
In theory, aquifer thermal energy storage (ATES) systems can recover in winter the heat stored in the aquifer during summer to increase the energy efficiency of the system. In practice, the energy efficiency is often lower than expected from simulations due to spatial heterogeneity of hydraulic properties or non-favorable hydrogeological conditions. A proper design of ATES systems should therefore consider the uncertainty of the prediction related to those parameters. We use a novel framework called Bayesian Evidential Learning (BEL) to estimate the heat storage capacity of an alluvial aquifer using a heat tracing experiment. BEL is based on two main stages: pre- and postfield data acquisition. Before data acquisition, Monte Carlo simulations and global sensitivity analysis are used to assess the information content of the data to reduce the uncertainty of the prediction. After data acquisition, prior falsification and machine learning based on the same Monte Carlo are used to directly assess uncertainty on key prediction variables from observations. The result is a full quantification of the posterior distribution of the prediction conditioned to observed data, without any explicit full model inversion. We demonstrate the methodology in field conditions and validate the framework using independent measurements.
NASA Astrophysics Data System (ADS)
Meillier, Céline; Chatelain, Florent; Michel, Olivier; Bacon, Roland; Piqueras, Laure; Bacher, Raphael; Ayasso, Hacheme
2016-04-01
We present SELFI, the Source Emission Line FInder, a new Bayesian method optimized for detection of faint galaxies in Multi Unit Spectroscopic Explorer (MUSE) deep fields. MUSE is the new panoramic integral field spectrograph at the Very Large Telescope (VLT) that has unique capabilities for spectroscopic investigation of the deep sky. It has provided data cubes with 324 million voxels over a single 1 arcmin2 field of view. To address the challenge of faint-galaxy detection in these large data cubes, we developed a new method that processes 3D data either for modeling or for estimation and extraction of source configurations. This object-based approach yields a natural sparse representation of the sources in massive data fields, such as MUSE data cubes. In the Bayesian framework, the parameters that describe the observed sources are considered random variables. The Bayesian model leads to a general and robust algorithm where the parameters are estimated in a fully data-driven way. This detection algorithm was applied to the MUSE observation of Hubble Deep Field-South. With 27 h total integration time, these observations provide a catalog of 189 sources of various categories and with secured redshift. The algorithm retrieved 91% of the galaxies with only 9% false detection. This method also allowed the discovery of three new Lyα emitters and one [OII] emitter, all without any Hubble Space Telescope counterpart. We analyzed the reasons for failure for some targets, and found that the most important limitation of the method is when faint sources are located in the vicinity of bright spatially resolved galaxies that cannot be approximated by the Sérsic elliptical profile. The software and its documentation are available on the MUSE science web service (muse-vlt.eu/science).
Varughese, Eunice A; Brinkman, Nichole E; Anneken, Emily M; Cashdollar, Jennifer L; Fout, G Shay; Furlong, Edward T; Kolpin, Dana W; Glassmeyer, Susan T; Keely, Scott P
2018-04-01
Drinking water treatment plants rely on purification of contaminated source waters to provide communities with potable water. One group of possible contaminants are enteric viruses. Measurement of viral quantities in environmental water systems are often performed using polymerase chain reaction (PCR) or quantitative PCR (qPCR). However, true values may be underestimated due to challenges involved in a multi-step viral concentration process and due to PCR inhibition. In this study, water samples were concentrated from 25 drinking water treatment plants (DWTPs) across the US to study the occurrence of enteric viruses in source water and removal after treatment. The five different types of viruses studied were adenovirus, norovirus GI, norovirus GII, enterovirus, and polyomavirus. Quantitative PCR was performed on all samples to determine presence or absence of these viruses in each sample. Ten DWTPs showed presence of one or more viruses in source water, with four DWTPs having treated drinking water testing positive. Furthermore, PCR inhibition was assessed for each sample using an exogenous amplification control, which indicated that all of the DWTP samples, including source and treated water samples, had some level of inhibition, confirming that inhibition plays an important role in PCR-based assessments of environmental samples. PCR inhibition measurements, viral recovery, and other assessments were incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters. Published by Elsevier B.V.
Understanding Past Population Dynamics: Bayesian Coalescent-Based Modeling with Covariates
Gill, Mandev S.; Lemey, Philippe; Bennett, Shannon N.; Biek, Roman; Suchard, Marc A.
2016-01-01
Effective population size characterizes the genetic variability in a population and is a parameter of paramount importance in population genetics and evolutionary biology. Kingman’s coalescent process enables inference of past population dynamics directly from molecular sequence data, and researchers have developed a number of flexible coalescent-based models for Bayesian nonparametric estimation of the effective population size as a function of time. Major goals of demographic reconstruction include identifying driving factors of effective population size, and understanding the association between the effective population size and such factors. Building upon Bayesian nonparametric coalescent-based approaches, we introduce a flexible framework that incorporates time-varying covariates that exploit Gaussian Markov random fields to achieve temporal smoothing of effective population size trajectories. To approximate the posterior distribution, we adapt efficient Markov chain Monte Carlo algorithms designed for highly structured Gaussian models. Incorporating covariates into the demographic inference framework enables the modeling of associations between the effective population size and covariates while accounting for uncertainty in population histories. Furthermore, it can lead to more precise estimates of population dynamics. We apply our model to four examples. We reconstruct the demographic history of raccoon rabies in North America and find a significant association with the spatiotemporal spread of the outbreak. Next, we examine the effective population size trajectory of the DENV-4 virus in Puerto Rico along with viral isolate count data and find similar cyclic patterns. We compare the population history of the HIV-1 CRF02_AG clade in Cameroon with HIV incidence and prevalence data and find that the effective population size is more reflective of incidence rate. Finally, we explore the hypothesis that the population dynamics of musk ox during the Late Quaternary period were related to climate change. [Coalescent; effective population size; Gaussian Markov random fields; phylodynamics; phylogenetics; population genetics. PMID:27368344
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
ERIC Educational Resources Information Center
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
A Tutorial Introduction to Bayesian Models of Cognitive Development
ERIC Educational Resources Information Center
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2011-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the "what", the "how", and the "why" of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for…
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Spatio-temporal interpolation of precipitation during monsoon periods in Pakistan
NASA Astrophysics Data System (ADS)
Hussain, Ijaz; Spöck, Gunter; Pilz, Jürgen; Yu, Hwa-Lung
2010-08-01
Spatio-temporal estimation of precipitation over a region is essential to the modeling of hydrologic processes for water resources management. The changes of magnitude and space-time heterogeneity of rainfall observations make space-time estimation of precipitation a challenging task. In this paper we propose a Box-Cox transformed hierarchical Bayesian multivariate spatio-temporal interpolation method for the skewed response variable. The proposed method is applied to estimate space-time monthly precipitation in the monsoon periods during 1974-2000, and 27-year monthly average precipitation data are obtained from 51 stations in Pakistan. The results of transformed hierarchical Bayesian multivariate spatio-temporal interpolation are compared to those of non-transformed hierarchical Bayesian interpolation by using cross-validation. The software developed by [11] is used for Bayesian non-stationary multivariate space-time interpolation. It is observed that the transformed hierarchical Bayesian method provides more accuracy than the non-transformed hierarchical Bayesian method.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided.
Computational modelling of cellular level metabolism
NASA Astrophysics Data System (ADS)
Calvetti, D.; Heino, J.; Somersalo, E.
2008-07-01
The steady and stationary state inverse problems consist of estimating the reaction and transport fluxes, blood concentrations and possibly the rates of change of some of the concentrations based on data which are often scarce noisy and sampled over a population. The Bayesian framework provides a natural setting for the solution of this inverse problem, because a priori knowledge about the system itself and the unknown reaction fluxes and transport rates can compensate for the insufficiency of measured data, provided that the computational costs do not become prohibitive. This article identifies the computational challenges which have to be met when analyzing the steady and stationary states of multicompartment model for cellular metabolism and suggest stable and efficient ways to handle the computations. The outline of a computational tool based on the Bayesian paradigm for the simulation and analysis of complex cellular metabolic systems is also presented.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Bayesian Group Bridge for Bi-level Variable Selection.
Mallick, Himel; Yi, Nengjun
2017-06-01
A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.
A Model-Based Probabilistic Inversion Framework for Wire Fault Detection Using TDR
NASA Technical Reports Server (NTRS)
Schuet, Stefan R.; Timucin, Dogan A.; Wheeler, Kevin R.
2010-01-01
Time-domain reflectometry (TDR) is one of the standard methods for diagnosing faults in electrical wiring and interconnect systems, with a long-standing history focused mainly on hardware development of both high-fidelity systems for laboratory use and portable hand-held devices for field deployment. While these devices can easily assess distance to hard faults such as sustained opens or shorts, their ability to assess subtle but important degradation such as chafing remains an open question. This paper presents a unified framework for TDR-based chafing fault detection in lossy coaxial cables by combining an S-parameter based forward modeling approach with a probabilistic (Bayesian) inference algorithm. Results are presented for the estimation of nominal and faulty cable parameters from laboratory data.
NASA Astrophysics Data System (ADS)
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Merging information from multi-model flood projections in a hierarchical Bayesian framework
NASA Astrophysics Data System (ADS)
Le Vine, Nataliya
2016-04-01
Multi-model ensembles are becoming widely accepted for flood frequency change analysis. The use of multiple models results in large uncertainty around estimates of flood magnitudes, due to both uncertainty in model selection and natural variability of river flow. The challenge is therefore to extract the most meaningful signal from the multi-model predictions, accounting for both model quality and uncertainties in individual model estimates. The study demonstrates the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach facilitates explicit treatment of shared multi-model discrepancy as well as the probabilistic nature of the flood estimates, by treating the available models as a sample from a hypothetical complete (but unobserved) set of models. The advantages of the approach are: 1) to insure an adequate 'baseline' conditions with which to compare future changes; 2) to reduce flood estimate uncertainty; 3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; 4) to adjust multi-model consistency criteria when model biases are large; and 5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
Astrocytic tracer dynamics estimated from [1-¹¹C]-acetate PET measurements.
Arnold, Andrea; Calvetti, Daniela; Gjedde, Albert; Iversen, Peter; Somersalo, Erkki
2015-12-01
We address the problem of estimating the unknown parameters of a model of tracer kinetics from sequences of positron emission tomography (PET) scan data using a statistical sequential algorithm for the inference of magnitudes of dynamic parameters. The method, based on Bayesian statistical inference, is a modification of a recently proposed particle filtering and sequential Monte Carlo algorithm, where instead of preassigning the accuracy in the propagation of each particle, we fix the time step and account for the numerical errors in the innovation term. We apply the algorithm to PET images of [1-¹¹C]-acetate-derived tracer accumulation, estimating the transport rates in a three-compartment model of astrocytic uptake and metabolism of the tracer for a cohort of 18 volunteers from 3 groups, corresponding to healthy control individuals, cirrhotic liver and hepatic encephalopathy patients. The distribution of the parameters for the individuals and for the groups presented within the Bayesian framework support the hypothesis that the parameters for the hepatic encephalopathy group follow a significantly different distribution than the other two groups. The biological implications of the findings are also discussed. © The Authors 2014. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
Hadwin, Paul J; Peterson, Sean D
2017-04-01
The Bayesian framework for parameter inference provides a basis from which subject-specific reduced-order vocal fold models can be generated. Previously, it has been shown that a particle filter technique is capable of producing estimates and associated credibility intervals of time-varying reduced-order vocal fold model parameters. However, the particle filter approach is difficult to implement and has a high computational cost, which can be barriers to clinical adoption. This work presents an alternative estimation strategy based upon Kalman filtering aimed at reducing the computational cost of subject-specific model development. The robustness of this approach to Gaussian and non-Gaussian noise is discussed. The extended Kalman filter (EKF) approach is found to perform very well in comparison with the particle filter technique at dramatically lower computational cost. Based upon the test cases explored, the EKF is comparable in terms of accuracy to the particle filter technique when greater than 6000 particles are employed; if less particles are employed, the EKF actually performs better. For comparable levels of accuracy, the solution time is reduced by 2 orders of magnitude when employing the EKF. By virtue of the approximations used in the EKF, however, the credibility intervals tend to be slightly underpredicted.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moerk, Anna-Karin, E-mail: anna-karin.mork@ki.s; Jonsson, Fredrik; Pharsight, a Certara company, St. Louis, MO
2009-11-01
The aim of this study was to derive improved estimates of population variability and uncertainty of physiologically based pharmacokinetic (PBPK) model parameters, especially of those related to the washin-washout behavior of polar volatile substances. This was done by optimizing a previously published washin-washout PBPK model for acetone in a Bayesian framework using Markov chain Monte Carlo simulation. The sensitivity of the model parameters was investigated by creating four different prior sets, where the uncertainty surrounding the population variability of the physiological model parameters was given values corresponding to coefficients of variation of 1%, 25%, 50%, and 100%, respectively. The PBPKmore » model was calibrated to toxicokinetic data from 2 previous studies where 18 volunteers were exposed to 250-550 ppm of acetone at various levels of workload. The updated PBPK model provided a good description of the concentrations in arterial, venous, and exhaled air. The precision of most of the model parameter estimates was improved. New information was particularly gained on the population distribution of the parameters governing the washin-washout effect. The results presented herein provide a good starting point to estimate the target dose of acetone in the working and general populations for risk assessment purposes.« less
Chen, Shi; Sanderson, Michael W; Lee, Chihoon; Cernicchiaro, Natalia; Renter, David G; Lanzas, Cristina
2016-09-15
Understanding the transmission dynamics of pathogens is essential to determine the epidemiology, ecology, and ways of controlling enterohemorrhagic Escherichia coli (EHEC) in animals and their environments. Our objective was to estimate the epidemiological fitness of common EHEC strains in cattle populations. For that purpose, we developed a Markov chain model to characterize the dynamics of 7 serogroups of enterohemorrhagic Escherichia coli (O26, O45, O103, O111, O121, O145, and O157) in cattle production environments based on a set of cross-sectional data on infection prevalence in 2 years in two U.S. states. The basic reproduction number (R0) was estimated using a Bayesian framework for each serogroup based on two criteria (using serogroup alone [the O-group data] and using O serogroup, Shiga toxin gene[s], and intimin [eae] gene together [the EHEC data]). In addition, correlations between external covariates (e.g., location, ambient temperature, dietary, and probiotic usage) and prevalence/R0 were quantified. R0 estimates varied substantially among different EHEC serogroups, with EHEC O157 having an R0 of >1 (∼1.5) and all six other EHEC serogroups having an R0 of less than 1. Using the O-group data substantially increased R0 estimates for the O26, O45, and O103 serogroups (R0 > 1) but not for the others. Different covariates had distinct influences on different serogroups: the coefficients for each covariate were different among serogroups. Our modeling and analysis of this system can be readily expanded to other pathogen systems in order to estimate the pathogen and external factors that influence spread of infectious agents. In this paper we describe a Bayesian modeling framework to estimate basic reproduction numbers of multiple serotypes of Shiga toxin-producing Escherichia coli according to a cross-sectional study. We then coupled a compartmental model to reconstruct the infection dynamics of these serotypes and quantify their risk in the population. We incorporated different sensitivity levels of detecting different serotypes and evaluated their potential influence on the estimation of basic reproduction numbers. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Sanderson, Michael W.; Lee, Chihoon; Cernicchiaro, Natalia; Renter, David G.; Lanzas, Cristina
2016-01-01
ABSTRACT Understanding the transmission dynamics of pathogens is essential to determine the epidemiology, ecology, and ways of controlling enterohemorrhagic Escherichia coli (EHEC) in animals and their environments. Our objective was to estimate the epidemiological fitness of common EHEC strains in cattle populations. For that purpose, we developed a Markov chain model to characterize the dynamics of 7 serogroups of enterohemorrhagic Escherichia coli (O26, O45, O103, O111, O121, O145, and O157) in cattle production environments based on a set of cross-sectional data on infection prevalence in 2 years in two U.S. states. The basic reproduction number (R0) was estimated using a Bayesian framework for each serogroup based on two criteria (using serogroup alone [the O-group data] and using O serogroup, Shiga toxin gene[s], and intimin [eae] gene together [the EHEC data]). In addition, correlations between external covariates (e.g., location, ambient temperature, dietary, and probiotic usage) and prevalence/R0 were quantified. R0 estimates varied substantially among different EHEC serogroups, with EHEC O157 having an R0 of >1 (∼1.5) and all six other EHEC serogroups having an R0 of less than 1. Using the O-group data substantially increased R0 estimates for the O26, O45, and O103 serogroups (R0 > 1) but not for the others. Different covariates had distinct influences on different serogroups: the coefficients for each covariate were different among serogroups. Our modeling and analysis of this system can be readily expanded to other pathogen systems in order to estimate the pathogen and external factors that influence spread of infectious agents. IMPORTANCE In this paper we describe a Bayesian modeling framework to estimate basic reproduction numbers of multiple serotypes of Shiga toxin-producing Escherichia coli according to a cross-sectional study. We then coupled a compartmental model to reconstruct the infection dynamics of these serotypes and quantify their risk in the population. We incorporated different sensitivity levels of detecting different serotypes and evaluated their potential influence on the estimation of basic reproduction numbers. PMID:27401976
Image denoising in mixed Poisson-Gaussian noise.
Luisier, Florian; Blu, Thierry; Unser, Michael
2011-03-01
We propose a general methodology (PURE-LET) to design and optimize a wide class of transform-domain thresholding algorithms for denoising images corrupted by mixed Poisson-Gaussian noise. We express the denoising process as a linear expansion of thresholds (LET) that we optimize by relying on a purely data-adaptive unbiased estimate of the mean-squared error (MSE), derived in a non-Bayesian framework (PURE: Poisson-Gaussian unbiased risk estimate). We provide a practical approximation of this theoretical MSE estimate for the tractable optimization of arbitrary transform-domain thresholding. We then propose a pointwise estimator for undecimated filterbank transforms, which consists of subband-adaptive thresholding functions with signal-dependent thresholds that are globally optimized in the image domain. We finally demonstrate the potential of the proposed approach through extensive comparisons with state-of-the-art techniques that are specifically tailored to the estimation of Poisson intensities. We also present denoising results obtained on real images of low-count fluorescence microscopy.
Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models
NASA Astrophysics Data System (ADS)
Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael
2016-06-01
We address the sparse approximation problem in the case where the data are approximated by the linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov Chain Monte-Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build an hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase of the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
Multinomial mixture model with heterogeneous classification probabilities
Holland, M.D.; Gray, B.R.
2011-01-01
Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. ?? 2010 Springer Science+Business Media, LLC.
Estimation of post-test probabilities by residents: Bayesian reasoning versus heuristics?
Hall, Stacey; Phang, Sen Han; Schaefer, Jeffrey P; Ghali, William; Wright, Bruce; McLaughlin, Kevin
2014-08-01
Although the process of diagnosing invariably begins with a heuristic, we encourage our learners to support their diagnoses by analytical cognitive processes, such as Bayesian reasoning, in an attempt to mitigate the effects of heuristics on diagnosing. There are, however, limited data on the use ± impact of Bayesian reasoning on the accuracy of disease probability estimates. In this study our objective was to explore whether Internal Medicine residents use a Bayesian process to estimate disease probabilities by comparing their disease probability estimates to literature-derived Bayesian post-test probabilities. We gave 35 Internal Medicine residents four clinical vignettes in the form of a referral letter and asked them to estimate the post-test probability of the target condition in each case. We then compared these to literature-derived probabilities. For each vignette the estimated probability was significantly different from the literature-derived probability. For the two cases with low literature-derived probability our participants significantly overestimated the probability of these target conditions being the correct diagnosis, whereas for the two cases with high literature-derived probability the estimated probability was significantly lower than the calculated value. Our results suggest that residents generate inaccurate post-test probability estimates. Possible explanations for this include ineffective application of Bayesian reasoning, attribute substitution whereby a complex cognitive task is replaced by an easier one (e.g., a heuristic), or systematic rater bias, such as central tendency bias. Further studies are needed to identify the reasons for inaccuracy of disease probability estimates and to explore ways of improving accuracy.
Bayesian randomized clinical trials: From fixed to adaptive design.
Yin, Guosheng; Lam, Chi Kin; Shi, Haolun
2017-08-01
Randomized controlled studies are the gold standard for phase III clinical trials. Using α-spending functions to control the overall type I error rate, group sequential methods are well established and have been dominating phase III studies. Bayesian randomized design, on the other hand, can be viewed as a complement instead of competitive approach to the frequentist methods. For the fixed Bayesian design, the hypothesis testing can be cast in the posterior probability or Bayes factor framework, which has a direct link to the frequentist type I error rate. Bayesian group sequential design relies upon Bayesian decision-theoretic approaches based on backward induction, which is often computationally intensive. Compared with the frequentist approaches, Bayesian methods have several advantages. The posterior predictive probability serves as a useful and convenient tool for trial monitoring, and can be updated at any time as the data accrue during the trial. The Bayesian decision-theoretic framework possesses a direct link to the decision making in the practical setting, and can be modeled more realistically to reflect the actual cost-benefit analysis during the drug development process. Other merits include the possibility of hierarchical modeling and the use of informative priors, which would lead to a more comprehensive utilization of information from both historical and longitudinal data. From fixed to adaptive design, we focus on Bayesian randomized controlled clinical trials and make extensive comparisons with frequentist counterparts through numerical studies. Copyright © 2017 Elsevier Inc. All rights reserved.
Bayesian network learning for natural hazard assessments
NASA Astrophysics Data System (ADS)
Vogel, Kristin
2016-04-01
Even though quite different in occurrence and consequences, from a modelling perspective many natural hazards share similar properties and challenges. Their complex nature as well as lacking knowledge about their driving forces and potential effects make their analysis demanding. On top of the uncertainty about the modelling framework, inaccurate or incomplete event observations and the intrinsic randomness of the natural phenomenon add up to different interacting layers of uncertainty, which require a careful handling. Thus, for reliable natural hazard assessments it is crucial not only to capture and quantify involved uncertainties, but also to express and communicate uncertainties in an intuitive way. Decision-makers, who often find it difficult to deal with uncertainties, might otherwise return to familiar (mostly deterministic) proceedings. In the scope of the DFG research training group „NatRiskChange" we apply the probabilistic framework of Bayesian networks for diverse natural hazard and vulnerability studies. The great potential of Bayesian networks was already shown in previous natural hazard assessments. Treating each model component as random variable, Bayesian networks aim at capturing the joint distribution of all considered variables. Hence, each conditional distribution of interest (e.g. the effect of precautionary measures on damage reduction) can be inferred. The (in-)dependencies between the considered variables can be learned purely data driven or be given by experts. Even a combination of both is possible. By translating the (in-)dependences into a graph structure, Bayesian networks provide direct insights into the workings of the system and allow to learn about the underlying processes. Besides numerous studies on the topic, learning Bayesian networks from real-world data remains challenging. In previous studies, e.g. on earthquake induced ground motion and flood damage assessments, we tackled the problems arising with continuous variables and incomplete observations. Further studies rise the challenge of relying on very small data sets. Since parameter estimates for complex models based on few observations are unreliable, it is necessary to focus on simplified, yet still meaningful models. A so called Markov Blanket approach is developed to identify the most relevant model components and to construct a simple Bayesian network based on those findings. Since the proceeding is completely data driven, it can easily be transferred to various applications in natural hazard domains. This study is funded by the Deutsche Forschungsgemeinschaft (DFG) within the research training programme GRK 2043/1 "NatRiskChange - Natural hazards and risks in a changing world" at Potsdam University.
Bayesian analysis of the astrobiological implications of life’s early emergence on Earth
Spiegel, David S.; Turner, Edwin L.
2012-01-01
Life arose on Earth sometime in the first few hundred million years after the young planet had cooled to the point that it could support water-based organisms on its surface. The early emergence of life on Earth has been taken as evidence that the probability of abiogenesis is high, if starting from young Earth-like conditions. We revisit this argument quantitatively in a Bayesian statistical framework. By constructing a simple model of the probability of abiogenesis, we calculate a Bayesian estimate of its posterior probability, given the data that life emerged fairly early in Earth’s history and that, billions of years later, curious creatures noted this fact and considered its implications. We find that, given only this very limited empirical information, the choice of Bayesian prior for the abiogenesis probability parameter has a dominant influence on the computed posterior probability. Although terrestrial life's early emergence provides evidence that life might be abundant in the universe if early-Earth-like conditions are common, the evidence is inconclusive and indeed is consistent with an arbitrarily low intrinsic probability of abiogenesis for plausible uninformative priors. Finding a single case of life arising independently of our lineage (on Earth, elsewhere in the solar system, or on an extrasolar planet) would provide much stronger evidence that abiogenesis is not extremely rare in the universe. PMID:22198766
Bayesian analysis of the astrobiological implications of life's early emergence on Earth.
Spiegel, David S; Turner, Edwin L
2012-01-10
Life arose on Earth sometime in the first few hundred million years after the young planet had cooled to the point that it could support water-based organisms on its surface. The early emergence of life on Earth has been taken as evidence that the probability of abiogenesis is high, if starting from young Earth-like conditions. We revisit this argument quantitatively in a bayesian statistical framework. By constructing a simple model of the probability of abiogenesis, we calculate a bayesian estimate of its posterior probability, given the data that life emerged fairly early in Earth's history and that, billions of years later, curious creatures noted this fact and considered its implications. We find that, given only this very limited empirical information, the choice of bayesian prior for the abiogenesis probability parameter has a dominant influence on the computed posterior probability. Although terrestrial life's early emergence provides evidence that life might be abundant in the universe if early-Earth-like conditions are common, the evidence is inconclusive and indeed is consistent with an arbitrarily low intrinsic probability of abiogenesis for plausible uninformative priors. Finding a single case of life arising independently of our lineage (on Earth, elsewhere in the solar system, or on an extrasolar planet) would provide much stronger evidence that abiogenesis is not extremely rare in the universe.
Bayesian power spectrum inference with foreground and target contamination treatment
NASA Astrophysics Data System (ADS)
Jasche, J.; Lavaux, G.
2017-10-01
This work presents a joint and self-consistent Bayesian treatment of various foreground and target contaminations when inferring cosmological power spectra and three-dimensional density fields from galaxy redshift surveys. This is achieved by introducing additional block-sampling procedures for unknown coefficients of foreground and target contamination templates to the previously presented ARES framework for Bayesian large-scale structure analyses. As a result, the method infers jointly and fully self-consistently three-dimensional density fields, cosmological power spectra, luminosity-dependent galaxy biases, noise levels of the respective galaxy distributions, and coefficients for a set of a priori specified foreground templates. In addition, this fully Bayesian approach permits detailed quantification of correlated uncertainties amongst all inferred quantities and correctly marginalizes over observational systematic effects. We demonstrate the validity and efficiency of our approach in obtaining unbiased estimates of power spectra via applications to realistic mock galaxy observations that are subject to stellar contamination and dust extinction. While simultaneously accounting for galaxy biases and unknown noise levels, our method reliably and robustly infers three-dimensional density fields and corresponding cosmological power spectra from deep galaxy surveys. Furthermore, our approach correctly accounts for joint and correlated uncertainties between unknown coefficients of foreground templates and the amplitudes of the power spectrum. This effect amounts to correlations and anti-correlations of up to 10 per cent across wide ranges in Fourier space.
Optimal joint detection and estimation that maximizes ROC-type curves
Wunderlich, Adam; Goossens, Bart; Abbey, Craig K.
2017-01-01
Combined detection-estimation tasks are frequently encountered in medical imaging. Optimal methods for joint detection and estimation are of interest because they provide upper bounds on observer performance, and can potentially be utilized for imaging system optimization, evaluation of observer efficiency, and development of image formation algorithms. We present a unified Bayesian framework for decision rules that maximize receiver operating characteristic (ROC)-type summary curves, including ROC, localization ROC (LROC), estimation ROC (EROC), free-response ROC (FROC), alternative free-response ROC (AFROC), and exponentially-transformed FROC (EFROC) curves, succinctly summarizing previous results. The approach relies on an interpretation of ROC-type summary curves as plots of an expected utility versus an expected disutility (or penalty) for signal-present decisions. We propose a general utility structure that is flexible enough to encompass many ROC variants and yet sufficiently constrained to allow derivation of a linear expected utility equation that is similar to that for simple binary detection. We illustrate our theory with an example comparing decision strategies for joint detection-estimation of a known signal with unknown amplitude. In addition, building on insights from our utility framework, we propose new ROC-type summary curves and associated optimal decision rules for joint detection-estimation tasks with an unknown, potentially-multiple, number of signals in each observation. PMID:27093544
Optimal Joint Detection and Estimation That Maximizes ROC-Type Curves.
Wunderlich, Adam; Goossens, Bart; Abbey, Craig K
2016-09-01
Combined detection-estimation tasks are frequently encountered in medical imaging. Optimal methods for joint detection and estimation are of interest because they provide upper bounds on observer performance, and can potentially be utilized for imaging system optimization, evaluation of observer efficiency, and development of image formation algorithms. We present a unified Bayesian framework for decision rules that maximize receiver operating characteristic (ROC)-type summary curves, including ROC, localization ROC (LROC), estimation ROC (EROC), free-response ROC (FROC), alternative free-response ROC (AFROC), and exponentially-transformed FROC (EFROC) curves, succinctly summarizing previous results. The approach relies on an interpretation of ROC-type summary curves as plots of an expected utility versus an expected disutility (or penalty) for signal-present decisions. We propose a general utility structure that is flexible enough to encompass many ROC variants and yet sufficiently constrained to allow derivation of a linear expected utility equation that is similar to that for simple binary detection. We illustrate our theory with an example comparing decision strategies for joint detection-estimation of a known signal with unknown amplitude. In addition, building on insights from our utility framework, we propose new ROC-type summary curves and associated optimal decision rules for joint detection-estimation tasks with an unknown, potentially-multiple, number of signals in each observation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results inmore » a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.« less
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.; ...
2017-09-27
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results inmore » a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.« less
Tsai, Tsung-Heng; Tadesse, Mahlet G.; Di Poto, Cristina; Pannell, Lewis K.; Mechref, Yehia; Wang, Yue; Ressom, Habtom W.
2013-01-01
Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various ‘-omic’ studies including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data. Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information including internal standards and clustered chromatograms in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated based on ground-truth data, by measuring correlation of variation, RT difference across runs and peak-matching performance. We demonstrate that Bayesian alignment model improves significantly the RT alignment performance through appropriate integration of relevant information. Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at http://omics.georgetown.edu/alignLCMS.html Contact: hwr@georgetown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24013927
Stepwise and stagewise approaches for spatial cluster detection
Xu, Jiale
2016-01-01
Spatial cluster detection is an important tool in many areas such as sociology, botany and public health. Previous work has mostly taken either hypothesis testing framework or Bayesian framework. In this paper, we propose a few approaches under a frequentist variable selection framework for spatial cluster detection. The forward stepwise methods search for multiple clusters by iteratively adding currently most likely cluster while adjusting for the effects of previously identified clusters. The stagewise methods also consist of a series of steps, but with tiny step size in each iteration. We study the features and performances of our proposed methods using simulations on idealized grids or real geographic area. From the simulations, we compare the performance of the proposed methods in terms of estimation accuracy and power of detections. These methods are applied to the the well-known New York leukemia data as well as Indiana poverty data. PMID:27246273
Stepwise and stagewise approaches for spatial cluster detection.
Xu, Jiale; Gangnon, Ronald E
2016-05-01
Spatial cluster detection is an important tool in many areas such as sociology, botany and public health. Previous work has mostly taken either a hypothesis testing framework or a Bayesian framework. In this paper, we propose a few approaches under a frequentist variable selection framework for spatial cluster detection. The forward stepwise methods search for multiple clusters by iteratively adding currently most likely cluster while adjusting for the effects of previously identified clusters. The stagewise methods also consist of a series of steps, but with a tiny step size in each iteration. We study the features and performances of our proposed methods using simulations on idealized grids or real geographic areas. From the simulations, we compare the performance of the proposed methods in terms of estimation accuracy and power. These methods are applied to the the well-known New York leukemia data as well as Indiana poverty data. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bhadra, Dhiman; Daniels, Michael J.; Kim, Sungduk; Ghosh, Malay; Mukherjee, Bhramar
2014-01-01
In a typical case-control study, exposure information is collected at a single time-point for the cases and controls. However, case-control studies are often embedded in existing cohort studies containing a wealth of longitudinal exposure history on the participants. Recent medical studies have indicated that incorporating past exposure history, or a constructed summary measure of cumulative exposure derived from the past exposure history, when available, may lead to more precise and clinically meaningful estimates of the disease risk. In this paper, we propose a flexible Bayesian semiparametric approach to model the longitudinal exposure profiles of the cases and controls and then use measures of cumulative exposure based on a weighted integral of this trajectory in the final disease risk model. The estimation is done via a joint likelihood. In the construction of the cumulative exposure summary, we introduce an influence function, a smooth function of time to characterize the association pattern of the exposure profile on the disease status with different time windows potentially having differential influence/weights. This enables us to analyze how the present disease status of a subject is influenced by his/her past exposure history conditional on the current ones. The joint likelihood formulation allows us to properly account for uncertainties associated with both stages of the estimation process in an integrated manner. Analysis is carried out in a hierarchical Bayesian framework using Reversible jump Markov chain Monte Carlo (RJMCMC) algorithms. The proposed methodology is motivated by, and applied to a case-control study of prostate cancer where longitudinal biomarker information is available for the cases and controls. PMID:22313248
qPR: An adaptive partial-report procedure based on Bayesian inference.
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-08-01
Iconic memory is best assessed with the partial report procedure in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6-8 cue delays or 600-800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior of the parameters, the method selects the stimulus to maximize the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a certain criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli procedure (RMSE = 0.063). Quick partial-report relieves the data collection burden in characterizing iconic memory and makes it possible to assess iconic memory in clinical populations.
qPR: An adaptive partial-report procedure based on Bayesian inference
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-01-01
Iconic memory is best assessed with the partial report procedure in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6–8 cue delays or 600–800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior of the parameters, the method selects the stimulus to maximize the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a certain criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli procedure (RMSE = 0.063). Quick partial-report relieves the data collection burden in characterizing iconic memory and makes it possible to assess iconic memory in clinical populations. PMID:27580045
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis
ERIC Educational Resources Information Center
Ansari, Asim; Iyengar, Raghuram
2006-01-01
We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Bayesian stable isotope mixing models
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixtur...
NASA Astrophysics Data System (ADS)
Zha, Yuanyuan; Yeh, Tian-Chyi J.; Illman, Walter A.; Onoe, Hironori; Mok, Chin Man W.; Wen, Jet-Chau; Huang, Shao-Yang; Wang, Wenke
2017-04-01
Hydraulic tomography (HT) has become a mature aquifer test technology over the last two decades. It collects nonredundant information of aquifer heterogeneity by sequentially stressing the aquifer at different wells and collecting aquifer responses at other wells during each stress. The collected information is then interpreted by inverse models. Among these models, the geostatistical approaches, built upon the Bayesian framework, first conceptualize hydraulic properties to be estimated as random fields, which are characterized by means and covariance functions. They then use the spatial statistics as prior information with the aquifer response data to estimate the spatial distribution of the hydraulic properties at a site. Since the spatial statistics describe the generic spatial structures of the geologic media at the site rather than site-specific ones (e.g., known spatial distributions of facies, faults, or paleochannels), the estimates are often not optimal. To improve the estimates, we introduce a general statistical framework, which allows the inclusion of site-specific spatial patterns of geologic features. Subsequently, we test this approach with synthetic numerical experiments. Results show that this approach, using conditional mean and covariance that reflect site-specific large-scale geologic features, indeed improves the HT estimates. Afterward, this approach is applied to HT surveys at a kilometer-scale-fractured granite field site with a distinct fault zone. We find that by including fault information from outcrops and boreholes for HT analysis, the estimated hydraulic properties are improved. The improved estimates subsequently lead to better prediction of flow during a different pumping test at the site.
NASA Astrophysics Data System (ADS)
Babcock, Chad; Finley, Andrew O.; Andersen, Hans-Erik; Pattison, Robert; Cook, Bruce D.; Morton, Douglas C.; Alonzo, Michael; Nelson, Ross; Gregoire, Timothy; Ene, Liviu; Gobakken, Terje; Næsset, Erik
2018-06-01
The goal of this research was to develop and examine the performance of a geostatistical coregionalization modeling approach for combining field inventory measurements, strip samples of airborne lidar and Landsat-based remote sensing data products to predict aboveground biomass (AGB) in interior Alaska's Tanana Valley. The proposed modeling strategy facilitates pixel-level mapping of AGB density predictions across the entire spatial domain. Additionally, the coregionalization framework allows for statistically sound estimation of total AGB for arbitrary areal units within the study area---a key advance to support diverse management objectives in interior Alaska. This research focuses on appropriate characterization of prediction uncertainty in the form of posterior predictive coverage intervals and standard deviations. Using the framework detailed here, it is possible to quantify estimation uncertainty for any spatial extent, ranging from pixel-level predictions of AGB density to estimates of AGB stocks for the full domain. The lidar-informed coregionalization models consistently outperformed their counterpart lidar-free models in terms of point-level predictive performance and total AGB precision. Additionally, the inclusion of Landsat-derived forest cover as a covariate further improved estimation precision in regions with lower lidar sampling intensity. Our findings also demonstrate that model-based approaches that do not explicitly account for residual spatial dependence can grossly underestimate uncertainty, resulting in falsely precise estimates of AGB. On the other hand, in a geostatistical setting, residual spatial structure can be modeled within a Bayesian hierarchical framework to obtain statistically defensible assessments of uncertainty for AGB estimates.
A model-based approach to wildland fire reconstruction using sediment charcoal records
Itter, Malcolm S.; Finley, Andrew O.; Hooten, Mevin B.; Higuera, Philip E.; Marlon, Jennifer R.; Kelly, Ryan; McLachlan, Jason S.
2017-01-01
Lake sediment charcoal records are used in paleoecological analyses to reconstruct fire history, including the identification of past wildland fires. One challenge of applying sediment charcoal records to infer fire history is the separation of charcoal associated with local fire occurrence and charcoal originating from regional fire activity. Despite a variety of methods to identify local fires from sediment charcoal records, an integrated statistical framework for fire reconstruction is lacking. We develop a Bayesian point process model to estimate the probability of fire associated with charcoal counts from individual-lake sediments and estimate mean fire return intervals. A multivariate extension of the model combines records from multiple lakes to reduce uncertainty in local fire identification and estimate a regional mean fire return interval. The univariate and multivariate models are applied to 13 lakes in the Yukon Flats region of Alaska. Both models resulted in similar mean fire return intervals (100–350 years) with reduced uncertainty under the multivariate model due to improved estimation of regional charcoal deposition. The point process model offers an integrated statistical framework for paleofire reconstruction and extends existing methods to infer regional fire history from multiple lake records with uncertainty following directly from posterior distributions.
Rediscovery of Good-Turing estimators via Bayesian nonparametrics.
Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye
2016-03-01
The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
ERIC Educational Resources Information Center
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B.; Neyer, Franz J.; van Aken, Marcel A. G.
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are…
Kärkkäinen, Hanni P; Sillanpää, Mikko J
2013-09-04
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
Kärkkäinen, Hanni P.; Sillanpää, Mikko J.
2013-01-01
Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
NASA Astrophysics Data System (ADS)
Duan, Y.; Durand, M. T.; Jezek, K. C.; Yardim, C.; Bringer, A.; Aksoy, M.; Johnson, J. T.
2017-12-01
The ultra-wideband software-defined microwave radiometer (UWBRAD) is designed to provide ice sheet internal temperature product via measuring low frequency microwave emission. Twelve channels ranging from 0.5 to 2.0 GHz are covered by the instrument. A Greenland air-borne demonstration was demonstrated in September 2016, provided first demonstration of Ultra-wideband radiometer observations of geophysical scenes, including ice sheets. Another flight is planned for September 2017 for acquiring measurements in central ice sheet. A Bayesian framework is designed to retrieve the ice sheet internal temperature from simulated UWBRAD brightness temperature (Tb) measurements over Greenland flight path with limited prior information of the ground. A 1-D heat-flow model, the Robin Model, was used to model the ice sheet internal temperature profile with ground information. Synthetic UWBRAD Tb observations was generated via the partially coherent radiation transfer model, which utilizes the Robin model temperature profile and an exponential fit of ice density from Borehole measurement as input, and corrupted with noise. The effective surface temperature, geothermal heat flux, the variance of upper layer ice density, and the variance of fine scale density variation at deeper ice sheet were treated as unknown variables within the retrieval framework. Each parameter is defined with its possible range and set to be uniformly distributed. The Markov Chain Monte Carlo (MCMC) approach is applied to make the unknown parameters randomly walk in the parameter space. We investigate whether the variables can be improved over priors using the MCMC approach and contribute to the temperature retrieval theoretically. UWBRAD measurements near camp century from 2016 was also treated with the MCMC to examine the framework with scattering effect. The fine scale density fluctuation is an important parameter. It is the most sensitive yet highly unknown parameter in the estimation framework. Including the fine scale density fluctuation greatly improved the retrieval results. The ice sheet vertical temperature profile, especially the 10m temperature, can be well retrieved via the MCMC process. Future retrieval work will apply the Bayesian approach to UWBRAD airborne measurements.
IMAGINE: Interstellar MAGnetic field INference Engine
NASA Astrophysics Data System (ADS)
Steininger, Theo
2018-03-01
IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.
Non-Bayesian Optical Inference Machines
NASA Astrophysics Data System (ADS)
Kadar, Ivan; Eichmann, George
1987-01-01
In a recent paper, Eichmann and Caulfield) presented a preliminary exposition of optical learning machines suited for use in expert systems. In this paper, we extend the previous ideas by introducing learning as a means of reinforcement by information gathering and reasoning with uncertainty in a non-Bayesian framework2. More specifically, the non-Bayesian approach allows the representation of total ignorance (not knowing) as opposed to assuming equally likely prior distributions.
Asteroid orbital error analysis: Theory and application
NASA Technical Reports Server (NTRS)
Muinonen, K.; Bowell, Edward
1992-01-01
We present a rigorous Bayesian theory for asteroid orbital error estimation in which the probability density of the orbital elements is derived from the noise statistics of the observations. For Gaussian noise in a linearized approximation the probability density is also Gaussian, and the errors of the orbital elements at a given epoch are fully described by the covariance matrix. The law of error propagation can then be applied to calculate past and future positional uncertainty ellipsoids (Cappellari et al. 1976, Yeomans et al. 1987, Whipple et al. 1991). To our knowledge, this is the first time a Bayesian approach has been formulated for orbital element estimation. In contrast to the classical Fisherian school of statistics, the Bayesian school allows a priori information to be formally present in the final estimation. However, Bayesian estimation does give the same results as Fisherian estimation when no priori information is assumed (Lehtinen 1988, and reference therein).
Bayesian Inference for Time Trends in Parameter Values using Weighted Evidence Sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
D. L. Kelly; A. Malkhasyan
2010-09-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performedmore » for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “in-dustry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an applica-tion of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates an approach to incorporating multiple sources of data via applicability weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dana L. Kelly; Albert Malkhasyan
2010-06-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performedmore » for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates the development of a generic prior distribution, which incorporates multiple sources of generic data via weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.« less
Sallam, Hesham M; Seiffert, Erik R
2016-01-01
The Fayum Depression of Egypt has yielded fossils of hystricognathous rodents from multiple Eocene and Oligocene horizons that range in age from ∼37 to ∼30 Ma and document several phases in the early evolution of crown Hystricognathi and one of its major subclades, Phiomorpha. Here we describe two new genera and species of basal phiomorphs, Birkamys korai and Mubhammys vadumensis, based on rostra and maxillary and mandibular remains from the terminal Eocene (∼34 Ma) Fayum Locality 41 (L-41). Birkamys is the smallest known Paleogene hystricognath, has very simple molars, and, like derived Oligocene-to-Recent phiomorphs (but unlike contemporaneous and older taxa) apparently retained dP(4)∕4 late into life, with no evidence for P(4)∕4 eruption or formation. Mubhammys is very similar in dental morphology to Birkamys, and also shows no evidence for P(4)∕4 formation or eruption, but is considerably larger. Though parsimony analysis with all characters equally weighted places Birkamys and Mubhammys as sister taxa of extant Thryonomys to the exclusion of much younger relatives of that genus, all other methods (standard Bayesian inference, Bayesian "tip-dating," and parsimony analysis with scaled transitions between "fixed" and polymorphic states) place these species in more basal positions within Hystricognathi, as sister taxa of Oligocene-to-Recent phiomorphs. We also employ tip-dating as a means for estimating the ages of early hystricognath-bearing localities, many of which are not well-constrained by geological, geochronological, or biostratigraphic evidence. By simultaneously taking into account phylogeny, evolutionary rates, and uniform priors that appropriately encompass the range of possible ages for fossil localities, dating of tips in this Bayesian framework allows paleontologists to move beyond vague and assumption-laden "stage of evolution" arguments in biochronology to provide relatively rigorous age assessments of poorly-constrained faunas. This approach should become increasingly robust as estimates are combined from multiple independent analyses of distantly related clades, and is broadly applicable across the tree of life; as such it is deserving of paleontologists' close attention. Notably, in the example provided here, hystricognathous rodents from Libya and Namibia that are controversially considered to be of middle Eocene age are instead estimated to be of late Eocene and late Oligocene age, respectively. Finally, we reconstruct the evolution of first lower molar size among Paleogene African hystricognaths using a Bayesian approach; the results of this analysis reconstruct a rapid latest Eocene dwarfing event along the lineage leading to Birkamys.
ERIC Educational Resources Information Center
Marcoulides, Katerina M.
2018-01-01
This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods were examined. Overall results showed that…
Zonta, Zivko J; Flotats, Xavier; Magrí, Albert
2014-08-01
The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Xingyuan; Miller, Gretchen R.; Rubin, Yoram
2012-09-13
The heat pulse method is widely used to measure water flux through plants; it works by inferring the velocity of water through a porous medium from the speed at which a heat pulse is propagated through the system. No systematic, non-destructive calibration procedure exists to determine the site-specific parameters necessary for calculating sap velocity, e.g., wood thermal diffusivity and probe spacing. Such parameter calibration is crucial to obtain the correct transpiration flux density from the sap flow measurements at the plant scale; and consequently, to up-scale tree-level water fluxes to canopy and landscape scales. The purpose of this study ismore » to present a statistical framework for estimating the wood thermal diffusivity and probe spacing simutaneously from in-situ heat response curves collected by the implanted probes of a heat ratio apparatus. Conditioned on the time traces of wood temperature following a heat pulse, the parameters are inferred using a Bayesian inversion technique, based on the Markov chain Monte Carlo sampling method. The primary advantage of the proposed methodology is that it does not require known probe spacing or any further intrusive sampling of sapwood. The Bayesian framework also enables direct quantification of uncertainty in estimated sap flow velocity. Experiments using synthetic data show that repeated tests using the same apparatus are essential to obtain reliable and accurate solutions. When applied to field conditions, these tests are conducted during different seasons and automated using the existing data logging system. The seasonality of wood thermal diffusivity is obtained as a by-product of the parameter estimation process, and it is shown to be affected by both moisture content and temperature. Empirical factors are often introduced to account for the influence of non-ideal probe geometry on the estimation of heat pulse velocity, and they are estimated in this study as well. The proposed methodology can be applied for the calibration of existing heat ratio sap flow systems at other sites. It is especially useful when an alternative transpiration calibration device, such as a lysimeter, is not available.« less
Probabilistic segmentation and intensity estimation for microarray images.
Gottardo, Raphael; Besag, Julian; Stephens, Matthew; Murua, Alejandro
2006-01-01
We describe a probabilistic approach to simultaneous image segmentation and intensity estimation for complementary DNA microarray experiments. The approach overcomes several limitations of existing methods. In particular, it (a) uses a flexible Markov random field approach to segmentation that allows for a wider range of spot shapes than existing methods, including relatively common 'doughnut-shaped' spots; (b) models the image directly as background plus hybridization intensity, and estimates the two quantities simultaneously, avoiding the common logical error that estimates of foreground may be less than those of the corresponding background if the two are estimated separately; and (c) uses a probabilistic modeling approach to simultaneously perform segmentation and intensity estimation, and to compute spot quality measures. We describe two approaches to parameter estimation: a fast algorithm, based on the expectation-maximization and the iterated conditional modes algorithms, and a fully Bayesian framework. These approaches produce comparable results, and both appear to offer some advantages over other methods. We use an HIV experiment to compare our approach to two commercial software products: Spot and Arrayvision.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.
Improving the quantification of contrast enhanced ultrasound using a Bayesian approach
NASA Astrophysics Data System (ADS)
Rizzo, Gaia; Tonietto, Matteo; Castellaro, Marco; Raffeiner, Bernd; Coran, Alessandro; Fiocco, Ugo; Stramare, Roberto; Grisan, Enrico
2017-03-01
Contrast Enhanced Ultrasound (CEUS) is a sensitive imaging technique to assess tissue vascularity, that can be useful in the quantification of different perfusion patterns. This can be particularly important in the early detection and staging of arthritis. In a recent study we have shown that a Gamma-variate can accurately quantify synovial perfusion and it is flexible enough to describe many heterogeneous patterns. Moreover, we have shown that through a pixel-by-pixel analysis the quantitative information gathered characterizes more effectively the perfusion. However, the SNR ratio of the data and the nonlinearity of the model makes the parameter estimation difficult. Using classical non-linear-leastsquares (NLLS) approach the number of unreliable estimates (those with an asymptotic coefficient of variation greater than a user-defined threshold) is significant, thus affecting the overall description of the perfusion kinetics and of its heterogeneity. In this work we propose to solve the parameter estimation at the pixel level within a Bayesian framework using Variational Bayes (VB), and an automatic and data-driven prior initialization. When evaluating the pixels for which both VB and NLLS provided reliable estimates, we demonstrated that the parameter values provided by the two methods are well correlated (Pearson's correlation between 0.85 and 0.99). Moreover, the mean number of unreliable pixels drastically reduces from 54% (NLLS) to 26% (VB), without increasing the computational time (0.05 s/pixel for NLLS and 0.07 s/pixel for VB). When considering the efficiency of the algorithms as computational time per reliable estimate, VB outperforms NLLS (0.11 versus 0.25 seconds per reliable estimate respectively).
McNally, Richard J.; Heeren, Alexandre; Robinaugh, Donald J.
2017-01-01
ABSTRACT Background: The network approach to mental disorders offers a novel framework for conceptualizing posttraumatic stress disorder (PTSD) as a causal system of interacting symptoms. Objective: In this study, we extended this work by estimating the structure of relations among PTSD symptoms in adults reporting personal histories of childhood sexual abuse (CSA; N = 179). Method: We employed two complementary methods. First, using the graphical LASSO, we computed a sparse, regularized partial correlation network revealing associations (edges) between pairs of PTSD symptoms (nodes). Next, using a Bayesian approach, we computed a directed acyclic graph (DAG) to estimate a directed, potentially causal model of the relations among symptoms. Results: For the first network, we found that physiological reactivity to reminders of trauma, dreams about the trauma, and lost of interest in previously enjoyed activities were highly central nodes. However, stability analyses suggest that these findings were unstable across subsets of our sample. The DAG suggests that becoming physiologically reactive and upset in response to reminders of the trauma may be key drivers of other symptoms in adult survivors of CSA. Conclusions: Our study illustrates the strengths and limitations of these network analytic approaches to PTSD. PMID:29038690
The Empirical Distribution of Singletons for Geographic Samples of DNA Sequences.
Cubry, Philippe; Vigouroux, Yves; François, Olivier
2017-01-01
Rare variants are important for drawing inference about past demographic events in a species history. A singleton is a rare variant for which genetic variation is carried by a unique chromosome in a sample. How singletons are distributed across geographic space provides a local measure of genetic diversity that can be measured at the individual level. Here, we define the empirical distribution of singletons in a sample of chromosomes as the proportion of the total number of singletons that each chromosome carries, and we present a theoretical background for studying this distribution. Next, we use computer simulations to evaluate the potential for the empirical distribution of singletons to provide a description of genetic diversity across geographic space. In a Bayesian framework, we show that the empirical distribution of singletons leads to accurate estimates of the geographic origin of range expansions. We apply the Bayesian approach to estimating the origin of the cultivated plant species Pennisetum glaucum [L.] R. Br . (pearl millet) in Africa, and find support for range expansion having started from Northern Mali. Overall, we report that the empirical distribution of singletons is a useful measure to analyze results of sequencing projects based on large scale sampling of individuals across geographic space.
Quantum state estimation when qubits are lost: a no-data-left-behind approach
Williams, Brian P.; Lougovski, Pavel
2017-04-06
We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.
Vilar, M J; Ranta, J; Virtanen, S; Korkeala, H
2015-01-01
Bayesian analysis was used to estimate the pig's and herd's true prevalence of enteropathogenic Yersinia in serum samples collected from Finnish pig farms. The sensitivity and specificity of the diagnostic test were also estimated for the commercially available ELISA which is used for antibody detection against enteropathogenic Yersinia. The Bayesian analysis was performed in two steps; the first step estimated the prior true prevalence of enteropathogenic Yersinia with data obtained from a systematic review of the literature. In the second step, data of the apparent prevalence (cross-sectional study data), prior true prevalence (first step), and estimated sensitivity and specificity of the diagnostic methods were used for building the Bayesian model. The true prevalence of Yersinia in slaughter-age pigs was 67.5% (95% PI 63.2-70.9). The true prevalence of Yersinia in sows was 74.0% (95% PI 57.3-82.4). The estimates of sensitivity and specificity values of the ELISA were 79.5% and 96.9%.
Estimating abundance and density of Amur tigers along the Sino-Russian border.
Xiao, Wenhong; Feng, Limin; Mou, Pu; Miquelle, Dale G; Hebblewhite, Mark; Goldberg, Joshua F; Robinson, Hugh S; Zhao, Xiaodan; Zhou, Bo; Wang, Tianming; Ge, Jianping
2016-07-01
As an apex predator the Amur tiger (Panthera tigris altaica) could play a pivotal role in maintaining the integrity of forest ecosystems in Northeast Asia. Due to habitat loss and harvest over the past century, tigers rapidly declined in China and are now restricted to the Russian Far East and bordering habitat in nearby China. To facilitate restoration of the tiger in its historical range, reliable estimates of population size are essential to assess effectiveness of conservation interventions. Here we used camera trap data collected in Hunchun National Nature Reserve from April to June 2013 and 2014 to estimate tiger density and abundance using both maximum likelihood and Bayesian spatially explicit capture-recapture (SECR) methods. A minimum of 8 individuals were detected in both sample periods and the documentation of marking behavior and reproduction suggests the presence of a resident population. Using Bayesian SECR modeling within the 11 400 km(2) state space, density estimates were 0.33 and 0.40 individuals/100 km(2) in 2013 and 2014, respectively, corresponding to an estimated abundance of 38 and 45 animals for this transboundary Sino-Russian population. In a maximum likelihood framework, we estimated densities of 0.30 and 0.24 individuals/100 km(2) corresponding to abundances of 34 and 27, in 2013 and 2014, respectively. These density estimates are comparable to other published estimates for resident Amur tiger populations in the Russian Far East. This study reveals promising signs of tiger recovery in Northeast China, and demonstrates the importance of connectivity between the Russian and Chinese populations for recovering tigers in Northeast China. © 2016 International Society of Zoological Sciences, Institute of Zoology/Chinese Academy of Sciences and John Wiley & Sons Australia, Ltd.
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an apriori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that the change of the true of parameter relatively to another will change the value of standard deviation in an opposite direction. For a perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypothesis as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
Posterior Predictive Model Checking in Bayesian Networks
ERIC Educational Resources Information Center
Crawford, Aaron
2014-01-01
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
Bayesian sample size calculations in phase II clinical trials using a mixture of informative priors.
Gajewski, Byron J; Mayo, Matthew S
2006-08-15
A number of researchers have discussed phase II clinical trials from a Bayesian perspective. A recent article by Mayo and Gajewski focuses on sample size calculations, which they determine by specifying an informative prior distribution and then calculating a posterior probability that the true response will exceed a prespecified target. In this article, we extend these sample size calculations to include a mixture of informative prior distributions. The mixture comes from several sources of information. For example consider information from two (or more) clinicians. The first clinician is pessimistic about the drug and the second clinician is optimistic. We tabulate the results for sample size design using the fact that the simple mixture of Betas is a conjugate family for the Beta- Binomial model. We discuss the theoretical framework for these types of Bayesian designs and show that the Bayesian designs in this paper approximate this theoretical framework. Copyright 2006 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Wang, Hongrui; Wang, Cheng; Wang, Ying; Gao, Xiong; Yu, Chen
2017-06-01
This paper presents a Bayesian approach using Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method for daily river flow rate forecast and uncertainty quantification for Zhujiachuan River using data collected from Qiaotoubao Gage Station and other 13 gage stations in Zhujiachuan watershed in China. The proposed method is also compared with the conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it performs over the conventional MLE method on uncertainty quantification, providing relatively narrower reliable interval than the MLE confidence interval and thus more precise estimation by using the related information from regional gage stations. The Bayesian MCMC method might be more favorable in the uncertainty analysis and risk management.
Generalizability of Evidence-Based Assessment Recommendations for Pediatric Bipolar Disorder
Jenkins, Melissa M.; Youngstrom, Eric A.; Youngstrom, Jennifer Kogos; Feeny, Norah C.; Findling, Robert L.
2013-01-01
Bipolar disorder is frequently clinically diagnosed in youths who do not actually satisfy DSM-IV criteria, yet cases that would satisfy full DSM-IV criteria are often undetected clinically. Evidence-based assessment methods that incorporate Bayesian reasoning have demonstrated improved diagnostic accuracy, and consistency; however, their clinical utility is largely unexplored. The present study examines the effectiveness of promising evidence-based decision-making compared to the clinical gold standard. Participants were 562 youth, ages 5-17 and predominantly African American, drawn from a community mental health clinic. Research diagnoses combined semi-structured interview with youths’ psychiatric, developmental, and family mental health histories. Independent Bayesian estimates relied on published risk estimates from other samples discriminated bipolar diagnoses, Area Under Curve=.75, p<.00005. The Bayes and confidence ratings correlated rs =.30. Agreement about an evidence-based assessment intervention “threshold model” (wait/assess/treat) had K=.24, p<.05. No potential moderators of agreement between the Bayesian estimates and confidence ratings, including type of bipolar illness, were significant. Bayesian risk estimates were highly correlated with logistic regression estimates using optimal sample weights, r=.81, p<.0005. Clinical and Bayesian approaches agree in terms of overall concordance and deciding next clinical action, even when Bayesian predictions are based on published estimates from clinically and demographically different samples. Evidence-based assessment methods may be useful in settings that cannot routinely employ gold standard assessments, and they may help decrease rates of overdiagnosis while promoting earlier identification of true cases. PMID:22004538
Zheng, Qi; Grice, Elizabeth A
2016-10-01
Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost's algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost.
Uncertainty quantification in LES of channel flow
Safta, Cosmin; Blaylock, Myra; Templeton, Jeremy; ...
2016-07-12
Here, in this paper, we present a Bayesian framework for estimating joint densities for large eddy simulation (LES) sub-grid scale model parameters based on canonical forced isotropic turbulence direct numerical simulation (DNS) data. The framework accounts for noise in the independent variables, and we present alternative formulations for accounting for discrepancies between model and data. To generate probability densities for flow characteristics, posterior densities for sub-grid scale model parameters are propagated forward through LES of channel flow and compared with DNS data. Synthesis of the calibration and prediction results demonstrates that model parameters have an explicit filter width dependence andmore » are highly correlated. Discrepancies between DNS and calibrated LES results point to additional model form inadequacies that need to be accounted for.« less
Shahian, David M; He, Xia; Jacobs, Jeffrey P; Kurlansky, Paul A; Badhwar, Vinay; Cleveland, Joseph C; Fazzalari, Frank L; Filardo, Giovanni; Normand, Sharon-Lise T; Furnary, Anthony P; Magee, Mitchell J; Rankin, J Scott; Welke, Karl F; Han, Jane; O'Brien, Sean M
2015-10-01
Previous composite performance measures of The Society of Thoracic Surgeons (STS) were estimated at the STS participant level, typically a hospital or group practice. The STS Quality Measurement Task Force has now developed a multiprocedural, multidimensional composite measure suitable for estimating the performance of individual surgeons. The development sample from the STS National Database included 621,489 isolated coronary artery bypass grafting procedures, isolated aortic valve replacement, aortic valve replacement plus coronary artery bypass grafting, mitral, or mitral plus coronary artery bypass grafting procedures performed by 2,286 surgeons between July 1, 2011, and June 30, 2014. Each surgeon's composite score combined their aggregate risk-adjusted mortality and major morbidity rates (each weighted inversely by their standard deviations) and reflected the proportion of case types they performed. Model parameters were estimated in a Bayesian framework. Composite star ratings were examined using 90%, 95%, or 98% Bayesian credible intervals. Measure reliability was estimated using various 3-year case thresholds. The final composite measure was defined as 0.81 × (1 minus risk-standardized mortality rate) + 0.19 × (1 minus risk-standardized complication rate). Risk-adjusted mortality (median, 2.3%; interquartile range, 1.7% to 3.0%), morbidity (median, 13.7%; interquartile range, 10.8% to 17.1%), and composite scores (median, 95.4%; interquartile range, 94.4% to 96.3%) varied substantially across surgeons. Using 98% Bayesian credible intervals, there were 207 1-star (lower performance) surgeons (9.1%), 1,701 2-star (as-expected performance) surgeons (74.4%), and 378 3-star (higher performance) surgeons (16.5%). With an eligibility threshold of 100 cases over 3 years, measure reliability was 0.81. The STS has developed a multiprocedural composite measure suitable for evaluating performance at the individual surgeon level. Copyright © 2015 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Modeling two strains of disease via aggregate-level infectivity curves.
Romanescu, Razvan; Deardon, Rob
2016-04-01
Well formulated models of disease spread, and efficient methods to fit them to observed data, are powerful tools for aiding the surveillance and control of infectious diseases. Our project considers the problem of the simultaneous spread of two related strains of disease in a context where spatial location is the key driver of disease spread. We start our modeling work with the individual level models (ILMs) of disease transmission, and extend these models to accommodate the competing spread of the pathogens in a two-tier hierarchical population (whose levels we refer to as 'farm' and 'animal'). The postulated interference mechanism between the two strains is a period of cross-immunity following infection. We also present a framework for speeding up the computationally intensive process of fitting the ILM to data, typically done using Markov chain Monte Carlo (MCMC) in a Bayesian framework, by turning the inference into a two-stage process. First, we approximate the number of animals infected on a farm over time by infectivity curves. These curves are fit to data sampled from farms, using maximum likelihood estimation, then, conditional on the fitted curves, Bayesian MCMC inference proceeds for the remaining parameters. Finally, we use posterior predictive distributions of salient epidemic summary statistics, in order to assess the model fitted.
Fundamentals and Recent Developments in Approximate Bayesian Computation
Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka
2017-01-01
Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
NASA Astrophysics Data System (ADS)
Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew
2007-04-01
One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. In previous research, the theoretical basis and benefits of the hybrid approach have been developed. However, lacking is a concrete experimental comparison of the hybrid framework with traditional fusion methods, to demonstrate and quantify this benefit. The goal of this research, therefore, is to provide a statistical analysis on the comparison of the accuracy and performance of hybrid network theory, with pure Bayesian and Fuzzy systems and an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo Simulation, in comparison to situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed, to quantify the benefit of hybrid inference to other fusion tools.
Characterizing the Nash equilibria of three-player Bayesian quantum games
NASA Astrophysics Data System (ADS)
Solmeyer, Neal; Balu, Radhakrishnan
2017-05-01
Quantum games with incomplete information can be studied within a Bayesian framework. We analyze games quantized within the EWL framework [Eisert, Wilkens, and Lewenstein, Phys Rev. Lett. 83, 3077 (1999)]. We solve for the Nash equilibria of a variety of two-player quantum games and compare the results to the solutions of the corresponding classical games. We then analyze Bayesian games where there is uncertainty about the player types in two-player conflicting interest games. The solutions to the Bayesian games are found to have a phase diagram-like structure where different equilibria exist in different parameter regions, depending both on the amount of uncertainty and the degree of entanglement. We find that in games where a Pareto-optimal solution is not a Nash equilibrium, it is possible for the quantized game to have an advantage over the classical version. In addition, we analyze the behavior of the solutions as the strategy choices approach an unrestricted operation. We find that some games have a continuum of solutions, bounded by the solutions of a simpler restricted game. A deeper understanding of Bayesian quantum game theory could lead to novel quantum applications in a multi-agent setting.
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
With or without you: predictive coding and Bayesian inference in the brain
Aitchison, Laurence; Lengyel, Máté
2018-01-01
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084
Down, P M; Bradley, A J; Breen, J E; Browne, W J; Kypraios, T; Green, M J
2016-10-01
Importance of the dry period with respect to mastitis control is now well established although the precise interventions that reduce the risk of acquiring intramammary infections during this time are not clearly understood. There are very few intervention studies that have measured the clinical efficacy of specific mastitis interventions within a cost-effectiveness framework so there remains a large degree of uncertainty about the impact of a specific intervention and its costeffectiveness. The aim of this study was to use a Bayesian framework to investigate the cost-effectiveness of mastitis controls during the dry period. Data were assimilated from 77 UK dairy farms that participated in a British national mastitis control programme during 2009-2012 in which the majority of intramammary infections were acquired during the dry period. The data consisted of clinical mastitis (CM) and somatic cell count (SCC) records, herd management practices and details of interventions that were implemented by the farmer as part of the control plan. The outcomes used to measure the effectiveness of the interventions were i) changes in the incidence rate of clinical mastitis during the first 30days after calving and ii) the rate at which cows gained new infections during the dry period (measured by SCC changes across the dry period from <200,000cells/ml to >200,000cells/ml). A Bayesian one-step microsimulation model was constructed such that posterior predictions from the model incorporated uncertainty in all parameters. The incremental net benefit was calculated across 10,000 Markov chain Monte Carlo iterations, to estimate the cost-benefit (and associated uncertainty) of each mastitis intervention. Interventions identified as being cost-effective in most circumstances included selecting dry-cow therapy at the cow level, dry-cow rations formulated by a qualified nutritionist, use of individual calving pens, first milking cows within 24h of calving and spreading bedding evenly in dry-cow yards. The results of this study highlighted the efficacy of specific mastitis interventions in UK conditions which, when incorporated into a costeffectiveness framework, can be used to optimize decision making in mastitis control. This intervention study provides an example of how an intuitive and clinically useful Bayesian approach can be used to form the basis of an on-farm decision support tool. Copyright © 2016. Published by Elsevier B.V.
Probabilistic Approaches for Multi-Hazard Risk Assessment of Structures and Systems
NASA Astrophysics Data System (ADS)
Kwag, Shinyoung
Performance assessment of structures, systems, and components for multi-hazard scenarios has received significant attention in recent years. However, the concept of multi-hazard analysis is quite broad in nature and the focus of existing literature varies across a wide range of problems. In some cases, such studies focus on hazards that either occur simultaneously or are closely correlated with each other. For example, seismically induced flooding or seismically induced fires. In other cases, multi-hazard studies relate to hazards that are not dependent or correlated but have strong likelihood of occurrence at different times during the lifetime of a structure. The current approaches for risk assessment need enhancement to account for multi-hazard risks. It must be able to account for uncertainty propagation in a systems-level analysis, consider correlation among events or failure modes, and allow integration of newly available information from continually evolving simulation models, experimental observations, and field measurements. This dissertation presents a detailed study that proposes enhancements by incorporating Bayesian networks and Bayesian updating within a performance-based probabilistic framework. The performance-based framework allows propagation of risk as well as uncertainties in the risk estimates within a systems analysis. Unlike conventional risk assessment techniques such as a fault-tree analysis, a Bayesian network can account for statistical dependencies and correlations among events/hazards. The proposed approach is extended to develop a risk-informed framework for quantitative validation and verification of high fidelity system-level simulation tools. Validation of such simulations can be quite formidable within the context of a multi-hazard risk assessment in nuclear power plants. The efficiency of this approach lies in identification of critical events, components, and systems that contribute to the overall risk. Validation of any event or component on the critical path is relatively more important in a risk-informed environment. Significance of multi-hazard risk is also illustrated for uncorrelated hazards of earthquakes and high winds which may result in competing design objectives. It is also illustrated that the number of computationally intensive nonlinear simulations needed in performance-based risk assessment for external hazards can be significantly reduced by using the power of Bayesian updating in conjunction with the concept of equivalent limit-state.
2014-10-02
intervals (Neil, Tailor, Marquez, Fenton , & Hear, 2007). This is cumbersome, error prone and usually inaccurate. Even though a universal framework...Science. Neil, M., Tailor, M., Marquez, D., Fenton , N., & Hear. (2007). Inference in Bayesian networks using dynamic discretisation. Statistics
Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2004 - Annual Report
This report describes EPA's Hierarchical Bayesian model-generated (HBM) estimates of O3 and PM2.5 concentrations throughout the continental United States during the 2004 calendar year. HBM estimates provide the spatial and temporal variance of O3 ...
Bayesian Estimation Supersedes the "t" Test
ERIC Educational Resources Information Center
Kruschke, John K.
2013-01-01
Bayesian estimation for 2 groups provides complete distributions of credible values for the effect size, group means and their difference, standard deviations and their difference, and the normality of the data. The method handles outliers. The decision rule can accept the null value (unlike traditional "t" tests) when certainty in the estimate is…
In this paper, we present methods for estimating Freundlich isotherm fitting parameters (K and N) and their joint uncertainty, which have been implemented into the freeware software platforms R and WinBUGS. These estimates were determined by both Frequentist and Bayesian analyse...
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Testing students’ e-learning via Facebook through Bayesian structural equation modeling
Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students’ intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods’ results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated. PMID:28886019
Flood quantile estimation at ungauged sites by Bayesian networks
NASA Astrophysics Data System (ADS)
Mediero, L.; Santillán, D.; Garrote, L.
2012-04-01
Estimating flood quantiles at a site for which no observed measurements are available is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. The most common technique used is the multiple regression analysis, which relates physical and climatic basin characteristic to flood quantiles. Regression equations are fitted from flood frequency data and basin characteristics at gauged sites. Regression equations are a rigid technique that assumes linear relationships between variables and cannot take the measurement errors into account. In addition, the prediction intervals are estimated in a very simplistic way from the variance of the residuals in the estimated model. Bayesian networks are a probabilistic computational structure taken from the field of Artificial Intelligence, which have been widely and successfully applied to many scientific fields like medicine and informatics, but application to the field of hydrology is recent. Bayesian networks infer the joint probability distribution of several related variables from observations through nodes, which represent random variables, and links, which represent causal dependencies between them. A Bayesian network is more flexible than regression equations, as they capture non-linear relationships between variables. In addition, the probabilistic nature of Bayesian networks allows taking the different sources of estimation uncertainty into account, as they give a probability distribution as result. A homogeneous region in the Tagus Basin was selected as case study. A regression equation was fitted taking the basin area, the annual maximum 24-hour rainfall for a given recurrence interval and the mean height as explanatory variables. Flood quantiles at ungauged sites were estimated by Bayesian networks. Bayesian networks need to be learnt from a huge enough data set. As observational data are reduced, a stochastic generator of synthetic data was developed. Synthetic basin characteristics were randomised, keeping the statistical properties of observed physical and climatic variables in the homogeneous region. The synthetic flood quantiles were stochastically generated taking the regression equation as basis. The learnt Bayesian network was validated by the reliability diagram, the Brier Score and the ROC diagram, which are common measures used in the validation of probabilistic forecasts. Summarising, the flood quantile estimations through Bayesian networks supply information about the prediction uncertainty as a probability distribution function of discharges is given as result. Therefore, the Bayesian network model has application as a decision support for water resources and planning management.
NASA Astrophysics Data System (ADS)
Cheung, Shao-Yong; Lee, Chieh-Han; Yu, Hwa-Lung
2017-04-01
Due to the limited hydrogeological observation data and high levels of uncertainty within, parameter estimation of the groundwater model has been an important issue. There are many methods of parameter estimation, for example, Kalman filter provides a real-time calibration of parameters through measurement of groundwater monitoring wells, related methods such as Extended Kalman Filter and Ensemble Kalman Filter are widely applied in groundwater research. However, Kalman Filter method is limited to linearity. This study propose a novel method, Bayesian Maximum Entropy Filtering, which provides a method that can considers the uncertainty of data in parameter estimation. With this two methods, we can estimate parameter by given hard data (certain) and soft data (uncertain) in the same time. In this study, we use Python and QGIS in groundwater model (MODFLOW) and development of Extended Kalman Filter and Bayesian Maximum Entropy Filtering in Python in parameter estimation. This method may provide a conventional filtering method and also consider the uncertainty of data. This study was conducted through numerical model experiment to explore, combine Bayesian maximum entropy filter and a hypothesis for the architecture of MODFLOW groundwater model numerical estimation. Through the virtual observation wells to simulate and observe the groundwater model periodically. The result showed that considering the uncertainty of data, the Bayesian maximum entropy filter will provide an ideal result of real-time parameters estimation.
A Bayesian kriging approach for blending satellite and ground precipitation observations
Verdin, Andrew P.; Rajagopalan, Balaji; Kleiber, William; Funk, Christopher C.
2015-01-01
Drought and flood management practices require accurate estimates of precipitation. Gauge observations, however, are often sparse in regions with complicated terrain, clustered in valleys, and of poor quality. Consequently, the spatial extent of wet events is poorly represented. Satellite-derived precipitation data are an attractive alternative, though they tend to underestimate the magnitude of wet events due to their dependency on retrieval algorithms and the indirect relationship between satellite infrared observations and precipitation intensities. Here we offer a Bayesian kriging approach for blending precipitation gauge data and the Climate Hazards Group Infrared Precipitation satellite-derived precipitation estimates for Central America, Colombia, and Venezuela. First, the gauge observations are modeled as a linear function of satellite-derived estimates and any number of other variables—for this research we include elevation. Prior distributions are defined for all model parameters and the posterior distributions are obtained simultaneously via Markov chain Monte Carlo sampling. The posterior distributions of these parameters are required for spatial estimation, and thus are obtained prior to implementing the spatial kriging model. This functional framework is applied to model parameters obtained by sampling from the posterior distributions, and the residuals of the linear model are subject to a spatial kriging model. Consequently, the posterior distributions and uncertainties of the blended precipitation estimates are obtained. We demonstrate this method by applying it to pentadal and monthly total precipitation fields during 2009. The model's performance and its inherent ability to capture wet events are investigated. We show that this blending method significantly improves upon the satellite-derived estimates and is also competitive in its ability to represent wet events. This procedure also provides a means to estimate a full conditional distribution of the “true” observed precipitation value at each grid cell.
NASA Astrophysics Data System (ADS)
Cui, Y.; Brioude, J. F.; Angevine, W. M.; McKeen, S. A.; Henze, D. K.; Bousserez, N.; Liu, Z.; McDonald, B.; Peischl, J.; Ryerson, T. B.; Frost, G. J.; Trainer, M.
2016-12-01
Production of unconventional natural gas grew rapidly during the past ten years in the US which led to an increase in emissions of methane (CH4) and, depending on the shale region, nitrogen oxides (NOx). In terms of radiative forcing, CH4 is the second most important greenhouse gas after CO2. NOx is a precursor of ozone (O3) in the troposphere and nitrate particles, both of which are regulated by the US Clean Air Act. Emission estimates of CH4 and NOx from the shale regions are still highly uncertain. We present top-down estimates of CH4 and NOx surface fluxes from the Haynesville and Fayetteville shale production regions using aircraft data collected during the Southeast Nexus of Climate Change and Air Quality (SENEX) field campaign (June-July, 2013) and the Shale Oil and Natural Gas Nexus (SONGNEX) field campaign (March-May, 2015) within a mesoscale inversion framework. The inversion method is based on a mesoscale Bayesian inversion system using multiple transport models. EPA's 2011 National CH4 and NOx Emission Inventories are used as prior information to optimize CH4 and NOx emissions. Furthermore, the posterior CH4 emission estimates are used to constrain NOx emission estimates using a flux ratio inversion technique. Sensitivity of the posterior estimates to the use of off-diagonal terms in the error covariance matrices, the transport models, and prior estimates is discussed. Compared to the ground-based in-situ observations, the optimized CH4 and NOx inventories improve ground level CH4 and O3 concentrations calculated by the Weather Research and Forecasting mesoscale model coupled with chemistry (WRF-Chem).
NASA Astrophysics Data System (ADS)
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.
Lorenz, Romy; Monti, Ricardo Pio; Violante, Inês R.; Anagnostopoulos, Christoforos; Faisal, Aldo A.; Montana, Giovanni; Leech, Robert
2016-01-01
Functional neuroimaging typically explores how a particular task activates a set of brain regions. Importantly though, the same neural system can be activated by inherently different tasks. To date, there is no approach available that systematically explores whether and how distinct tasks probe the same neural system. Here, we propose and validate an alternative framework, the Automatic Neuroscientist, which turns the standard fMRI approach on its head. We use real-time fMRI in combination with modern machine-learning techniques to automatically design the optimal experiment to evoke a desired target brain state. In this work, we present two proof-of-principle studies involving perceptual stimuli. In both studies optimization algorithms of varying complexity were employed; the first involved a stochastic approximation method while the second incorporated a more sophisticated Bayesian optimization technique. In the first study, we achieved convergence for the hypothesized optimum in 11 out of 14 runs in less than 10 min. Results of the second study showed how our closed-loop framework accurately and with high efficiency estimated the underlying relationship between stimuli and neural responses for each subject in one to two runs: with each run lasting 6.3 min. Moreover, we demonstrate that using only the first run produced a reliable solution at a group-level. Supporting simulation analyses provided evidence on the robustness of the Bayesian optimization approach for scenarios with low contrast-to-noise ratio. This framework is generalizable to numerous applications, ranging from optimizing stimuli in neuroimaging pilot studies to tailoring clinical rehabilitation therapy to patients and can be used with multiple imaging modalities in humans and animals. PMID:26804778
Lorenz, Romy; Monti, Ricardo Pio; Violante, Inês R; Anagnostopoulos, Christoforos; Faisal, Aldo A; Montana, Giovanni; Leech, Robert
2016-04-01
Functional neuroimaging typically explores how a particular task activates a set of brain regions. Importantly though, the same neural system can be activated by inherently different tasks. To date, there is no approach available that systematically explores whether and how distinct tasks probe the same neural system. Here, we propose and validate an alternative framework, the Automatic Neuroscientist, which turns the standard fMRI approach on its head. We use real-time fMRI in combination with modern machine-learning techniques to automatically design the optimal experiment to evoke a desired target brain state. In this work, we present two proof-of-principle studies involving perceptual stimuli. In both studies optimization algorithms of varying complexity were employed; the first involved a stochastic approximation method while the second incorporated a more sophisticated Bayesian optimization technique. In the first study, we achieved convergence for the hypothesized optimum in 11 out of 14 runs in less than 10 min. Results of the second study showed how our closed-loop framework accurately and with high efficiency estimated the underlying relationship between stimuli and neural responses for each subject in one to two runs: with each run lasting 6.3 min. Moreover, we demonstrate that using only the first run produced a reliable solution at a group-level. Supporting simulation analyses provided evidence on the robustness of the Bayesian optimization approach for scenarios with low contrast-to-noise ratio. This framework is generalizable to numerous applications, ranging from optimizing stimuli in neuroimaging pilot studies to tailoring clinical rehabilitation therapy to patients and can be used with multiple imaging modalities in humans and animals. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
a Bayesian Synthesis of Predictions from Different Models for Setting Water Quality Criteria
NASA Astrophysics Data System (ADS)
Arhonditsis, G. B.; Ecological Modelling Laboratory
2011-12-01
Skeptical views of the scientific value of modelling argue that there is no true model of an ecological system, but rather several adequate descriptions of different conceptual basis and structure. In this regard, rather than picking the single "best-fit" model to predict future system responses, we can use Bayesian model averaging to synthesize the forecasts from different models. Hence, by acknowledging that models from different areas of the complexity spectrum have different strengths and weaknesses, the Bayesian model averaging is an appealing approach to improve the predictive capacity and to overcome the ambiguity surrounding the model selection or the risk of basing ecological forecasts on a single model. Our study addresses this question using a complex ecological model, developed by Ramin et al. (2011; Environ Modell Softw 26, 337-353) to guide the water quality criteria setting process in the Hamilton Harbour (Ontario, Canada), along with a simpler plankton model that considers the interplay among phosphate, detritus, and generic phytoplankton and zooplankton state variables. This simple approach is more easily subjected to detailed sensitivity analysis and also has the advantage of fewer unconstrained parameters. Using Markov Chain Monte Carlo simulations, we calculate the relative mean standard error to assess the posterior support of the two models from the existing data. Predictions from the two models are then combined using the respective standard error estimates as weights in a weighted model average. The model averaging approach is used to examine the robustness of predictive statements made from our earlier work regarding the response of Hamilton Harbour to the different nutrient loading reduction strategies. The two eutrophication models are then used in conjunction with the SPAtially Referenced Regressions On Watershed attributes (SPARROW) watershed model. The Bayesian nature of our work is used: (i) to alleviate problems of spatiotemporal resolution mismatch between watershed and receiving waterbody models; and (ii) to overcome the conceptual or scale misalignment between processes of interest and supporting information. The proposed Bayesian approach provides an effective means of empirically estimating the relation between in-stream measurements of nutrient fluxes and the sources/sinks of nutrients within the watershed, while explicitly accounting for the uncertainty associated with the existing knowledge from the system along with the different types of spatial correlation typically underlying the parameter estimation of watershed models. Our modelling exercise offers the first estimates of the export coefficients and the delivery rates from the different subcatchments and thus generates testable hypotheses regarding the nutrient export "hot spots" in the studied watershed. Finally, we conduct modeling experiments that evaluate the potential improvement of the model parameter estimates and the decrease of the predictive uncertainty, if the uncertainty associated with the contemporary nutrient loading estimates is reduced. The lessons learned from this study will contribute towards the development of integrated modelling frameworks.
NASA Astrophysics Data System (ADS)
Farrell, Kathryn; Oden, J. Tinsley; Faghihi, Danial
2015-08-01
A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.
Model-based Bayesian filtering of cardiac contaminants from biomedical recordings.
Sameni, R; Shamsollahi, M B; Jutten, C
2008-05-01
Electrocardiogram (ECG) and magnetocardiogram (MCG) signals are among the most considerable sources of noise for other biomedical signals. In some recent works, a Bayesian filtering framework has been proposed for denoising the ECG signals. In this paper, it is shown that this framework may be effectively used for removing cardiac contaminants such as the ECG, MCG and ballistocardiographic artifacts from different biomedical recordings such as the electroencephalogram, electromyogram and also for canceling maternal cardiac signals from fetal ECG/MCG. The proposed method is evaluated on simulated and real signals.
Wang, Hongrui; Wang, Cheng; Wang, Ying; ...
2017-04-05
This paper presents a Bayesian approach using Metropolis-Hastings Markov Chain Monte Carlo algorithm and applies this method for daily river flow rate forecast and uncertainty quantification for Zhujiachuan River using data collected from Qiaotoubao Gage Station and other 13 gage stations in Zhujiachuan watershed in China. The proposed method is also compared with the conventional maximum likelihood estimation (MLE) for parameter estimation and quantification of associated uncertainties. While the Bayesian method performs similarly in estimating the mean value of daily flow rate, it performs over the conventional MLE method on uncertainty quantification, providing relatively narrower reliable interval than the MLEmore » confidence interval and thus more precise estimation by using the related information from regional gage stations. As a result, the Bayesian MCMC method might be more favorable in the uncertainty analysis and risk management.« less
Survival Bayesian Estimation of Exponential-Gamma Under Linex Loss Function
NASA Astrophysics Data System (ADS)
Rizki, S. W.; Mara, M. N.; Sulistianingsih, E.
2017-06-01
This paper elaborates a research of the cancer patients after receiving a treatment in cencored data using Bayesian estimation under Linex Loss function for Survival Model which is assumed as an exponential distribution. By giving Gamma distribution as prior and likelihood function produces a gamma distribution as posterior distribution. The posterior distribution is used to find estimatior {\\hat{λ }}BL by using Linex approximation. After getting {\\hat{λ }}BL, the estimators of hazard function {\\hat{h}}BL and survival function {\\hat{S}}BL can be found. Finally, we compare the result of Maximum Likelihood Estimation (MLE) and Linex approximation to find the best method for this observation by finding smaller MSE. The result shows that MSE of hazard and survival under MLE are 2.91728E-07 and 0.000309004 and by using Bayesian Linex worths 2.8727E-07 and 0.000304131, respectively. It concludes that the Bayesian Linex is better than MLE.
A Bayesian approach to tracking patients having changing pharmacokinetic parameters
NASA Technical Reports Server (NTRS)
Bayard, David S.; Jelliffe, Roger W.
2004-01-01
This paper considers the updating of Bayesian posterior densities for pharmacokinetic models associated with patients having changing parameter values. For estimation purposes it is proposed to use the Interacting Multiple Model (IMM) estimation algorithm, which is currently a popular algorithm in the aerospace community for tracking maneuvering targets. The IMM algorithm is described, and compared to the multiple model (MM) and Maximum A-Posteriori (MAP) Bayesian estimation methods, which are presently used for posterior updating when pharmacokinetic parameters do not change. Both the MM and MAP Bayesian estimation methods are used in their sequential forms, to facilitate tracking of changing parameters. Results indicate that the IMM algorithm is well suited for tracking time-varying pharmacokinetic parameters in acutely ill and unstable patients, incurring only about half of the integrated error compared to the sequential MM and MAP methods on the same example.
A Bayesian Approach to More Stable Estimates of Group-Level Effects in Contextual Studies.
Zitzmann, Steffen; Lüdtke, Oliver; Robitzsch, Alexander
2015-01-01
Multilevel analyses are often used to estimate the effects of group-level constructs. However, when using aggregated individual data (e.g., student ratings) to assess a group-level construct (e.g., classroom climate), the observed group mean might not provide a reliable measure of the unobserved latent group mean. In the present article, we propose a Bayesian approach that can be used to estimate a multilevel latent covariate model, which corrects for the unreliable assessment of the latent group mean when estimating the group-level effect. A simulation study was conducted to evaluate the choice of different priors for the group-level variance of the predictor variable and to compare the Bayesian approach with the maximum likelihood approach implemented in the software Mplus. Results showed that, under problematic conditions (i.e., small number of groups, predictor variable with a small ICC), the Bayesian approach produced more accurate estimates of the group-level effect than the maximum likelihood approach did.
Bayesian Analysis of Biogeography when the Number of Areas is Large
Landis, Michael J.; Matzke, Nicholas J.; Moore, Brian R.; Huelsenbeck, John P.
2013-01-01
Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a “data-augmentation” approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program, BayArea. [ancestral area analysis; Bayesian biogeographic inference; data augmentation; historical biogeography; Markov chain Monte Carlo.] PMID:23736102
An efficient Bayesian meta-analysis approach for studying cross-phenotype genetic associations
Majumdar, Arunabha; Haldar, Tanushree; Bhattacharya, Sourabh; Witte, John S.
2018-01-01
Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package ‘CPBayes’ implementing the proposed method. PMID:29432419
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Dan; Ricciuto, Daniel; Walker, Anthony
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this study, a Differential Evolution Adaptive Metropolis (DREAM) algorithm was used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The DREAM is a multi-chainmore » method and uses differential evolution technique for chain movement, allowing it to be efficiently applied to high-dimensional problems, and can reliably estimate heavy-tailed and multimodal distributions that are difficult for single-chain schemes using a Gaussian proposal distribution. The results were evaluated against the popular Adaptive Metropolis (AM) scheme. DREAM indicated that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identified one mode. The calibration of DREAM resulted in a better model fit and predictive performance compared to the AM. DREAM provides means for a good exploration of the posterior distributions of model parameters. Lastly, it reduces the risk of false convergence to a local optimum and potentially improves the predictive performance of the calibrated model.« less
Lu, Dan; Ricciuto, Daniel; Walker, Anthony; ...
2017-02-22
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this study, a Differential Evolution Adaptive Metropolis (DREAM) algorithm was used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The DREAM is a multi-chainmore » method and uses differential evolution technique for chain movement, allowing it to be efficiently applied to high-dimensional problems, and can reliably estimate heavy-tailed and multimodal distributions that are difficult for single-chain schemes using a Gaussian proposal distribution. The results were evaluated against the popular Adaptive Metropolis (AM) scheme. DREAM indicated that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identified one mode. The calibration of DREAM resulted in a better model fit and predictive performance compared to the AM. DREAM provides means for a good exploration of the posterior distributions of model parameters. Lastly, it reduces the risk of false convergence to a local optimum and potentially improves the predictive performance of the calibrated model.« less
The worth of data in predicting aquitard continuity in hydrogeological design
NASA Astrophysics Data System (ADS)
James, Bruce R.; Freeze, R. Allan
1993-07-01
A Bayesian decision framework is developed for addressing questions of hydrogeological data worth associated with engineering design at sites in heterogeneous geological environments. The specific case investigated is one of remedial contaminant containment in an aquifer underlain by an aquitard of uncertain continuity. The framework is used to evaluate the worth of hard and soft data in investigating the aquitard's continuity. The analysis consists of four modules: (1) an aquitard realization generator based on indicator kriging, (2) a procedure for the Bayesian updating of the uncertainty with respect to aquitard windows, (3) a Monte Carlo simulation model for advective contaminant transport, and (4) an economic decision model. A sensitivity analysis for a generic design example involving a design decision between a no-action alternative and a containment alternative indicates that the data worth of a single borehole providing a hard point datum was more sensitive to economic parameters than to hydrogeological or geostatistical parameters. For this case, data worth is very sensitive to the projected cost of containment, the discount rate, and the estimated cost of failure. When it comes to hydrogeological parameters, such as the representative hydraulic conductivity of the aquitard or underlying aquifer, the sensitivity analysis indicates that it is more important to know whether the field value is above or below some threshold value than it is to know its actual numerical value. A good conceptual understanding of the site geology is important in estimating prior uncertainties. The framework was applied in a retrospective fashion to the design of a remediation program for soil contaminated by radioactive waste disposal at the Savannah River site in South Carolina. The cost-effectiveness of different patterns of boreholes was studied. A contour map is presented for the net expected value of sample information (EVSI) for a single borehole. The net EVSI of patterns of precise point measurements is also compared to that of an imprecise seismic survey.
NASA Astrophysics Data System (ADS)
Smith, D. E.; Felizardo, C.; Minson, S. E.; Boese, M.; Langbein, J. O.; Murray, J. R.
2016-12-01
Finite-fault source algorithms can greatly benefit earthquake early warning (EEW) systems. Estimates of finite-fault parameters provide spatial information, which can significantly improve real-time shaking calculations and help with disaster response. In this project, we have focused on integrating a finite-fault seismic-geodetic algorithm into the West Coast ShakeAlert framework. The seismic part is FinDer 2, a C++ version of the algorithm developed by Böse et al. (2012). It interpolates peak ground accelerations and calculates the best fault length and strike from template matching. The geodetic part is a C++ version of BEFORES, the algorithm developed by Minson et al. (2014) that uses a Bayesian methodology to search for the most probable slip distribution on a fault of unknown orientation. Ultimately, these two will be used together where FinDer generates a Bayesian prior for BEFORES via the methodology of Minson et al. (2015), and the joint solution will generate estimates of finite-fault extent, strike, dip, best slip distribution, and magnitude. We have created C++ versions of both FinDer and BEFORES using open source libraries and have developed a C++ Application Protocol Interface (API) for them both. Their APIs allow FinDer and BEFORES to contribute to the ShakeAlert system via an open source messaging system, ActiveMQ. FinDer has been receiving real-time data, detecting earthquakes, and reporting messages on the development system for several months. We are also testing FinDer extensively with Earthworm tankplayer files. BEFORES has been tested with ActiveMQ messaging in the ShakeAlert framework, and works off a FinDer trigger. We are finishing the FinDer-BEFORES connections in this framework, and testing this system via seismic-geodetic tankplayer files. This will include actual and simulated data.
Word Learning as Bayesian Inference
ERIC Educational Resources Information Center
Xu, Fei; Tenenbaum, Joshua B.
2007-01-01
The authors present a Bayesian framework for understanding how adults and children learn the meanings of words. The theory explains how learners can generalize meaningfully from just one or a few positive examples of a novel word's referents, by making rational inductive inferences that integrate prior knowledge about plausible word meanings with…
A Bayesian Missing Data Framework for Generalized Multiple Outcome Mixed Treatment Comparisons
ERIC Educational Resources Information Center
Hong, Hwanhee; Chu, Haitao; Zhang, Jing; Carlin, Bradley P.
2016-01-01
Bayesian statistical approaches to mixed treatment comparisons (MTCs) are becoming more popular because of their flexibility and interpretability. Many randomized clinical trials report multiple outcomes with possible inherent correlations. Moreover, MTC data are typically sparse (although richer than standard meta-analysis, comparing only two…
ERIC Educational Resources Information Center
Kessler, Lawrence M.
2013-01-01
In this paper I propose Bayesian estimation of a nonlinear panel data model with a fractional dependent variable (bounded between 0 and 1). Specifically, I estimate a panel data fractional probit model which takes into account the bounded nature of the fractional response variable. I outline estimation under the assumption of strict exogeneity as…
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies conservation planning. Journal compilation © 2010 Society for Conservation Biology. No claim to original US government works.
Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty
NASA Astrophysics Data System (ADS)
Xu, Shengli; Jiang, Xiaomo; Huang, Jinzhi; Yang, Shuhua; Wang, Xiaofang
2016-12-01
Centrifugal compressor often suffers various defects such as impeller cracking, resulting in forced outage of the total plant. Damage diagnostics and condition monitoring of such a turbomachinery system has become an increasingly important and powerful tool to prevent potential failure in components and reduce unplanned forced outage and further maintenance costs, while improving reliability, availability and maintainability of a turbomachinery system. This paper presents a probabilistic signal processing methodology for damage diagnostics using multiple time history data collected from different locations of a turbomachine, considering data uncertainty and multivariate correlation. The proposed methodology is based on the integration of three advanced state-of-the-art data mining techniques: discrete wavelet packet transform, Bayesian hypothesis testing, and probabilistic principal component analysis. The multiresolution wavelet analysis approach is employed to decompose a time series signal into different levels of wavelet coefficients. These coefficients represent multiple time-frequency resolutions of a signal. Bayesian hypothesis testing is then applied to each level of wavelet coefficient to remove possible imperfections. The ratio of posterior odds Bayesian approach provides a direct means to assess whether there is imperfection in the decomposed coefficients, thus avoiding over-denoising. Power spectral density estimated by the Welch method is utilized to evaluate the effectiveness of Bayesian wavelet cleansing method. Furthermore, the probabilistic principal component analysis approach is developed to reduce dimensionality of multiple time series and to address multivariate correlation and data uncertainty for damage diagnostics. The proposed methodology and generalized framework is demonstrated with a set of sensor data collected from a real-world centrifugal compressor with impeller cracks, through both time series and contour analyses of vibration signal and principal components.
Zhang, Wangshu; Coba, Marcelo P; Sun, Fengzhu
2016-01-11
Protein domains can be viewed as portable units of biological function that defines the functional properties of proteins. Therefore, if a protein is associated with a disease, protein domains might also be associated and define disease endophenotypes. However, knowledge about such domain-disease relationships is rarely available. Thus, identification of domains associated with human diseases would greatly improve our understanding of the mechanism of human complex diseases and further improve the prevention, diagnosis and treatment of these diseases. Based on phenotypic similarities among diseases, we first group diseases into overlapping modules. We then develop a framework to infer associations between domains and diseases through known relationships between diseases and modules, domains and proteins, as well as proteins and disease modules. Different methods including Association, Maximum likelihood estimation (MLE), Domain-disease pair exclusion analysis (DPEA), Bayesian, and Parsimonious explanation (PE) approaches are developed to predict domain-disease associations. We demonstrate the effectiveness of all the five approaches via a series of validation experiments, and show the robustness of the MLE, Bayesian and PE approaches to the involved parameters. We also study the effects of disease modularization in inferring novel domain-disease associations. Through validation, the AUC (Area Under the operating characteristic Curve) scores for Bayesian, MLE, DPEA, PE, and Association approaches are 0.86, 0.84, 0.83, 0.83 and 0.79, respectively, indicating the usefulness of these approaches for predicting domain-disease relationships. Finally, we choose the Bayesian approach to infer domains associated with two common diseases, Crohn's disease and type 2 diabetes. The Bayesian approach has the best performance for the inference of domain-disease relationships. The predicted landscape between domains and diseases provides a more detailed view about the disease mechanisms.
NASA Astrophysics Data System (ADS)
Croce, Pierpaolo; Zappasodi, Filippo; Merla, Arcangelo; Chiarelli, Antonio Maria
2017-08-01
Objective. Electrical and hemodynamic brain activity are linked through the neurovascular coupling process and they can be simultaneously measured through integration of electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS). Thanks to the lack of electro-optical interference, the two procedures can be easily combined and, whereas EEG provides electrophysiological information, fNIRS can provide measurements of two hemodynamic variables, such as oxygenated and deoxygenated hemoglobin. A Bayesian sequential Monte Carlo approach (particle filter, PF) was applied to simulated recordings of electrical and neurovascular mediated hemodynamic activity, and the advantages of a unified framework were shown. Approach. Multiple neural activities and hemodynamic responses were simulated in the primary motor cortex of a subject brain. EEG and fNIRS recordings were obtained by means of forward models of volume conduction and light propagation through the head. A state space model of combined EEG and fNIRS data was built and its dynamic evolution was estimated through a Bayesian sequential Monte Carlo approach (PF). Main results. We showed the feasibility of the procedure and the improvements in both electrical and hemodynamic brain activity reconstruction when using the PF on combined EEG and fNIRS measurements. Significance. The investigated procedure allows one to combine the information provided by the two methodologies, and, by taking advantage of a physical model of the coupling between electrical and hemodynamic response, to obtain a better estimate of brain activity evolution. Despite the high computational demand, application of such an approach to in vivo recordings could fully exploit the advantages of this combined brain imaging technology.
Bayesian model selection applied to artificial neural networks used for water resources modeling
NASA Astrophysics Data System (ADS)
Kingston, Greer B.; Maier, Holger R.; Lambert, Martin F.
2008-04-01
Artificial neural networks (ANNs) have proven to be extremely valuable tools in the field of water resources engineering. However, one of the most difficult tasks in developing an ANN is determining the optimum level of complexity required to model a given problem, as there is no formal systematic model selection method. This paper presents a Bayesian model selection (BMS) method for ANNs that provides an objective approach for comparing models of varying complexity in order to select the most appropriate ANN structure. The approach uses Markov Chain Monte Carlo posterior simulations to estimate the evidence in favor of competing models and, in this study, three known methods for doing this are compared in terms of their suitability for being incorporated into the proposed BMS framework for ANNs. However, it is acknowledged that it can be particularly difficult to accurately estimate the evidence of ANN models. Therefore, the proposed BMS approach for ANNs incorporates a further check of the evidence results by inspecting the marginal posterior distributions of the hidden-to-output layer weights, which unambiguously indicate any redundancies in the hidden layer nodes. The fact that this check is available is one of the greatest advantages of the proposed approach over conventional model selection methods, which do not provide such a test and instead rely on the modeler's subjective choice of selection criterion. The advantages of a total Bayesian approach to ANN development, including training and model selection, are demonstrated on two synthetic and one real world water resources case study.
Hsieh, Chia-Hung; Ko, Chiun-Cheng; Chung, Cheng-Han; Wang, Hurng-Yi
2014-07-01
The sweet potato whitefly, Bemisia tabaci, is a highly differentiated species complex. Despite consisting of several morphologically indistinguishable entities and frequent invasions on all continents with important associated economic losses, the phylogenetic relationships, species status, and evolutionary history of this species complex is still debated. We sequenced and analyzed one mitochondrial and three single-copy nuclear genes from 9 of the 12 genetic groups of B. tabaci and 5 closely related species. Bayesian species delimitation was applied to investigate the speciation events of B. tabaci. The species statuses of the different genetic groups were strongly supported under different prior settings and phylogenetic scenarios. Divergence histories were estimated by a multispecies coalescence approach implemented in (*)BEAST. Based on mitochondrial locus, B. tabaci was originated 6.47 million years ago (MYA). Nevertheless, the time was 1.25MYA based on nuclear loci. According to the method of approximate Bayesian computation, this difference is probably due to different degrees of migration among loci; i.e., although the mitochondrial locus had differentiated, gene flow at nuclear loci was still possible, a scenario similar to parapatric mode of speciation. This is the first study in whiteflies using multilocus data and incorporating Bayesian coalescence approaches, both of which provide a more biologically realistic framework for delimiting species status and delineating the divergence history of B. tabaci. Our study illustrates that gene flow during species divergence should not be overlooked and has a great impact on divergence time estimation. Copyright © 2014 Elsevier Inc. All rights reserved.
Functional Multi-Locus QTL Mapping of Temporal Trends in Scots Pine Wood Traits
Li, Zitong; Hallingbäck, Henrik R.; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J.; García-Gil, M. Rosario
2014-01-01
Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. PMID:25305041
Functional multi-locus QTL mapping of temporal trends in Scots pine wood traits.
Li, Zitong; Hallingbäck, Henrik R; Abrahamsson, Sara; Fries, Anders; Gull, Bengt Andersson; Sillanpää, Mikko J; García-Gil, M Rosario
2014-10-09
Quantitative trait loci (QTL) mapping of wood properties in conifer species has focused on single time point measurements or on trait means based on heterogeneous wood samples (e.g., increment cores), thus ignoring systematic within-tree trends. In this study, functional QTL mapping was performed for a set of important wood properties in increment cores from a 17-yr-old Scots pine (Pinus sylvestris L.) full-sib family with the aim of detecting wood trait QTL for general intercepts (means) and for linear slopes by increasing cambial age. Two multi-locus functional QTL analysis approaches were proposed and their performances were compared on trait datasets comprising 2 to 9 time points, 91 to 455 individual tree measurements and genotype datasets of amplified length polymorphisms (AFLP), and single nucleotide polymorphism (SNP) markers. The first method was a multilevel LASSO analysis whereby trend parameter estimation and QTL mapping were conducted consecutively; the second method was our Bayesian linear mixed model whereby trends and underlying genetic effects were estimated simultaneously. We also compared several different hypothesis testing methods under either the LASSO or the Bayesian framework to perform QTL inference. In total, five and four significant QTL were observed for the intercepts and slopes, respectively, across wood traits such as earlywood percentage, wood density, radial fiberwidth, and spiral grain angle. Four of these QTL were represented by candidate gene SNPs, thus providing promising targets for future research in QTL mapping and molecular function. Bayesian and LASSO methods both detected similar sets of QTL given datasets that comprised large numbers of individuals. Copyright © 2014 Li et al.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ˜104 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.
Sources of interference in item and associative recognition memory.
Osth, Adam F; Dennis, Simon
2015-04-01
A powerful theoretical framework for exploring recognition memory is the global matching framework, in which a cue's memory strength reflects the similarity of the retrieval cues being matched against the contents of memory simultaneously. Contributions at retrieval can be categorized as matches and mismatches to the item and context cues, including the self match (match on item and context), item noise (match on context, mismatch on item), context noise (match on item, mismatch on context), and background noise (mismatch on item and context). We present a model that directly parameterizes the matches and mismatches to the item and context cues, which enables estimation of the magnitude of each interference contribution (item noise, context noise, and background noise). The model was fit within a hierarchical Bayesian framework to 10 recognition memory datasets that use manipulations of strength, list length, list strength, word frequency, study-test delay, and stimulus class in item and associative recognition. Estimates of the model parameters revealed at most a small contribution of item noise that varies by stimulus class, with virtually no item noise for single words and scenes. Despite the unpopularity of background noise in recognition memory models, background noise estimates dominated at retrieval across nearly all stimulus classes with the exception of high frequency words, which exhibited equivalent levels of context noise and background noise. These parameter estimates suggest that the majority of interference in recognition memory stems from experiences acquired before the learning episode. (c) 2015 APA, all rights reserved).
Skew-t partially linear mixed-effects models for AIDS clinical studies.
Lu, Tao
2016-01-01
We propose partially linear mixed-effects models with asymmetry and missingness to investigate the relationship between two biomarkers in clinical studies. The proposed models take into account irregular time effects commonly observed in clinical studies under a semiparametric model framework. In addition, commonly assumed symmetric distributions for model errors are substituted by asymmetric distribution to account for skewness. Further, informative missing data mechanism is accounted for. A Bayesian approach is developed to perform parameter estimation simultaneously. The proposed model and method are applied to an AIDS dataset and comparisons with alternative models are performed.
Tools to estimate PM2.5 mass have expanded in recent years, and now include: 1) stationary monitor readings, 2) Community Multi-Scale Air Quality (CMAQ) model estimates, 3) Hierarchical Bayesian (HB) estimates from combined stationary monitor readings and CMAQ model output; and, ...
Unification of field theory and maximum entropy methods for learning probability densities
NASA Astrophysics Data System (ADS)
Kinney, Justin B.
2015-09-01
The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.
Unification of field theory and maximum entropy methods for learning probability densities.
Kinney, Justin B
2015-09-01
The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory thus provides a natural test of the maximum entropy null hypothesis and, furthermore, returns an alternative (lower entropy) density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided.